From 9867282c0e5b1b78e8a5aced05b171b566df48ed Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Thu, 21 May 2026 11:05:52 -0700 Subject: [PATCH 01/13] Replace developer-framing Key concepts across 31 notebooks Replay of ef4dba70 + aa4ab2e8 + 055042e8 on fresh main, accounting for PR #509 file renames. Updates the developer-framing Key concepts block to the new record/model/document terminology across 30 Jupyter notebooks and 1 R notebook. Includes the cross-framing swap for qualitative_text_generation.ipynb and the structural restructure (with preserved architecture images) for use_dataset_model_objects.ipynb. TOC anchors preserved where present. Co-authored-by: Cursor --- notebooks/code_sharing/r/r_custom_tests.Rmd | 32 +- .../configure_dataset_features.ipynb | 956 +- .../load_datasets_predictions.ipynb | 2136 ++-- .../use_dataset_model_objects.ipynb | 1992 ++-- .../metrics/log_metrics_over_time.ipynb | 1940 ++-- .../qualitative_text_generation.ipynb | 1928 ++-- .../custom_tests/implement_custom_tests.ipynb | 2216 +++-- .../explore_tests/explore_test_suites.ipynb | 1839 ++-- .../tests/explore_tests/explore_tests.ipynb | 8860 ++++++++--------- .../run_tests/1-run_dataset-based_tests.ipynb | 1564 +-- .../run_tests/2-run_comparison_tests.ipynb | 2228 ++--- .../enable_pii_detection.ipynb | 1332 +-- ...tests_that_require_multiple_datasets.ipynb | 1166 +-- ...t_multiple_results_for_the_same_test.ipynb | 1270 +-- .../run_documentation_sections.ipynb | 1204 +-- .../run_documentation_tests_with_config.ipynb | 1470 +-- .../quickstart/quickstart_documentation.ipynb | 1852 ++-- .../_about-validmind-developers.ipynb | 162 +- .../development/1-set_up_validmind.ipynb | 952 +- .../agents/document_agentic_ai.ipynb | 4380 ++++---- .../quickstart_option_pricing_models.ipynb | 4220 ++++---- ...start_option_pricing_models_quantlib.ipynb | 2710 ++--- .../quickstart_code_explainer_demo.ipynb | 1766 ++-- .../application_scorecard_executive.ipynb | 
782 +- .../application_scorecard_full_suite.ipynb | 1834 ++-- .../application_scorecard_with_bias.ipynb | 3124 +++--- .../application_scorecard_with_ml.ipynb | 4024 ++++---- ...document_excel_application_scorecard.ipynb | 2034 ++-- .../nlp_and_llm/prompt_validation_demo.ipynb | 1122 +-- .../quickstart_time_series_full_suite.ipynb | 1522 +-- .../quickstart_time_series_high_code.ipynb | 2038 ++-- 31 files changed, 32396 insertions(+), 32259 deletions(-) diff --git a/notebooks/code_sharing/r/r_custom_tests.Rmd b/notebooks/code_sharing/r/r_custom_tests.Rmd index 4426e1ac9..cb09c28f9 100644 --- a/notebooks/code_sharing/r/r_custom_tests.Rmd +++ b/notebooks/code_sharing/r/r_custom_tests.Rmd @@ -41,28 +41,34 @@ Signing up is FREE — diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb index 7abcc885d..ede3bdfb7 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb @@ -1,478 +1,484 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Configure dataset features\n", - "\n", - "When initializing a ValidMind dataset object, you can pass in a list of features to use instead of utilizing all dataset columns when running tests.\n", - "\n", - "This notebook shows how to use custom feature columns with `init_dataset`. The default behavior of `init_dataset` is to utilize all dataset columns when running tests. It is also possible to pass in a list of features to use and thus restrict computations to only those features." 
- ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Configure dataset features\n", + "\n", + "When initializing a ValidMind dataset object, you can pass in a list of features to use instead of utilizing all dataset columns when running tests.\n", + "\n", + "This notebook shows how to use custom feature columns with `init_dataset`. The default behavior of `init_dataset` is to utilize all dataset columns when running tests. It is also possible to pass in a list of features to use and thus restrict computations to only those features." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + "- [Load the sample dataset](#toc3__) \n", + " - [Initialize the training and test datasets](#toc3_1__) \n", + " - [Defining custom features](#toc3_2__) \n", + "- [Next steps](#toc4__) \n", + " - [Work with your model documentation](#toc4_1__) \n", + " - [Discover more learning resources](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "\n", + ":::\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
\n", + "\n", + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Load the sample dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%matplotlib inline\n", + "\n", + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "# You can also try a different dataset with:\n", + "# from validmind.datasets.classification import taiwan_credit as demo_dataset\n", + "\n", + "df = demo_dataset.load_data()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the training and test datasets\n", + "\n", + "Before you can run a test suite, which is just a collection of tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` - the raw dataset that you want to analyze\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` - the name of the 
target column in the dataset\n", + "- `feature_columns` - the names of the feature columns in the dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "feature_columns = [\n", + " \"CreditScore\",\n", + " \"Age\",\n", + " \"Tenure\",\n", + " \"Balance\",\n", + " \"NumOfProducts\",\n", + " \"HasCrCard\",\n", + " \"IsActiveMember\",\n", + " \"EstimatedSalary\",\n", + "]\n", + "\n", + "vm_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Defining custom features\n", + "\n", + "This section shows how we can define a subset of features to use when running dataset tests. Any feature that is not included in the `feature_columns` argument is omitted from the computation of the `DescriptiveStatistics` test in the examples below." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the following example we use the `DescriptiveStatistics` test to show how the output changes when customizing features." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "1. Running a test with all the features." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset_all_features\",\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "test = vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.DescriptiveStatistics\",\n", + " inputs={\"dataset\": vm_dataset},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Running a test with a subset of features." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset_subset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=[\"CreditScore\", \"Age\", \"Balance\", \"Geography\"],\n", + ")\n", + "\n", + "test = vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.DescriptiveStatistics\",\n", + " inputs={\"dataset\": vm_dataset},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ], + "id": "copyright-32870f8bce7f4ed0903136a69d02b421" + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - "- [Load the sample dataset](#toc3__) \n", - " - [Initialize the training and test datasets](#toc3_1__) \n", - " - [Defining custom features](#toc3_2__) \n", - "- [Next steps](#toc4__) \n", - " - [Work with your model documentation](#toc4_1__) \n", - " - [Discover more learning resources](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "\n", - ":::\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
\n", - "\n", - "\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Load the sample dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "\n", - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "# You can also try a different dataset with:\n", - "# from validmind.datasets.classification import taiwan_credit as demo_dataset\n", - "\n", - "df = demo_dataset.load_data()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the training and test datasets\n", - "\n", - "Before you can run a test suite, which are just a collection of tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to analyze\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — the name of the 
target column in the dataset\n", - "- `feature_columns` - the names of the feature columns in the dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "feature_columns = [\n", - " \"CreditScore\",\n", - " \"Age\",\n", - " \"Tenure\",\n", - " \"Balance\",\n", - " \"NumOfProducts\",\n", - " \"HasCrCard\",\n", - " \"IsActiveMember\",\n", - " \"EstimatedSalary\",\n", - "]\n", - "\n", - "vm_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Defining custom features\n", - "\n", - "This section shows how we can define a subset of features to use when running dataset tests. Any feature that is not included in the `feature_columns` argument is omitted from the computation of the `DescriptiveStatistics` test in the examples below." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the following example we use the `DescriptiveStatistics` test to show how the output changes when customizing features." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "1. Running a test with all the features." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset_all_features\",\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "test = vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.DescriptiveStatistics\",\n", - " inputs={\"dataset\": vm_dataset},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "2. Running a test with a subset of features." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset_subset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=[\"CreditScore\", \"Age\", \"Balance\", \"Geography\"],\n", - ")\n", - "\n", - "test = vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.DescriptiveStatistics\",\n", - " inputs={\"dataset\": vm_dataset},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-32870f8bce7f4ed0903136a69d02b421", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": ".venv", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb index 222c98431..02f339f24 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb @@ -1,1067 +1,1073 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Load dataset predictions\n", - "\n", - "To enable tests to make use of predictions, you can load predictions in ValidMind dataset objects in multiple different ways.\n", - "\n", - "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset and train a model for testing, and initialize ValidMind objects. Additionally, it offers options for loading predictions using the `assign_predictions()` function, such as loading predictions from a file, linking an existing prediction column in the dataset with a model, or allowing the ValidMind Library to run and link predictions to a model." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Load the sample dataset](#toc3__) \n", - "- [Prepocess the raw dataset](#toc4__) \n", - "- [Train models for testing](#toc5__) \n", - "- [Initialize ValidMind objects](#toc6__) \n", - " - [Initialize the ValidMind models](#toc6_1__) \n", - " - [Initialize the ValidMind datasets](#toc6_2__) \n", - "- [Options to load predictions using the ValidMind Library](#toc7__) \n", - " - [Load predictions from a file](#toc7_1__) \n", - " - [Predictions calculated outside of VM](#toc7_2__) \n", - " - [Assign predictions to the training dataset](#toc7_3__) \n", - " - [Run an example test](#toc7_4__) \n", - " - [Link an existing prediction column in the dataset with a model](#toc7_5__) \n", - " - [Link prediction column to a specific model](#toc7_5_1__) \n", - " - [Link an existing prediction column in the dataset with a model](#toc7_6__) \n", - " - [Pass `` in dataset interface](#toc7_6_1__) \n", - " - [Through `assign_predictions` interface](#toc7_6_2__) \n", - " - [Run an example test](#toc7_7__) \n", - " - [Using `predict_fn` to store multiple columns](#toc7_8__) \n", - " - [Create enhanced predict function](#toc7_8_1__) \n", - " - [Initialize model with predict function](#toc7_8_2__) \n", - " - [Assign predictions with multiple columns](#toc7_8_3__) \n", - " - [Verify multiple columns in dataset](#toc7_8_4__) \n", - "- [Next 
steps](#toc8__) \n", - " - [Work with your model documentation](#toc8_1__) \n", - " - [Discover more learning resources](#toc8_2__) \n", - "- [Upgrade ValidMind](#toc9__) \n", - "\n", - ":::\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
\n", - "\n", - "\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. 
To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", - ")\n", - "\n", - "raw_df = demo_dataset.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Prepocess the raw dataset\n", - "\n", - "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", - "\n", - "- Preprocess the data: Splits the DataFrame (`df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess` to simplify preprocessing.\n", - "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)\n", - "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", - "y_val = validation_df[demo_dataset.target_column]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Train models for testing\n", - "\n", - "- Initialize XGBoost and Logistic Regression Classifiers" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.linear_model import LogisticRegression\n", - "import xgboost\n", - "\n", - "%matplotlib inline\n", - "\n", - "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", - "xgb.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "xgb.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")\n", - "\n", - "lr = LogisticRegression(random_state=0)\n", - "lr.fit(\n", - " x_train,\n", - " y_train,\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Initialize ValidMind objects\n", - "\n", - "\n", - "\n", - "### Initialize the ValidMind models" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model_xgb = vm.init_model(\n", - " xgb,\n", - " input_id=\"xgb\",\n", - ")\n", - "vm_model_lr = vm.init_model(\n", - " lr,\n", - " input_id=\"lr\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the 
[`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", - "- `class_labels` — an optional value to map predicted classes to class labels\n", - "\n", - "With all datasets ready, you can now initialize the raw, training and test datasets (`raw_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_ds = vm.init_dataset(\n", - " input_id=\"raw_dataset\",\n", - " dataset=raw_df,\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_dataset\",\n", - " dataset=train_df,\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Options to load predictions using the ValidMind Library\n", - "\n", - "\n", - "\n", - "### Load predictions from a file\n", - "\n", - "This creates a new column called `_prediction` in the dataset and assigns metadata to track that the `_prediction` column is linked to the model ``" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Predictions calculated outside of VM" - ] - }, - { - "cell_type": 
"code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "train_xgb_prediction = pd.DataFrame(xgb.predict(x_train), columns=[\"xgb_prediction\"])\n", - "test__xgb_prediction = pd.DataFrame(xgb.predict(x_val), columns=[\"xgb_prediction\"])\n", - "\n", - "train_lr_prediction = pd.DataFrame(lr.predict(x_train), columns=[\"lr_prediction\"])\n", - "test_lr_prediction = pd.DataFrame(lr.predict(x_val), columns=[\"lr_prediction\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Assign predictions to the training dataset\n", - "\n", - "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model_xgb, prediction_values=train_xgb_prediction.xgb_prediction.values\n", - ")\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_model_lr, prediction_values=train_lr_prediction.lr_prediction.values\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Run an example test\n", - "\n", - "Now, let's run an example test such as `MinimumAccuracy` twice to show how we're able to load the correct model predictions by using the `model` input parameter, even though we're passing the same `train_ds` dataset instance to the test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " 
\"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\n", - " \"dataset\": vm_train_ds,\n", - " \"model\": vm_model_lr,\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Link an existing prediction column in the dataset with a model\n", - "\n", - "This approach allows loading datasets that already have prediction columns in addition to feature and target columns. The ValidMind Library assigns metadata to track the predictions column that are linked to a given `` model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df2 = train_df.copy()\n", - "train_df2[\"xgb_prediction\"] = train_xgb_prediction.xgb_prediction.values\n", - "train_df2[\"lr_prediction\"] = train_lr_prediction.lr_prediction.values\n", - "train_df2.head(5)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "feature_columns = [\n", - " \"CreditScore\",\n", - " \"Gender\",\n", - " \"Age\",\n", - " \"Tenure\",\n", - " \"Balance\",\n", - " \"NumOfProducts\",\n", - " \"HasCrCard\",\n", - " \"IsActiveMember\",\n", - " \"EstimatedSalary\",\n", - " \"Geography_France\",\n", - " \"Geography_Germany\",\n", - " \"Geography_Spain\",\n", - "]\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df2,\n", - " input_id=\"train_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Link prediction column to a specific model\n", - "\n", - "The `prediction_column` parameter informs the `Dataset` object about the model that should be linked to that column." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_xgb, prediction_column=\"xgb_prediction\")\n", - "vm_train_ds.assign_predictions(model=vm_model_lr, prediction_column=\"lr_prediction\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "wE0OckXjSPc7" - }, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_lr},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Link an existing prediction column in the dataset with a model\n", - "\n", - "This lets the ValidMind Library run model predictions, creates a new column called `_prediction`, and assign metadata to track that the `_prediction` column is linked to the `` model.\n", - "\n", - "There are two ways run and assign model predictions with the ValidMind Library:\n", - "\n", - "- When initializing a `Dataset` with `init_dataset()`. This is the most straightforward method to assign predictions for a single model.\n", - "- Using `dataset.assign_predictions()`. This allows assigning predictions to a dataset for one or more models." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Pass `` in dataset interface" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "feature_columns = [\n", - " \"CreditScore\",\n", - " \"Gender\",\n", - " \"Age\",\n", - " \"Tenure\",\n", - " \"Balance\",\n", - " \"NumOfProducts\",\n", - " \"HasCrCard\",\n", - " \"IsActiveMember\",\n", - " \"EstimatedSalary\",\n", - " \"Geography_France\",\n", - " \"Geography_Germany\",\n", - " \"Geography_Spain\",\n", - "]\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " model=vm_model_xgb,\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Through `assign_predictions` interface" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "##### Perform predictions using the same `assign_predictions` interface" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", - "vm_train_ds.assign_predictions(model=vm_model_lr)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Run an example test\n", - "\n", - "Now, let's run an example test such as `MinimumAccuracy` twice to show how we're able to load the correct model predictions by using the `model` input parameter, even though we're passing the same `train_ds` dataset instance to the test:" - ] - 
}, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\n", - " \"dataset\": vm_train_ds,\n", - " \"model\": vm_model_lr,\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Using `predict_fn` to store multiple columns\n", - "\n", - "The `predict_fn` parameter in `vm.init_model()` allows you to create models that return multiple pieces of information when making predictions. This is particularly useful when you want to capture additional metadata, confidence scores, feature importance, or any other model-related information alongside the main prediction.\n", - "\n", - "By returning a dictionary from your predict function, ValidMind automatically creates separate columns for each key when you run `assign_predictions()`." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Create enhanced predict function\n", - "\n", - "Let's create a predict function that wraps our XGBoost model and returns multiple pieces of information:\n", - "- **prediction**: The main class prediction\n", - "- **prediction_proba**: The prediction probabilities for both classes\n", - "- **confidence**: The maximum probability as a confidence score\n", - "- **model_info**: Metadata about the model used" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "\n", - "def enhanced_xgb_predict_fn(input_data):\n", - " \"\"\"\n", - " Enhanced predict function that returns multiple pieces of information.\n", - " \n", - " Args:\n", - " input_data: Input features for prediction (single row as dictionary when called by ValidMind)\n", - " \n", - " Returns:\n", - " dict: Dictionary containing prediction, probabilities, confidence, and model info\n", - " \"\"\"\n", - " # Define the feature columns that the model was trained on\n", - " # These are the same columns from x_train (excluding the target column 'Exited')\n", - " training_features = [\n", - " 'CreditScore', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts',\n", - " 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Geography_France',\n", - " 'Geography_Germany', 'Geography_Spain'\n", - " ]\n", - " \n", - " # Convert dictionary input to DataFrame for model prediction\n", - " # When called by ValidMind, input_data is a single row dictionary\n", - " if isinstance(input_data, dict):\n", - " # Filter to only include training features and convert to DataFrame\n", - " filtered_data = {key: value for key, value in input_data.items() if key in training_features}\n", - " input_df = pd.DataFrame([filtered_data])\n", - " \n", - " # Ensure all training features are present (in case some are missing)\n", - " for feature in 
training_features:\n", - " if feature not in input_df.columns:\n", - " input_df[feature] = 0 # Default value for missing features\n", - " \n", - " # Reorder columns to match training order\n", - " input_df = input_df[training_features]\n", - " else:\n", - " # Handle other input types (DataFrame, array, etc.)\n", - " input_df = pd.DataFrame(input_data) if not isinstance(input_data, pd.DataFrame) else input_data\n", - " # Filter to training features if it's a DataFrame\n", - " if isinstance(input_df, pd.DataFrame):\n", - " input_df = input_df[training_features]\n", - " \n", - " # Make predictions\n", - " prediction = xgb.predict(input_df)\n", - " prediction_proba = xgb.predict_proba(input_df)\n", - " \n", - " # Since we're processing one row at a time, extract the single values\n", - " single_prediction = prediction[0] if len(prediction) > 0 else None\n", - " single_proba = prediction_proba[0] if len(prediction_proba) > 0 else None\n", - " \n", - " # Calculate confidence as the maximum probability for this prediction\n", - " confidence = np.max(single_proba) if single_proba is not None else None\n", - " \n", - " # Create model metadata\n", - " model_info = {\n", - " \"model_type\": \"XGBClassifier\",\n", - " \"n_estimators\": xgb.n_estimators,\n", - " \"max_depth\": xgb.max_depth,\n", - " \"feature_count\": len(training_features),\n", - " \"features_used\": training_features\n", - " }\n", - " \n", - " return {\n", - " \"prediction\": single_prediction,\n", - " \"prediction_proba\": single_proba.tolist() if single_proba is not None else None,\n", - " \"confidence\": confidence,\n", - " \"model_info\": model_info\n", - " }\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Initialize model with predict function\n", - "\n", - "Now we'll create a ValidMind model using the `predict_fn` parameter. 
This tells ValidMind to use our enhanced function instead of the model's default `predict()` method:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize ValidMind model with the enhanced predict function\n", - "vm_model_enhanced_xgb = vm.init_model(\n", - " model=xgb,\n", - " input_id=\"enhanced_xgb\",\n", - " predict_fn=enhanced_xgb_predict_fn \n", - ")\n", - "\n", - "print(f\"Enhanced XGBoost model initialized with input_id: {vm_model_enhanced_xgb.input_id}\")\n", - "print(\"This model now uses the predict function that handles dictionary inputs correctly\")\n", - "print(\"It will return multiple columns when predictions are assigned to datasets\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Assign predictions with multiple columns\n", - "\n", - "When we use `assign_predictions()` with our enhanced model, ValidMind will automatically create separate columns for each key returned by our predict function. Let's assign predictions to our test dataset:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a fresh dataset for this demonstration\n", - "vm_test_ds_enhanced = vm.init_dataset(\n", - " input_id=\"test_dataset_enhanced\",\n", - " dataset=test_df,\n", - " target_column=demo_dataset.target_column\n", - ")\n", - "\n", - "# This will create multiple columns based on the keys returned by our predict function\n", - "vm_test_ds_enhanced.assign_predictions(model=vm_model_enhanced_xgb)\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Verify multiple columns in dataset\n", - "\n", - "Let's examine the dataset to see all the columns that were created by our enhanced predict function. 
Each key from the returned dictionary becomes a separate column with the model's `input_id` as a prefix:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_ds_enhanced._df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-76fcd2c215674068b812492b7c639056", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "gpuClass": "standard", - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.9" - } - }, - "nbformat": 4, - "nbformat_minor": 0 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Load dataset predictions\n", + "\n", + "To enable tests to make use of predictions, you can load predictions in ValidMind dataset objects in multiple different ways.\n", + "\n", + "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset and train a model for testing, and initialize ValidMind objects. Additionally, it offers options for loading predictions using the `assign_predictions()` function, such as loading predictions from a file, linking an existing prediction column in the dataset with a model, or allowing the ValidMind Library to run and link predictions to a model." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Load the sample dataset](#toc3__) \n", + "- [Preprocess the raw dataset](#toc4__) \n", + "- [Train models for testing](#toc5__) \n", + "- [Initialize ValidMind objects](#toc6__) \n", + " - [Initialize the ValidMind models](#toc6_1__) \n", + " - [Initialize the ValidMind datasets](#toc6_2__) \n", + "- [Options to load predictions using the ValidMind Library](#toc7__) \n", + " - [Load predictions from a file](#toc7_1__) \n", + " - [Predictions calculated outside of VM](#toc7_2__) \n", + " - [Assign predictions to the training dataset](#toc7_3__) \n", + " - [Run an example test](#toc7_4__) \n", + " - [Link an existing prediction column in the dataset with a model](#toc7_5__) \n", + " - [Link prediction column to a specific model](#toc7_5_1__) \n", + " - [Link an existing prediction column in the dataset with a model](#toc7_6__) \n", + " - [Pass `` in dataset interface](#toc7_6_1__) \n", + " - [Through `assign_predictions` interface](#toc7_6_2__) \n", + " - [Run an example test](#toc7_7__) \n", + " - [Using `predict_fn` to store multiple columns](#toc7_8__) \n", + " - [Create enhanced predict function](#toc7_8_1__) \n", + " - [Initialize model with predict function](#toc7_8_2__) \n", + " - [Assign predictions with multiple columns](#toc7_8_3__) \n", + " - [Verify multiple columns in dataset](#toc7_8_4__) \n", + "- [Next 
steps](#toc8__) \n", + " - [Work with your model documentation](#toc8_1__) \n", + " - [Discover more learning resources](#toc8_2__) \n", + "- [Upgrade ValidMind](#toc9__) \n", + "\n", + ":::\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
\n", + "\n", + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind library. 
To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", + ")\n", + "\n", + "raw_df = demo_dataset.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Preprocess the raw dataset\n", + "\n", + "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", + "\n", + "- Preprocess the data: Splits the DataFrame (`raw_df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess` to simplify preprocessing.\n", + "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`)." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)\n", + "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", + "y_val = validation_df[demo_dataset.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Train models for testing\n", + "\n", + "- Initialize XGBoost and Logistic Regression Classifiers" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.linear_model import LogisticRegression\n", + "import xgboost\n", + "\n", + "%matplotlib inline\n", + "\n", + "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", + "xgb.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "xgb.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")\n", + "\n", + "lr = LogisticRegression(random_state=0)\n", + "lr.fit(\n", + " x_train,\n", + " y_train,\n", + ")\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Initialize ValidMind objects\n", + "\n", + "\n", + "\n", + "### Initialize the ValidMind models" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model_xgb = vm.init_model(\n", + " xgb,\n", + " input_id=\"xgb\",\n", + ")\n", + "vm_model_lr = vm.init_model(\n", + " lr,\n", + " input_id=\"lr\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the 
[`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", + "- `class_labels` — an optional value to map predicted classes to class labels\n", + "\n", + "With all datasets ready, you can now initialize the raw, training and test datasets (`raw_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_ds = vm.init_dataset(\n", + " input_id=\"raw_dataset\",\n", + " dataset=raw_df,\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_dataset\",\n", + " dataset=train_df,\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Options to load predictions using the ValidMind Library\n", + "\n", + "\n", + "\n", + "### Load predictions from a file\n", + "\n", + "This creates a new column called `_prediction` in the dataset and assigns metadata to track that the `_prediction` column is linked to the model ``" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Predictions calculated outside of VM" + ] + }, + { + "cell_type": 
"code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "train_xgb_prediction = pd.DataFrame(xgb.predict(x_train), columns=[\"xgb_prediction\"])\n", + "test__xgb_prediction = pd.DataFrame(xgb.predict(x_val), columns=[\"xgb_prediction\"])\n", + "\n", + "train_lr_prediction = pd.DataFrame(lr.predict(x_train), columns=[\"lr_prediction\"])\n", + "test_lr_prediction = pd.DataFrame(lr.predict(x_val), columns=[\"lr_prediction\"])" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Assign predictions to the training dataset\n", + "\n", + "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model_xgb, prediction_values=train_xgb_prediction.xgb_prediction.values\n", + ")\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_model_lr, prediction_values=train_lr_prediction.lr_prediction.values\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Run an example test\n", + "\n", + "Now, let's run an example test such as `MinimumAccuracy` twice to show how we're able to load the correct model predictions by using the `model` input parameter, even though we're passing the same `train_ds` dataset instance to the test:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\n", + " 
\"dataset\": vm_train_ds,\n", + " \"model\": vm_model_lr,\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Link an existing prediction column in the dataset with a model\n", + "\n", + "This approach allows loading datasets that already have prediction columns in addition to feature and target columns. The ValidMind Library assigns metadata to track the predictions column that are linked to a given `` model." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df2 = train_df.copy()\n", + "train_df2[\"xgb_prediction\"] = train_xgb_prediction.xgb_prediction.values\n", + "train_df2[\"lr_prediction\"] = train_lr_prediction.lr_prediction.values\n", + "train_df2.head(5)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "feature_columns = [\n", + " \"CreditScore\",\n", + " \"Gender\",\n", + " \"Age\",\n", + " \"Tenure\",\n", + " \"Balance\",\n", + " \"NumOfProducts\",\n", + " \"HasCrCard\",\n", + " \"IsActiveMember\",\n", + " \"EstimatedSalary\",\n", + " \"Geography_France\",\n", + " \"Geography_Germany\",\n", + " \"Geography_Spain\",\n", + "]\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df2,\n", + " input_id=\"train_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Link prediction column to a specific model\n", + "\n", + "The `prediction_column` parameter informs the `Dataset` object about the model that should be linked to that column." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_xgb, prediction_column=\"xgb_prediction\")\n", + "vm_train_ds.assign_predictions(model=vm_model_lr, prediction_column=\"lr_prediction\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "wE0OckXjSPc7" + }, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_lr},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Link an existing prediction column in the dataset with a model\n", + "\n", + "This lets the ValidMind Library run model predictions, create a new column whose name ends in `_prediction`, and assign metadata to track that this column is linked to the model.\n", + "\n", + "There are two ways to run and assign model predictions with the ValidMind Library:\n", + "\n", + "- When initializing a `Dataset` with `init_dataset()`. This is the most straightforward method to assign predictions for a single model.\n", + "- Using `dataset.assign_predictions()`. This allows assigning predictions to a dataset for one or more models."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Pass `` in dataset interface" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "feature_columns = [\n", + " \"CreditScore\",\n", + " \"Gender\",\n", + " \"Age\",\n", + " \"Tenure\",\n", + " \"Balance\",\n", + " \"NumOfProducts\",\n", + " \"HasCrCard\",\n", + " \"IsActiveMember\",\n", + " \"EstimatedSalary\",\n", + " \"Geography_France\",\n", + " \"Geography_Germany\",\n", + " \"Geography_Spain\",\n", + "]\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " model=vm_model_xgb,\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Through `assign_predictions` interface" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Perform predictions using the same `assign_predictions` interface" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", + "vm_train_ds.assign_predictions(model=vm_model_lr)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Run an example test\n", + "\n", + "Now, let's run an example test such as `MinimumAccuracy` twice to show how we're able to load the correct model predictions by using the `model` input parameter, even though we're passing the same `vm_train_ds` dataset instance to the test:" + ] +
}, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\n", + " \"dataset\": vm_train_ds,\n", + " \"model\": vm_model_lr,\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Using `predict_fn` to store multiple columns\n", + "\n", + "The `predict_fn` parameter in `vm.init_model()` allows you to create models that return multiple pieces of information when making predictions. This is particularly useful when you want to capture additional metadata, confidence scores, feature importance, or any other model-related information alongside the main prediction.\n", + "\n", + "When your predict function returns a dictionary, ValidMind automatically creates a separate column for each key when you run `assign_predictions()`."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Create enhanced predict function\n", + "\n", + "Let's create a predict function that wraps our XGBoost model and returns multiple pieces of information:\n", + "- **prediction**: The main class prediction\n", + "- **prediction_proba**: The prediction probabilities for both classes\n", + "- **confidence**: The maximum probability as a confidence score\n", + "- **model_info**: Metadata about the model used" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "def enhanced_xgb_predict_fn(input_data):\n", + " \"\"\"\n", + " Enhanced predict function that returns multiple pieces of information.\n", + " \n", + " Args:\n", + " input_data: Input features for prediction (single row as dictionary when called by ValidMind)\n", + " \n", + " Returns:\n", + " dict: Dictionary containing prediction, probabilities, confidence, and model info\n", + " \"\"\"\n", + " # Define the feature columns that the model was trained on\n", + " # These are the same columns from x_train (excluding the target column 'Exited')\n", + " training_features = [\n", + " 'CreditScore', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts',\n", + " 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Geography_France',\n", + " 'Geography_Germany', 'Geography_Spain'\n", + " ]\n", + " \n", + " # Convert dictionary input to DataFrame for model prediction\n", + " # When called by ValidMind, input_data is a single row dictionary\n", + " if isinstance(input_data, dict):\n", + " # Filter to only include training features and convert to DataFrame\n", + " filtered_data = {key: value for key, value in input_data.items() if key in training_features}\n", + " input_df = pd.DataFrame([filtered_data])\n", + " \n", + " # Ensure all training features are present (in case some are missing)\n", + " for feature in training_features:\n", + " if feature not 
in input_df.columns:\n", + " input_df[feature] = 0 # Default value for missing features\n", + " \n", + " # Reorder columns to match training order\n", + " input_df = input_df[training_features]\n", + " else:\n", + " # Handle other input types (DataFrame, array, etc.)\n", + " input_df = pd.DataFrame(input_data) if not isinstance(input_data, pd.DataFrame) else input_data\n", + " # Filter to training features if it's a DataFrame\n", + " if isinstance(input_df, pd.DataFrame):\n", + " input_df = input_df[training_features]\n", + " \n", + " # Make predictions\n", + " prediction = xgb.predict(input_df)\n", + " prediction_proba = xgb.predict_proba(input_df)\n", + " \n", + " # Since we're processing one row at a time, extract the single values\n", + " single_prediction = prediction[0] if len(prediction) > 0 else None\n", + " single_proba = prediction_proba[0] if len(prediction_proba) > 0 else None\n", + " \n", + " # Calculate confidence as the maximum probability for this prediction\n", + " confidence = np.max(single_proba) if single_proba is not None else None\n", + " \n", + " # Create model metadata\n", + " model_info = {\n", + " \"model_type\": \"XGBClassifier\",\n", + " \"n_estimators\": xgb.n_estimators,\n", + " \"max_depth\": xgb.max_depth,\n", + " \"feature_count\": len(training_features),\n", + " \"features_used\": training_features\n", + " }\n", + " \n", + " return {\n", + " \"prediction\": single_prediction,\n", + " \"prediction_proba\": single_proba.tolist() if single_proba is not None else None,\n", + " \"confidence\": confidence,\n", + " \"model_info\": model_info\n", + " }\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Initialize model with predict function\n", + "\n", + "Now we'll create a ValidMind model using the `predict_fn` parameter. 
This tells ValidMind to use our enhanced function instead of the model's default `predict()` method:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize ValidMind model with the enhanced predict function\n", + "vm_model_enhanced_xgb = vm.init_model(\n", + " model=xgb,\n", + " input_id=\"enhanced_xgb\",\n", + " predict_fn=enhanced_xgb_predict_fn \n", + ")\n", + "\n", + "print(f\"Enhanced XGBoost model initialized with input_id: {vm_model_enhanced_xgb.input_id}\")\n", + "print(\"This model now uses the predict function that handles dictionary inputs correctly\")\n", + "print(\"It will return multiple columns when predictions are assigned to datasets\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Assign predictions with multiple columns\n", + "\n", + "When we use `assign_predictions()` with our enhanced model, ValidMind will automatically create separate columns for each key returned by our predict function. Let's assign predictions to our test dataset:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Create a fresh dataset for this demonstration\n", + "vm_test_ds_enhanced = vm.init_dataset(\n", + " input_id=\"test_dataset_enhanced\",\n", + " dataset=test_df,\n", + " target_column=demo_dataset.target_column\n", + ")\n", + "\n", + "# This will create multiple columns based on the keys returned by our predict function\n", + "vm_test_ds_enhanced.assign_predictions(model=vm_model_enhanced_xgb)\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Verify multiple columns in dataset\n", + "\n", + "Let's examine the dataset to see all the columns that were created by our enhanced predict function. 
Each key from the returned dictionary becomes a separate column with the model's `input_id` as a prefix:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_ds_enhanced._df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ], + "id": "copyright-76fcd2c215674068b812492b7c639056" + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 0 } diff --git a/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb b/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb index f727d405d..190f8a50a 100644 --- a/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb +++ b/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb @@ -1,997 +1,999 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Introduction to ValidMind Dataset and Model Objects\n", - "\n", - "When writing custom tests, it is essential to be aware of the interfaces of the ValidMind Dataset and ValidMind Model, which are used as input arguments.\n", - "\n", - "As a model developer, writing custom tests is beneficial when the ValidMind library lacks a built-in test for your specific needs. For example, a model might require new tests to evaluate specific aspects of the model or dataset based on a particular use case.\n", - "\n", - "This interactive notebook offers a detailed understanding of ValidMind objects and their use in writing custom tests. It introduces various interfaces provided by these objects and demonstrates how they can be leveraged to implement tests effortlessly." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - "- [Load the demo dataset](#toc3__) \n", - " - [Prepocess the raw dataset](#toc3_1__) \n", - "- [Train a model for testing](#toc4__) \n", - "- [Explore basic components of the ValidMind library](#toc5__) \n", - " - [VMDataset Object](#toc5_1__) \n", - " - [Initialize the ValidMind datasets](#toc5_1_1__) \n", - " - [ Interfaces of the dataset object](#toc5_1_2__) \n", - " - [Using VM Dataset object as arguments in custom tests](#toc5_2__) \n", - " - [Run the test](#toc5_2_1__) \n", - " - [Using VM Dataset object and parameters as arguments in custom tests](#toc5_3__) \n", - " - [VMModel Object](#toc5_4__) \n", - " - [Initialize ValidMind model object](#toc5_5__) \n", - " - [Assign predictions to the datasets](#toc5_6__) \n", - " - [Using VM Model and Dataset objects as arguments in Custom tests](#toc5_7__) \n", - " - [Log the test results](#toc5_8__) \n", - "- [In summary](#toc6__) \n", - "- [Discover more learning resources](#toc7__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
\n", - "\n", - "\n", - "\n", - "### Key concepts\n", - "\n", - "Here, we will focus on ValidMind dataset, ValidMind model and tests to use these objects to generate artefacts for the documentation.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - "- **model**: A single ValidMind model object that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single ValidMind dataset object that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Tests can return elements like tables or plots. 
Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Dataset based Test**\n", - "\n", - "![Dataset based test architecture](./dataset_image.png)\n", - "The dataset based tests take VM dataset object(s) as inputs, test configuration as test parameters to produce `Outputs` as mentioned above.\n", - "\n", - "**Model based Test**\n", - "\n", - "![Model based test architecture](./model_image.png)\n", - "Similar to datasest based tests, the model based tests as an additional input that is VM model object. It allows to identify prediction values of a specific model in the dataset object. " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "Please note the following recommended Python versions to use:\n", - "\n", - "- Python 3.7 > x <= 3.11\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "metadata": {} - }, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "\n", - "import xgboost as xgb" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Load the demo dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "raw_df = demo_dataset.load_data()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Prepocess the raw dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Train a model for testing\n", - "\n", - "We train a simple customer churn model for our test." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", - "y_val = validation_df[demo_dataset.target_column]\n", - "\n", - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Explore basic components of the ValidMind library\n", - "\n", - "In this section, you will learn about the basic objects of the ValidMind library that are necessary to implement both custom and built-in tests. As explained above, these objects are:\n", - "* VMDataset: [The high level APIs can be found here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset)\n", - "* VMModel: [The high level APIs can be found here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMModel)\n", - "\n", - "Let's understand these objects and their interfaces step by step: " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### VMDataset Object" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Initialize the ValidMind datasets\n", - "\n", - "You can initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "The function wraps the dataset to create a ValidMind `Dataset` object so that you can write tests effectively using the common interface provided by the VM objects. 
This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind. You only need to do it one time per dataset.\n", - "\n", - "This function takes a number of arguments. Some of the arguments are:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", - "\n", - "The detailed list of the arguments can be found [here](https://docs.validmind.ai/validmind/validmind.html#init_dataset) " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# vm_raw_dataset is now a VMDataset object that you can pass to any ValidMind test\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=\"Exited\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once you have a ValidMind dataset object (VMDataset), you can inspect its attributes and methods using the inspect_obj utility module. This module provides a list of available attributes and interfaces for use in tests. Understanding how to use VMDatasets is crucial for comprehending how a custom test functions." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import inspect_obj\n", - "inspect_obj(vm_raw_dataset)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Interfaces of the dataset object\n", - "\n", - "**DataFrame**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.df" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Feature columns**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.feature_columns" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Target column**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.target_column" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Features values**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.x_df()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Target value**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.y_df()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Numeric feature columns** " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.feature_columns_numeric" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Categorical feature columns** " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.feature_columns_categorical" - ] - }, - { - "cell_type": "markdown", 
- "metadata": {}, - "source": [ - "Similarly, you can use all other interfaces of the [VMDataset objects](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset) " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Using VM Dataset object as arguments in custom tests\n", - "\n", - "A custom test is simply a Python function that takes two types of arguments: `inputs` and `params`. The `inputs` are ValidMind objects (`VMDataset`, `VMModel`), and the `params` are additional parameters required for the underlying computation of the test. We will discuss both types of arguments in the following sections.\n", - "\n", - "Let's start with a custom test that requires only a ValidMind dataset object. In this example, we will check the balance of classes in the target column of the dataset:\n", - "\n", - "- The custom test below requires a single argument of type `VMDataset` (dataset).\n", - "- The `my_custom_tests.ClassImbalance` is a unique test identifier that can be assigned using the `vm.test` decorator functionality. This unique test ID will be used in the platform to load test results in the documentation.\n", - "- The `dataset.target_column` and `dataset.df` attributes of the `VMDataset` object are used in the test.\n", - "\n", - "Other high-level APIs (attributes and methods) of the dataset object are listed [here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset).\n", - "\n", - "If you've gone through the [Implement custom tests notebook](../tests/custom_tests/implement_custom_tests.ipynb), you should have a good understanding of how custom tests are implemented in details. If you haven't, we recommend going through that notebook first." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.vm_models.dataset.dataset import VMDataset\n", - "import pandas as pd\n", - "\n", - "@vm.test(\"my_custom_tests.ClassImbalance\")\n", - "def class_imbalance(dataset):\n", - " # Can only run this test if we have a Dataset object\n", - " if not isinstance(dataset, VMDataset):\n", - " raise ValueError(\"ClassImbalance requires a validmind Dataset object\")\n", - "\n", - " if dataset.target_column is None:\n", - " print(\"Skipping class_imbalance test because no target column is defined\")\n", - " return\n", - "\n", - " # VMDataset object provides target_column attribute\n", - " target_column = dataset.target_column\n", - " # we can access pandas DataFrame using df attribute\n", - " imbalance_percentages = dataset.df[target_column].value_counts(\n", - " normalize=True\n", - " )\n", - " classes = list(imbalance_percentages.index) \n", - " percentages = list(imbalance_percentages.values * 100)\n", - "\n", - " return pd.DataFrame({\"Classes\":classes, \"Percentage\": percentages})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Run the test\n", - "\n", - "Let's run the test using the `run_test` method, which is part of the `validmind.tests` module. Here, we pass the `dataset` through the `inputs`. Similarly, you can pass `datasets`, `model`, or `models` as inputs if your custom test requires them. In this example below, we run the custom test `my_custom_tests.ClassImbalance` by passing the `dataset` through the `inputs`. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import run_test\n", - "result = run_test(\n", - " test_id=\"my_custom_tests.ClassImbalance\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can move custom tests into separate modules in a folder. This lets you take one-off tests and move them into an organized structure that makes them easier to manage, maintain, and share. We have provided a separate notebook with a detailed explanation [here](../tests/custom_tests/integrate_external_test_providers.ipynb) " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Using VM Dataset object and parameters as arguments in custom tests\n", - "\n", - "Similar to `inputs`, you can pass `params` to a custom test by providing a dictionary of parameters to the `run_test()` function. The parameters will override any default parameters set in the custom test definition. Note that the `dataset` is still passed as `inputs`. \n", - "Let's modify the class imbalance test so that it provides flexibility to `normalize` the results." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.vm_models.dataset.dataset import VMDataset\n", - "import pandas as pd\n", - "\n", - "@vm.test(\"my_custom_tests.ClassImbalance\")\n", - "def class_imbalance(dataset, normalize=True):\n", - " # Can only run this test if we have a Dataset object\n", - " if not isinstance(dataset, VMDataset):\n", - " raise ValueError(\"ClassImbalance requires a validmind Dataset object\")\n", - "\n", - " if dataset.target_column is None:\n", - " print(\"Skipping class_imbalance test because no target column is defined\")\n", - " return\n", - "\n", - " # VMDataset object provides target_column attribute\n", - " target_column = dataset.target_column\n", - " # we can access pandas DataFrame using df attribute\n", - " imbalance_percentages = dataset.df[target_column].value_counts(\n", - " normalize=normalize\n", - " )\n", - " classes = list(imbalance_percentages.index) \n", - " if normalize: \n", - " result = pd.DataFrame({\"Classes\":classes, \"Percentage\": list(imbalance_percentages.values*100)})\n", - " else:\n", - " result = pd.DataFrame({\"Classes\":classes, \"Count\": list(imbalance_percentages.values)})\n", - " return result" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In this example, the `normalize` parameter is set to `True`, so the test reports each class as a percentage of all rows. Set it to `False` if you want raw class counts instead. The test output changes with the parameter passed: a `Percentage` column when normalized, a `Count` column otherwise.\n", - "\n", - "Here, we pass the `dataset` through `inputs` and the `normalize` parameter through `params`." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import run_test\n", - "result = run_test(\n", - " test_id = \"my_custom_tests.ClassImbalance\",\n", - " inputs={\"dataset\": vm_raw_dataset},\n", - " params={\"normalize\": True},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### VMModel Object" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize ValidMind model object\n", - "\n", - "Similar to the ValidMind `Dataset` object, you can initialize a ValidMind Model object using the [`init_model`](https://docs.validmind.ai/validmind/validmind.html#init_model) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments. Some of the arguments are:\n", - "\n", - "- `model` — the raw model that you want to evaluate\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "\n", - "The detailed list of the arguments can be found [here](https://docs.validmind.ai/validmind/validmind.html#init_model) " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "vm_model = vm.init_model(\n", - " model=model,\n", - " input_id=\"xgb_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's inspect the methods and attributes of the model now:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "inspect_obj(vm_model)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Assign predictions to the datasets\n", - "\n", - "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_dataset\",\n", - " dataset=train_df,\n", - " type=\"generic\",\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "vm_train_ds.assign_predictions(model=vm_model)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can see below, the extra prediction column (`xgb_model_prediction`) for the model (`xgb_model`) has been added in the dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(vm_train_ds)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Using VM Model and Dataset objects as arguments in Custom tests\n", - "\n", - "We will now create a `@vm.test` wrapper that will allow you to create a reusable test. Note the following changes in the code below:\n", - "\n", - "- The function `confusion_matrix` takes two arguments `dataset` and `model`. This is a `VMDataset` and `VMModel` object respectively.\n", - " - `VMDataset` objects allow you to access the dataset's true (target) values by accessing the `.y` attribute.\n", - " - `VMDataset` objects allow you to access the predictions for a given record (model) by accessing the `.y_pred()` method.\n", - "- The function docstring provides a description of what the test does. 
This will be displayed along with the result in this notebook as well as in the ValidMind Platform.\n", - "- The function body calculates the confusion matrix using the `sklearn.metrics.confusion_matrix` function.\n", - "- The function then returns the `ConfusionMatrixDisplay.figure_` object - this is important as the ValidMind Library expects the output of the custom test to be a plot or a table.\n", - "- The `@vm.test` decorator is doing the work of creating a wrapper around the function that will allow it to be run by the ValidMind Library. It also registers the test so it can be found by the ID `my_custom_tests.ConfusionMatrix` (see the section below on how test IDs work in ValidMind and why this format is important)\n", - "\n", - "Similarly, you can use the functionality provided by `VMDataset` and `VMModel` objects. You can refer to our documentation page for all the available APIs [here](https://docs.validmind.ai/validmind/validmind.html#init_dataset)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn import metrics\n", - "import matplotlib.pyplot as plt\n", - "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", - "def confusion_matrix(dataset, model):\n", - " \"\"\"The confusion matrix is a table that is often used to describe the performance of a classification model on a set of data for which the true values are known.\n", - "\n", - " The confusion matrix is a 2x2 table that contains 4 values:\n", - "\n", - " - True Positive (TP): the number of correct positive predictions\n", - " - True Negative (TN): the number of correct negative predictions\n", - " - False Positive (FP): the number of incorrect positive predictions\n", - " - False Negative (FN): the number of incorrect negative predictions\n", - "\n", - " The confusion matrix can be used to assess the holistic performance of a classification model by showing the accuracy, precision, recall, and F1 score of the 
model on a single figure.\n", - " \"\"\"\n", - " # the target values are available through the dataset's y attribute\n", - " y_true = dataset.y\n", - " # the predictions of a specific model come from the y_pred method\n", - " y_pred = dataset.y_pred(model=model)\n", - "\n", - " confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", - "\n", - " cm_display = metrics.ConfusionMatrixDisplay(\n", - " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", - " )\n", - " cm_display.plot()\n", - " plt.close()\n", - "\n", - " return cm_display.figure_ # return the figure object itself" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Here, we run the test using two inputs: `dataset` and `model`. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import run_test\n", - "result = run_test(\n", - " test_id = \"my_custom_tests.ConfusionMatrix\",\n", - " inputs={\n", - " \"dataset\": vm_train_ds,\n", - " \"model\": vm_model,\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Log the test results\n", - "\n", - "You can log any test result to the ValidMind Platform with the `.log()` method of the result object. This will allow you to add the result to the documentation.\n", - "\n", - "You can now do the same for the confusion matrix results." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## In summary\n", - "\n", - "In this notebook you have learned the end-to-end process to document a model with the ValidMind Library, running through some very common scenarios in a typical model development setting:\n", - "\n", - "- Running out-of-the-box tests\n", - "- Documenting your model by adding evidence to model documentation\n", - "- Extending the capabilities of the ValidMind Library by implementing custom tests\n", - "- Ensuring that the documentation is complete by running all tests in the documentation template" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-9be1890525a54c10be782f80fe33833f", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.14" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Introduction to ValidMind Dataset and Model Objects\n", + "\n", + "When writing custom tests, it is essential to be aware of the interfaces of the ValidMind Dataset and ValidMind Model, which are used as input arguments.\n", + "\n", + "As a model developer, writing custom tests is beneficial when the ValidMind library lacks a built-in test for your specific needs. For example, a model might require new tests to evaluate specific aspects of the model or dataset based on a particular use case.\n", + "\n", + "This interactive notebook offers a detailed understanding of ValidMind objects and their use in writing custom tests. It introduces various interfaces provided by these objects and demonstrates how they can be leveraged to implement tests effortlessly." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + "- [Load the demo dataset](#toc3__) \n", + " - [Preprocess the raw dataset](#toc3_1__) \n", + "- [Train a model for testing](#toc4__) \n", + "- [Explore basic components of the ValidMind library](#toc5__) \n", + " - [VMDataset Object](#toc5_1__) \n", + " - [Initialize the ValidMind datasets](#toc5_1_1__) \n", + " - [Interfaces of the dataset object](#toc5_1_2__) \n", + " - [Using VM Dataset object as arguments in custom tests](#toc5_2__) \n", + " - [Run the test](#toc5_2_1__) \n", + " - [Using VM Dataset object and parameters as arguments in custom tests](#toc5_3__) \n", + " - [VMModel Object](#toc5_4__) \n", + " - [Initialize ValidMind model object](#toc5_5__) \n", + " - [Assign predictions to the datasets](#toc5_6__) \n", + " - [Using VM Model and Dataset objects as arguments in Custom tests](#toc5_7__) \n", + " - [Log the test results](#toc5_8__) \n", + "- [In summary](#toc6__) \n", + "- [Discover more learning resources](#toc7__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
\n", + "\n", + "\n", + "\n", + "### Key concepts\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + " - **dataset-based test**\n", + "\n", + " ![Dataset based test architecture](./dataset_image.png)\n", + " Dataset-based tests take VM dataset objects as inputs, can be configured with values passed in as parameters, and return outputs such as tables, plots, or images.\n", + "\n", + " - **model-based test**:\n", + "\n", + " ![Model based test architecture](./model_image.png)\n", + " Similar to dataset-based tests, model-based tests take additional VM model objects as inputs alongside VM dataset objects. The VM model object can wrap any type of record and is used to obtain prediction values for entries in the dataset.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [test_suites](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "Please note the following recommended Python versions to use:\n", + "\n", + "- Python 3.7 < x <= 3.11\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": { + "metadata": {} + }, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%matplotlib inline\n", + "\n", + "import xgboost as xgb" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Load the demo dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "raw_df = demo_dataset.load_data()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Preprocess the raw dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Train a model for testing\n", + "\n", + "We train a simple customer churn model for our test." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", + "y_val = validation_df[demo_dataset.target_column]\n", + "\n", + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Explore basic components of the ValidMind library\n", + "\n", + "In this section, you will learn about the basic objects of the ValidMind library that are necessary to implement both custom and built-in tests. As explained above, these objects are:\n", + "* VMDataset: [The high level APIs can be found here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset)\n", + "* VMModel: [The high level APIs can be found here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMModel)\n", + "\n", + "Let's understand these objects and their interfaces step by step: " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### VMDataset Object" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Initialize the ValidMind datasets\n", + "\n", + "You can initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "The function wraps the dataset to create a ValidMind `Dataset` object so that you can write tests effectively using the common interface provided by the VM objects. 
This step is necessary whenever you want to connect a dataset to documentation and produce test results through ValidMind, but you only need to do it once per dataset.\n", + "\n", + "This function takes a number of arguments. Some of the arguments are:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", + "\n", + "The detailed list of the arguments can be found [here](https://docs.validmind.ai/validmind/validmind.html#init_dataset) " + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# vm_raw_dataset is now a VMDataset object that you can pass to any ValidMind test\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=\"Exited\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once you have a ValidMind dataset object (`VMDataset`), you can inspect its attributes and methods using the `inspect_obj` utility function, which lists the attributes and interfaces available for use in tests. Understanding how to use `VMDataset` objects is key to understanding how a custom test works." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import inspect_obj\n", + "inspect_obj(vm_raw_dataset)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Interfaces of the dataset object\n", + "\n", + "**DataFrame**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.df" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Feature columns**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.feature_columns" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Target column**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.target_column" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Feature values**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.x_df()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Target values**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.y_df()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Numeric feature columns**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.feature_columns_numeric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Categorical feature columns**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.feature_columns_categorical" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", 
+ "metadata": {}, + "source": [ + "Similarly, you can use all the other interfaces of [`VMDataset` objects](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Using VM Dataset object as arguments in custom tests\n", + "\n", + "A custom test is simply a Python function that takes two types of arguments: `inputs` and `params`. The `inputs` are ValidMind objects (`VMDataset`, `VMModel`), and the `params` are additional parameters required for the underlying computation of the test. We will discuss both types of arguments in the following sections.\n", + "\n", + "Let's start with a custom test that requires only a ValidMind dataset object. In this example, we will check the balance of classes in the target column of the dataset:\n", + "\n", + "- The custom test below requires a single argument of type `VMDataset` (dataset).\n", + "- `my_custom_tests.ClassImbalance` is a unique test identifier assigned with the `vm.test` decorator. This unique test ID will be used in the platform to load test results in the documentation.\n", + "- The `dataset.target_column` and `dataset.df` attributes of the `VMDataset` object are used in the test.\n", + "\n", + "Other high-level APIs (attributes and methods) of the dataset object are listed [here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset).\n", + "\n", + "If you've gone through the [Implement custom tests notebook](../tests/custom_tests/implement_custom_tests.ipynb), you should have a good understanding of how custom tests are implemented in detail. If you haven't, we recommend going through that notebook first."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.vm_models.dataset.dataset import VMDataset\n", + "import pandas as pd\n", + "\n", + "@vm.test(\"my_custom_tests.ClassImbalance\")\n", + "def class_imbalance(dataset):\n", + "    # Can only run this test if we have a Dataset object\n", + "    if not isinstance(dataset, VMDataset):\n", + "        raise ValueError(\"ClassImbalance requires a validmind Dataset object\")\n", + "\n", + "    if dataset.target_column is None:\n", + "        print(\"Skipping class_imbalance test because no target column is defined\")\n", + "        return\n", + "\n", + "    # VMDataset object provides target_column attribute\n", + "    target_column = dataset.target_column\n", + "    # we can access pandas DataFrame using df attribute\n", + "    imbalance_percentages = dataset.df[target_column].value_counts(\n", + "        normalize=True\n", + "    )\n", + "    classes = list(imbalance_percentages.index)\n", + "    percentages = list(imbalance_percentages.values * 100)\n", + "\n", + "    return pd.DataFrame({\"Classes\": classes, \"Percentage\": percentages})" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Run the test\n", + "\n", + "Let's run the test using the `run_test` function, which is part of the `validmind.tests` module. In the example below, we run the custom test `my_custom_tests.ClassImbalance` by passing the `dataset` through the `inputs`. Similarly, you can pass `datasets`, `model`, or `models` as inputs if your custom test requires them. 
" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.tests import run_test\n", + "result = run_test(\n", + "    test_id=\"my_custom_tests.ClassImbalance\",\n", + "    inputs={\n", + "        \"dataset\": vm_raw_dataset\n", + "    }\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can move custom tests into separate modules in a folder. This lets you take one-off tests and move them into an organized structure that makes them easier to manage, maintain, and share. We have provided a separate notebook with a detailed explanation [here](../tests/custom_tests/integrate_external_test_providers.ipynb)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Using VM Dataset object and parameters as arguments in custom tests\n", + "\n", + "Similar to `inputs`, you can pass `params` to a custom test by providing a dictionary of parameters to the `run_test()` function. The parameters will override any default parameters set in the custom test definition. Note that the `dataset` is still passed as `inputs`.\n", + "\n", + "Let's modify the class imbalance test so that it provides flexibility to `normalize` the results."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.vm_models.dataset.dataset import VMDataset\n", + "import pandas as pd\n", + "\n", + "@vm.test(\"my_custom_tests.ClassImbalance\")\n", + "def class_imbalance(dataset, normalize=True):\n", + "    # Can only run this test if we have a Dataset object\n", + "    if not isinstance(dataset, VMDataset):\n", + "        raise ValueError(\"ClassImbalance requires a validmind Dataset object\")\n", + "\n", + "    if dataset.target_column is None:\n", + "        print(\"Skipping class_imbalance test because no target column is defined\")\n", + "        return\n", + "\n", + "    # VMDataset object provides target_column attribute\n", + "    target_column = dataset.target_column\n", + "    # we can access pandas DataFrame using df attribute\n", + "    imbalance_percentages = dataset.df[target_column].value_counts(\n", + "        normalize=normalize\n", + "    )\n", + "    classes = list(imbalance_percentages.index)\n", + "    if normalize:\n", + "        result = pd.DataFrame({\"Classes\": classes, \"Percentage\": list(imbalance_percentages.values * 100)})\n", + "    else:\n", + "        result = pd.DataFrame({\"Classes\": classes, \"Count\": list(imbalance_percentages.values)})\n", + "    return result" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this example, the `normalize` parameter is set to `True`, so the test reports each class as a percentage of the dataset. You can change the value to `False` if you want the raw class counts instead.\n", + "\n", + "Here, we have passed the `dataset` through the `inputs` and the `normalize` parameter using the `params`."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.tests import run_test\n", + "result = run_test(\n", + "    test_id=\"my_custom_tests.ClassImbalance\",\n", + "    inputs={\"dataset\": vm_raw_dataset},\n", + "    params={\"normalize\": True},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### VMModel Object" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize ValidMind model object\n", + "\n", + "Similar to the ValidMind `Dataset` object, you can initialize a ValidMind model object using the [`init_model`](https://docs.validmind.ai/validmind/validmind.html#init_model) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments. Some of the arguments are:\n", + "\n", + "- `model`: the raw model that you want to evaluate\n", + "- `input_id`: a unique identifier that allows tracking what inputs are used when running each individual test\n", + "\n", + "The detailed list of the arguments can be found [here](https://docs.validmind.ai/validmind/validmind.html#init_model)." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "vm_model = vm.init_model(\n", + "    model=model,\n", + "    input_id=\"xgb_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's inspect the methods and attributes of the model now:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "inspect_obj(vm_model)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Assign predictions to the datasets\n", + "\n", + "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + "    input_id=\"train_dataset\",\n", + "    dataset=train_df,\n", + "    type=\"generic\",\n", + "    target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "vm_train_ds.assign_predictions(model=vm_model)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As you can see below, the extra prediction column (`xgb_model_prediction`) for the model (`xgb_model`) has been added to the dataset." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(vm_train_ds)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Using VM Model and Dataset objects as arguments in Custom tests\n", + "\n", + "We will now create a `@vm.test` wrapper that will allow you to create a reusable test. Note the following changes in the code below:\n", + "\n", + "- The function `confusion_matrix` takes two arguments, `dataset` and `model`. These are `VMDataset` and `VMModel` objects, respectively.\n", + "  - `VMDataset` objects allow you to access the dataset's true (target) values by accessing the `.y` attribute.\n", + "  - `VMDataset` objects allow you to access the predictions for a given record (model) by calling the `.y_pred()` method.\n", + "- The function docstring provides a description of what the test does. 
This will be displayed along with the result in this notebook as well as in the ValidMind Platform.\n", + "- The function body calculates the confusion matrix using the `sklearn.metrics.confusion_matrix` function.\n", + "- The function then returns the `ConfusionMatrixDisplay.figure_` object. This is important because the ValidMind Library expects the output of the custom test to be a plot or a table.\n", + "- The `@vm.test` decorator is doing the work of creating a wrapper around the function that will allow it to be run by the ValidMind Library. It also registers the test so it can be found by the ID `my_custom_tests.ConfusionMatrix` (see the section below on how test IDs work in ValidMind and why this format is important).\n", + "\n", + "Similarly, you can use the functionality provided by `VMDataset` and `VMModel` objects. You can refer to our documentation page for all the available APIs [here](https://docs.validmind.ai/validmind/validmind.html#init_dataset)." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn import metrics\n", + "import matplotlib.pyplot as plt\n", + "\n", + "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", + "def confusion_matrix(dataset, model):\n", + "    \"\"\"The confusion matrix is a table that is often used to describe the performance of a classification model on a set of data for which the true values are known.\n", + "\n", + "    The confusion matrix is a 2x2 table that contains 4 values:\n", + "\n", + "    - True Positive (TP): the number of correct positive predictions\n", + "    - True Negative (TN): the number of correct negative predictions\n", + "    - False Positive (FP): the number of incorrect positive predictions\n", + "    - False Negative (FN): the number of incorrect negative predictions\n", + "\n", + "    The confusion matrix can be used to assess the holistic performance of a classification model by showing the accuracy, precision, recall, and F1 score of the model on a single figure.\n", + "    \"\"\"\n", 
+ "    # retrieve the true (target) values from the dataset through the y attribute\n", + "    y_true = dataset.y\n", + "    # retrieve the predictions of a specific model through the y_pred method\n", + "    y_pred = dataset.y_pred(model=model)\n", + "\n", + "    confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", + "\n", + "    cm_display = metrics.ConfusionMatrixDisplay(\n", + "        confusion_matrix=confusion_matrix, display_labels=[False, True]\n", + "    )\n", + "    cm_display.plot()\n", + "    plt.close()\n", + "\n", + "    return cm_display.figure_  # return the figure object itself" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here, we run the test using two inputs: `dataset` and `model`." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.tests import run_test\n", + "result = run_test(\n", + "    test_id=\"my_custom_tests.ConfusionMatrix\",\n", + "    inputs={\n", + "        \"dataset\": vm_train_ds,\n", + "        \"model\": vm_model,\n", + "    }\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Log the test results\n", + "\n", + "You can log any test result to the ValidMind Platform with the `.log()` method of the result object. This will allow you to add the result to the documentation.\n", + "\n", + "You can now do the same for the confusion matrix results."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## In summary\n", + "\n", + "In this notebook you have learned the end-to-end process to document a model with the ValidMind Library, running through some very common scenarios in a typical model development setting:\n", + "\n", + "- Running out-of-the-box tests\n", + "- Documenting your model by adding evidence to model documentation\n", + "- Extending the capabilities of the ValidMind Library by implementing custom tests\n", + "- Ensuring that the documentation is complete by running all tests in the documentation template" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ], + "id": "copyright-9be1890525a54c10be782f80fe33833f" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.14" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/how_to/metrics/log_metrics_over_time.ipynb b/notebooks/how_to/metrics/log_metrics_over_time.ipynb index 271b98727..c1465096b 100644 --- a/notebooks/how_to/metrics/log_metrics_over_time.ipynb +++ b/notebooks/how_to/metrics/log_metrics_over_time.ipynb @@ -1,969 +1,975 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Log metrics over time\n", - "\n", - "Learn how to track and visualize the temporal evolution of key record (model) performance metrics with ValidMind.\n", - "\n", - "While this notebook uses a traditional binary classification model to demonstrate, the same principles apply to logging performance metrics over time for any record (model) type registered with ValidMind — including agentic AI systems, generative LLM applications, and beyond. 
For example:\n", - "\n", - "- Key model performance metrics such as AUC, F1 score, precision, recall, and accuracy, are useful for analyzing the stability and trends in model performance indicators, helping to identify potential degradation or unexpected fluctuations in model behavior over time.\n", - "- By monitoring these metrics systematically, teams can detect early warning signs of model drift and take proactive measures to maintain model reliability.\n", - "- Unit metrics in ValidMind provide a standardized way to compute and track individual performance measures, making it easy to monitor specific aspects of model behavior.\n", - "\n", - "Log metrics over time with the ValidMind Library's [`log_metric()`](https://docs.validmind.ai/validmind/validmind.html#log_metric) function and visualize them in your documentation using the *Metric Over Time* block within the ValidMind Platform. This integration enables seamless tracking of record performance, supporting custom thresholds and facilitating the automation of alerts based on logged metrics.\n", - "\n", - "
Metrics over time are most commonly associated with the continued monitoring of a records's performance once it is deployed.\n", - "

\n", - "While you are able to add Metric Over Time blocks to documentation, we recommend first enabling ongoing monitoring for your record to maximize the potential of your performance data.
" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - "- [Load demo model](#toc3__) \n", - "- [Logging metrics](#toc4__) \n", - " - [Run unit metrics](#toc4_1__) \n", - " - [Log unit metrics over time](#toc4_2__) \n", - " - [Pass thresholds](#toc4_3__) \n", - " - [Log multiple metrics with custom thresholds](#toc4_4__) \n", - " - [Add acceptable performance flag](#toc4_5__) \n", - "- [Next steps](#toc5__) \n", - " - [Work with your model documentation](#toc5_1__) \n", - " - [Discover more learning resources](#toc5_2__) \n", - "- [Upgrade ValidMind](#toc6__) \n", - "\n", - ":::\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. 
If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "import numpy as np\n", - "\n", - "from datetime import datetime, timedelta\n", - "\n", - "from validmind.unit_metrics import list_metrics, describe_metric, run_metric\n", - "from validmind.api_client import log_metric\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Load demo model\n", - "\n", - "We'll use a classification model trained on customer churn data to demonstrate ValidMind's metric logging capabilities.\n", - "\n", - "- We'll employ a built-in classification dataset, process it through train-validation-test splits, and train an XGBoost classifier.\n", - "- The trained model and datasets are then initialized in ValidMind's framework, enabling us to track and monitor various performance metrics in the following sections." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", - ")\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", - "\n", - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]\n", - "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", - "y_val = validation_df[customer_churn.target_column]\n", - "\n", - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once the datasets and model are prepared for validation, let's initialize the ValidMind `dataset` and `model`, specifying features and targets columns.\n", - "\n", - "- The property `input_id` allows users to uniquely identify each dataset and model.\n", - "- This allows for the creation of multiple versions of datasets and models, enabling us to compute metrics by specifying which versions we want to use as inputs." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " class_labels=customer_churn.class_labels,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df, input_id=\"test_dataset\", target_column=customer_churn.target_column\n", - ")\n", - "\n", - "# Initialize the ValidMind model object wrapper so that it can be passed as input to tests or test suites\n", - "# ValidMind model objects can be any type of record you want to test, document, validate, or monitor\n", - "vm_model = vm.init_model(\n", - " model,\n", - " input_id=\"model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can now use the `assign_predictions()` method from the Dataset object to link existing predictions to any model. \n", - "\n", - "If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Logging metrics\n", - "\n", - "Next, we'll use ValidMind to track the temporal evolution of key model performance metrics.\n", - "\n", - "We'll set appropriate thresholds for each metric, enable automated alerting when performance drifts beyond acceptable boundaries, and demonstrate how these thresholds can be customized based on business requirements and risk tolerance levels." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "metrics = [metric for metric in list_metrics() if \"classification\" in metric]\n", - "\n", - "for metric_id in metrics:\n", - " describe_metric(metric_id)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Run unit metrics\n", - "\n", - "Compute individual metrics using ValidMind's *unit metrics* — single-value metrics that can be computed on a dataset and model. Use the `run_metric()` function from the `validmind.unit_metrics` module to calculate these metrics.\n", - "\n", - "The `run_metric()` function has a signature similar to `run_test()` from the `validmind.tests` module, but is specifically designed for unit metrics and takes the following arguments:\n", - "\n", - "- **`metric_id`:** The unique identifier for the metric (for example, `validmind.unit_metrics.classification.ROC_AUC`)\n", - "- **`inputs`:** A dictionary containing the input dataset and model or their respective input IDs\n", - "- **`params`:** A dictionary containing keyword arguments for the unit metric (optional, accepts any `kwargs` from the underlying sklearn implementation)\n", - "\n", - "`run_metric()` returns and displays a result object similar to a regular ValidMind test, but only shows the unit metric value. While this result object has a `.log()` method for logging to the ValidMind Platform, in this use case we'll use unit metrics to compute performance metrics and then log them over time using the `log_metric()` function from the `validmind.api_client` module." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.ROC_AUC\",\n", - " inputs={\n", - " \"model\": vm_model,\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "auc = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.Accuracy\",\n", - " inputs={\n", - " \"model\": vm_model,\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "accuracy = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.Recall\",\n", - " inputs={\n", - " \"model\": vm_model,\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "recall = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "f1 = run_metric(\n", - " \"validmind.unit_metrics.classification.F1\",\n", - " inputs={\n", - " \"model\": vm_model,\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "f1 = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "precision = run_metric(\n", - " \"validmind.unit_metrics.classification.Precision\",\n", - " inputs={\n", - " \"model\": vm_model,\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "precision = result.metric" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Log unit metrics over time\n", - "\n", - "Using the `log_metric()` function from the `validmind.api_client` module, let's log the unit metrics over time. 
This function takes the following arguments:\n", - "\n", - "- **`key`:** The name of the metric to log\n", - "- **`value`:** The value of the metric to log\n", - "- **`recorded_at`:** The timestamp of the metric to log — useful for logging historic predictions\n", - "- **`thresholds`:** A dictionary containing the thresholds for the metric to log\n", - "- **`params`:** A dictionary containing the keyword arguments for the unit metric (in this case, none are required, but we can pass any `kwargs` that the underlying sklearn implementation accepts)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "log_metric(\n", - " key=\"AUC Score\",\n", - " value=auc,\n", - " # If `recorded_at` is not included, the time at function run is logged\n", - " recorded_at=datetime(2024, 1, 1), \n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To visualize the logged metric, we'll use the **[Metrics Over Time block](https://docs.validmind.ai/guide/monitoring/work-with-metrics-over-time.html)** in the ValidMind Platform:\n", - "\n", - "- After adding this visualization block to your documentation or ongoing monitoring report (as shown in the image below), you'll be able to review your logged metrics plotted over time.\n", - "- In this example, since we've only logged a single data point, the visualization shows just one measurement.\n", - "- As you continue logging metrics, the graph will populate with more points, enabling you to track trends and patterns.\n", - "\n", - "![Metric Over Time block](./add_metric_over_time_block.png)\n", - "![AUC Score](./log_metric_auc_1.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Pass thresholds\n", - "\n", - "We can pass *thresholds* to the `log_metric()` function to enhance the metric over time: \n", - "\n", - "- This is useful for visualizing the metric over time and identifying potential issues. 
\n", - "- The metric visualization component provides a dynamic way to monitor and contextualize metric values through customizable thresholds. \n", - "- These thresholds appear as horizontal reference lines on the chart. \n", - "- The system always displays the most recent threshold configuration, meaning that if you update threshold values in your client application, the visualization will reflect these changes immediately. \n", - "\n", - "When a metric is logged without thresholds or with an empty threshold dictionary, the reference lines gracefully disappear from the chart, though the metric line itself remains visible. \n", - "\n", - "Thresholds are highly flexible in their implementation. You can define them with any meaningful key names (such as `low_risk`, `maximum`, `target`, or `acceptable_range`) in your metric data, and the visualization will adapt accordingly. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "log_metric(\n", - " key=\"AUC Score\",\n", - " value=auc,\n", - " recorded_at=datetime(2024, 1, 1),\n", - " thresholds={\n", - " \"min_auc\": 0.7,\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![AUC Score](./log_metric_auc_2.png)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "log_metric(\n", - " key=\"AUC Score\",\n", - " value=auc,\n", - " recorded_at=datetime(2024, 1, 1),\n", - " thresholds={\n", - " \"high_risk\": 0.6,\n", - " \"medium_risk\": 0.7,\n", - " \"low_risk\": 0.8,\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![AUC Score](./log_metric_auc_3.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Log multiple metrics with custom thresholds\n", - "\n", - "The following code snippet shows an example of how to set up and log multiple performance metrics with 
custom thresholds for each metric:\n", - "\n", - "- Using AUC, F1, Precision, Recall, and Accuracy scores as examples, it demonstrates how to define different risk levels (high, medium, low) appropriate for each metric's expected range.\n", - "- The code simulates 10 days of metric history by applying a gradual decay and random noise to help visualize how metrics might drift over time in a production environment." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "NUM_DAYS = 10\n", - "REFERENCE_DATE = datetime(2024, 1, 1) # Fixed date: January 1st, 2024\n", - "base_date = REFERENCE_DATE - timedelta(days=NUM_DAYS)\n", - "\n", - "# Initial values with their specific thresholds\n", - "performance_metrics = {\n", - " \"AUC Score\": {\n", - " \"value\": auc,\n", - " \"thresholds\": {\n", - " \"high_risk\": 0.7,\n", - " \"medium_risk\": 0.8,\n", - " \"low_risk\": 0.9,\n", - " }\n", - " },\n", - " \"F1 Score\": {\n", - " \"value\": f1,\n", - " \"thresholds\": {\n", - " \"high_risk\": 0.5,\n", - " \"medium_risk\": 0.6,\n", - " \"low_risk\": 0.7,\n", - " }\n", - " },\n", - " \"Precision Score\": {\n", - " \"value\": precision,\n", - " \"thresholds\": {\n", - " \"high_risk\": 0.6,\n", - " \"medium_risk\": 0.7,\n", - " \"low_risk\": 0.8,\n", - " }\n", - " },\n", - " \"Recall Score\": {\n", - " \"value\": recall,\n", - " \"thresholds\": {\n", - " \"high_risk\": 0.4,\n", - " \"medium_risk\": 0.5,\n", - " \"low_risk\": 0.6,\n", - " }\n", - " },\n", - " \"Accuracy Score\": {\n", - " \"value\": accuracy,\n", - " \"thresholds\": {\n", - " \"high_risk\": 0.75,\n", - " \"medium_risk\": 0.8,\n", - " \"low_risk\": 0.85,\n", - " }\n", - " }\n", - "}\n", - "\n", - "# Trend parameters\n", - "trend_factor = 0.98 # Slight downward trend\n", - "noise_scale = 0.02 # Random fluctuation of ±2%\n", - "\n", - "for i in range(NUM_DAYS):\n", - " recorded_at = base_date + timedelta(days=i)\n", - " print(f\"\\nrecorded_at: 
{recorded_at}\")\n", - "\n", - " # Log each metric with trend and noise\n", - " for metric_name, metric_info in performance_metrics.items():\n", - " base_value = metric_info[\"value\"]\n", - " thresholds = metric_info[\"thresholds\"]\n", - " \n", - " # Apply trend and add random noise\n", - " trend = base_value * (trend_factor ** i)\n", - " noise = np.random.normal(0, noise_scale * base_value)\n", - " value = max(0, min(1, trend + noise)) # Ensure value stays between 0 and 1\n", - " \n", - " log_metric(\n", - " key=metric_name,\n", - " value=value,\n", - " recorded_at=recorded_at.isoformat(),\n", - " thresholds=thresholds\n", - " )\n", - " \n", - " print(f\"{metric_name:<15}: {value:.4f} (Thresholds: {thresholds})\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![AUC Score](./log_metric_auc_4.png)\n", - "![Accuracy Score](./log_metric_accuracy.png)\n", - "![Precision Score](./log_metric_precision.png)\n", - "![Recall Score](./log_metric_recall.png)\n", - "![F1 Score](./log_metric_f1.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Add acceptable performance flag\n", - "\n", - "The `passed` parameter in the `log_metric()` function allows you to explicitly mark whether a specific metric value should be considered \"Satisfactory\" or \"Requires Attention\":\n", - " - When `passed=True`: A green \"Satisfactory\" badge appears on the chart, indicating the metric value meets your acceptance criteria.\n", - " - When `passed=False`: A yellow \"Requires Attention\" badge appears, highlighting potential concerns that may require investigation." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the example below, the `passed=True` parameter adds a green \"Satisfactory\" badge to the GINI Score metric visualization, instantly indicating that the 0.75 value meets acceptable performance standards by being above the `medium_risk` threshold of 0.6:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "log_metric(\n", - " key=\"GINI Score\",\n", - " value=0.75,\n", - " recorded_at=datetime(2025, 6, 7),\n", - " thresholds = {\n", - " \"high_risk\": 0.5,\n", - " \"medium_risk\": 0.6,\n", - " \"low_risk\": 0.8,\n", - " },\n", - " passed=True\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![GINI Score](./log_metric_satisfactory.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In this example, the `passed=False` parameter adds a yellow \"Requires Attention\" badge to the GINI Score metric visualization, immediately highlighting that the value of 0.5 fails to meet acceptable performance standards by not exceeding the `medium_risk` threshold of 0.6:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "log_metric(\n", - " key=\"GINI Score\",\n", - " value=0.5,\n", - " recorded_at=datetime(2025, 6, 9),\n", - " thresholds = {\n", - " \"high_risk\": 0.5,\n", - " \"medium_risk\": 0.6,\n", - " \"low_risk\": 0.8,\n", - " },\n", - " passed=False\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![GINI Score](./log_metric_attention.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Here, a custom function `passed_fn` determines the badge status automatically, displaying a green \"Satisfactory\" badge for the 0.65 GINI Score because it exceeds the `medium_risk` threshold of 0.6, enabling programmatic evaluation of metric performance based on 
predefined business rules:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "gini = 0.65\n", - "\n", - "thresholds = {\n", - " \"high_risk\": 0.5,\n", - " \"medium_risk\": 0.6,\n", - " \"low_risk\": 0.8,\n", - "}\n", - "\n", - "def passed_fn(value):\n", - " return value > thresholds[\"medium_risk\"]\n", - "\n", - "log_metric(\n", - " key=\"GINI Score\",\n", - " value=gini, \n", - " recorded_at=datetime(2025, 6, 10),\n", - " thresholds=thresholds,\n", - " passed=passed_fn(gini)\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![GINI Score](./log_metric_satisfactory_2.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation.\n", - "\n", - "\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-584966fafc334aec9585d8f880ddba0c", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Log metrics over time\n", + "\n", + "Learn how to track and visualize the temporal evolution of key record (model) performance metrics with ValidMind.\n", + "\n", + "While this notebook uses a traditional binary classification model to demonstrate, the same principles apply to logging performance metrics over time for any record (model) type registered with ValidMind — including agentic AI systems, generative LLM applications, and beyond. For example:\n", + "\n", + "- Key model performance metrics such as AUC, F1 score, precision, recall, and accuracy, are useful for analyzing the stability and trends in model performance indicators, helping to identify potential degradation or unexpected fluctuations in model behavior over time.\n", + "- By monitoring these metrics systematically, teams can detect early warning signs of model drift and take proactive measures to maintain model reliability.\n", + "- Unit metrics in ValidMind provide a standardized way to compute and track individual performance measures, making it easy to monitor specific aspects of model behavior.\n", + "\n", + "Log metrics over time with the ValidMind Library's [`log_metric()`](https://docs.validmind.ai/validmind/validmind.html#log_metric) function and visualize them in your documentation using the *Metric Over Time* block within the ValidMind Platform. 
This integration enables seamless tracking of record performance, supporting custom thresholds and facilitating the automation of alerts based on logged metrics.\n", + "\n", + "
Metrics over time are most commonly associated with the continued monitoring of a record's performance once it is deployed.\n", + "

\n", + "While you are able to add Metric Over Time blocks to documentation, we recommend first enabling ongoing monitoring for your record to maximize the potential of your performance data.
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + "- [Load demo model](#toc3__) \n", + "- [Logging metrics](#toc4__) \n", + " - [Run unit metrics](#toc4_1__) \n", + " - [Log unit metrics over time](#toc4_2__) \n", + " - [Pass thresholds](#toc4_3__) \n", + " - [Log multiple metrics with custom thresholds](#toc4_4__) \n", + " - [Add acceptable performance flag](#toc4_5__) \n", + "- [Next steps](#toc5__) \n", + " - [Work with your model documentation](#toc5_1__) \n", + " - [Discover more learning resources](#toc5_2__) \n", + "- [Upgrade ValidMind](#toc6__) \n", + "\n", + ":::\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. 
If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "import numpy as np\n", + "\n", + "from datetime import datetime, timedelta\n", + "\n", + "from validmind.unit_metrics import list_metrics, describe_metric, run_metric\n", + "from validmind.api_client import log_metric\n", + "\n", + "%matplotlib inline" + ], + "execution_count": 3, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Load demo model\n", + "\n", + "We'll use a classification model trained on customer churn data to demonstrate ValidMind's metric logging capabilities.\n", + "\n", + "- We'll employ a built-in classification dataset, process it through train-validation-test splits, and train an XGBoost classifier.\n", + "- The trained model and datasets are then initialized in ValidMind's framework, enabling us to track and monitor various performance metrics in the following sections." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", + ")\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", + "\n", + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]\n", + "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", + "y_val = validation_df[customer_churn.target_column]\n", + "\n", + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once the datasets and model are prepared for validation, let's initialize the ValidMind `dataset` and `model`, specifying the feature and target columns.\n", + "\n", + "- The property `input_id` allows users to uniquely identify each dataset and model.\n", + "- This allows for the creation of multiple versions of datasets and models, enabling us to compute metrics by specifying which versions we want to use as inputs." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " class_labels=customer_churn.class_labels,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df, input_id=\"test_dataset\", target_column=customer_churn.target_column\n", + ")\n", + "\n", + "# Initialize the ValidMind model object wrapper so that it can be passed as input to tests or test suites\n", + "# ValidMind model objects can be any type of record you want to test, document, validate, or monitor\n", + "vm_model = vm.init_model(\n", + " model,\n", + " input_id=\"model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can now use the `assign_predictions()` method from the Dataset object to link existing predictions to any model. \n", + "\n", + "If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Logging metrics\n", + "\n", + "Next, we'll use ValidMind to track the temporal evolution of key model performance metrics.\n", + "\n", + "We'll set appropriate thresholds for each metric, enable automated alerting when performance drifts beyond acceptable boundaries, and demonstrate how these thresholds can be customized based on business requirements and risk tolerance levels." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "metrics = [metric for metric in list_metrics() if \"classification\" in metric]\n", + "\n", + "for metric_id in metrics:\n", + " describe_metric(metric_id)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Run unit metrics\n", + "\n", + "Compute individual metrics using ValidMind's *unit metrics* — single-value metrics that can be computed on a dataset and model. Use the `run_metric()` function from the `validmind.unit_metrics` module to calculate these metrics.\n", + "\n", + "The `run_metric()` function has a signature similar to `run_test()` from the `validmind.tests` module, but is specifically designed for unit metrics and takes the following arguments:\n", + "\n", + "- **`metric_id`:** The unique identifier for the metric (for example, `validmind.unit_metrics.classification.ROC_AUC`)\n", + "- **`inputs`:** A dictionary containing the input dataset and model or their respective input IDs\n", + "- **`params`:** A dictionary containing keyword arguments for the unit metric (optional, accepts any `kwargs` from the underlying sklearn implementation)\n", + "\n", + "`run_metric()` returns and displays a result object similar to a regular ValidMind test, but only shows the unit metric value. While this result object has a `.log()` method for logging to the ValidMind Platform, in this use case we'll use unit metrics to compute performance metrics and then log them over time using the `log_metric()` function from the `validmind.api_client` module." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.ROC_AUC\",\n", + " inputs={\n", + " \"model\": vm_model,\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "auc = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Accuracy\",\n", + " inputs={\n", + " \"model\": vm_model,\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "accuracy = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Recall\",\n", + " inputs={\n", + " \"model\": vm_model,\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "recall = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.F1\",\n", + " inputs={\n", + " \"model\": vm_model,\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "f1 = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Precision\",\n", + " inputs={\n", + " \"model\": vm_model,\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "precision = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Log unit metrics over time\n", + "\n", + "Using the `log_metric()` function from the `validmind.api_client` module, let's log the unit metrics over time. 
This function takes the following arguments:\n", + "\n", + "- **`key`:** The name of the metric to log\n", + "- **`value`:** The value of the metric to log\n", + "- **`recorded_at`:** The timestamp of the metric to log — useful for logging historic predictions\n", + "- **`thresholds`:** A dictionary containing the thresholds for the metric to log\n", + "- **`params`:** A dictionary containing the keyword arguments for the unit metric (in this case, none are required, but we can pass any `kwargs` that the underlying sklearn implementation accepts)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "log_metric(\n", + " key=\"AUC Score\",\n", + " value=auc,\n", + " # If `recorded_at` is not included, the time at function run is logged\n", + " recorded_at=datetime(2024, 1, 1), \n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To visualize the logged metric, we'll use the **[Metrics Over Time block](https://docs.validmind.ai/guide/monitoring/work-with-metrics-over-time.html)** in the ValidMind Platform:\n", + "\n", + "- After adding this visualization block to your documentation or ongoing monitoring report (as shown in the image below), you'll be able to review your logged metrics plotted over time.\n", + "- In this example, since we've only logged a single data point, the visualization shows just one measurement.\n", + "- As you continue logging metrics, the graph will populate with more points, enabling you to track trends and patterns.\n", + "\n", + "![Metric Over Time block](./add_metric_over_time_block.png)\n", + "![AUC Score](./log_metric_auc_1.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Pass thresholds\n", + "\n", + "We can pass *thresholds* to the `log_metric()` function to enhance the metric over time: \n", + "\n", + "- This is useful for visualizing the metric over time and identifying potential issues. 
\n", + "- The metric visualization component provides a dynamic way to monitor and contextualize metric values through customizable thresholds. \n", + "- These thresholds appear as horizontal reference lines on the chart. \n", + "- The system always displays the most recent threshold configuration, meaning that if you update threshold values in your client application, the visualization will reflect these changes immediately. \n", + "\n", + "When a metric is logged without thresholds or with an empty threshold dictionary, the reference lines gracefully disappear from the chart, though the metric line itself remains visible. \n", + "\n", + "Thresholds are highly flexible in their implementation. You can define them with any meaningful key names (such as `low_risk`, `maximum`, `target`, or `acceptable_range`) in your metric data, and the visualization will adapt accordingly. " + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "log_metric(\n", + " key=\"AUC Score\",\n", + " value=auc,\n", + " recorded_at=datetime(2024, 1, 1),\n", + " thresholds={\n", + " \"min_auc\": 0.7,\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![AUC Score](./log_metric_auc_2.png)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "log_metric(\n", + " key=\"AUC Score\",\n", + " value=auc,\n", + " recorded_at=datetime(2024, 1, 1),\n", + " thresholds={\n", + " \"high_risk\": 0.6,\n", + " \"medium_risk\": 0.7,\n", + " \"low_risk\": 0.8,\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![AUC Score](./log_metric_auc_3.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Log multiple metrics with custom thresholds\n", + "\n", + "The following code snippet shows an example of how to set up and log multiple performance metrics with 
custom thresholds for each metric:\n", + "\n", + "- Using AUC, F1, Precision, Recall, and Accuracy scores as examples, it demonstrates how to define different risk levels (high, medium, low) appropriate for each metric's expected range.\n", + "- The code simulates 10 days of metric history by applying a gradual decay and random noise to help visualize how metrics might drift over time in a production environment." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "NUM_DAYS = 10\n", + "REFERENCE_DATE = datetime(2024, 1, 1) # Fixed date: January 1st, 2024\n", + "base_date = REFERENCE_DATE - timedelta(days=NUM_DAYS)\n", + "\n", + "# Initial values with their specific thresholds\n", + "performance_metrics = {\n", + " \"AUC Score\": {\n", + " \"value\": auc,\n", + " \"thresholds\": {\n", + " \"high_risk\": 0.7,\n", + " \"medium_risk\": 0.8,\n", + " \"low_risk\": 0.9,\n", + " }\n", + " },\n", + " \"F1 Score\": {\n", + " \"value\": f1,\n", + " \"thresholds\": {\n", + " \"high_risk\": 0.5,\n", + " \"medium_risk\": 0.6,\n", + " \"low_risk\": 0.7,\n", + " }\n", + " },\n", + " \"Precision Score\": {\n", + " \"value\": precision,\n", + " \"thresholds\": {\n", + " \"high_risk\": 0.6,\n", + " \"medium_risk\": 0.7,\n", + " \"low_risk\": 0.8,\n", + " }\n", + " },\n", + " \"Recall Score\": {\n", + " \"value\": recall,\n", + " \"thresholds\": {\n", + " \"high_risk\": 0.4,\n", + " \"medium_risk\": 0.5,\n", + " \"low_risk\": 0.6,\n", + " }\n", + " },\n", + " \"Accuracy Score\": {\n", + " \"value\": accuracy,\n", + " \"thresholds\": {\n", + " \"high_risk\": 0.75,\n", + " \"medium_risk\": 0.8,\n", + " \"low_risk\": 0.85,\n", + " }\n", + " }\n", + "}\n", + "\n", + "# Trend parameters\n", + "trend_factor = 0.98 # Slight downward trend\n", + "noise_scale = 0.02 # Random fluctuation of ±2%\n", + "\n", + "for i in range(NUM_DAYS):\n", + " recorded_at = base_date + timedelta(days=i)\n", + " print(f\"\\nrecorded_at: {recorded_at}\")\n", + "\n", + " # Log each metric with 
trend and noise\n", + " for metric_name, metric_info in performance_metrics.items():\n", + " base_value = metric_info[\"value\"]\n", + " thresholds = metric_info[\"thresholds\"]\n", + " \n", + " # Apply trend and add random noise\n", + " trend = base_value * (trend_factor ** i)\n", + " noise = np.random.normal(0, noise_scale * base_value)\n", + " value = max(0, min(1, trend + noise)) # Ensure value stays between 0 and 1\n", + " \n", + " log_metric(\n", + " key=metric_name,\n", + " value=value,\n", + " recorded_at=recorded_at.isoformat(),\n", + " thresholds=thresholds\n", + " )\n", + " \n", + " print(f\"{metric_name:<15}: {value:.4f} (Thresholds: {thresholds})\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![AUC Score](./log_metric_auc_4.png)\n", + "![Accuracy Score](./log_metric_accuracy.png)\n", + "![Precision Score](./log_metric_precision.png)\n", + "![Recall Score](./log_metric_recall.png)\n", + "![F1 Score](./log_metric_f1.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Add acceptable performance flag\n", + "\n", + "The `passed` parameter in the `log_metric()` function allows you to explicitly mark whether a specific metric value should be considered \"Satisfactory\" or \"Requires Attention\":\n", + " - When `passed=True`: A green \"Satisfactory\" badge appears on the chart, indicating the metric value meets your acceptance criteria.\n", + " - When `passed=False`: A yellow \"Requires Attention\" badge appears, highlighting potential concerns that may require investigation." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the example below, the `passed=True` parameter adds a green \"Satisfactory\" badge to the GINI Score metric visualization, instantly indicating that the 0.75 value meets acceptable performance standards by being above the `medium_risk` threshold of 0.6:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "log_metric(\n", + " key=\"GINI Score\",\n", + " value=0.75,\n", + " recorded_at=datetime(2025, 6, 7),\n", + " thresholds = {\n", + " \"high_risk\": 0.5,\n", + " \"medium_risk\": 0.6,\n", + " \"low_risk\": 0.8,\n", + " },\n", + " passed=True\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![GINI Score](./log_metric_satisfactory.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this example, the `passed=False` parameter adds a yellow \"Requires Attention\" badge to the GINI Score metric visualization, immediately highlighting that the value of 0.5 fails to meet acceptable performance standards by not exceeding the `medium_risk` threshold of 0.6:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "log_metric(\n", + " key=\"GINI Score\",\n", + " value=0.5,\n", + " recorded_at=datetime(2025, 6, 9),\n", + " thresholds = {\n", + " \"high_risk\": 0.5,\n", + " \"medium_risk\": 0.6,\n", + " \"low_risk\": 0.8,\n", + " },\n", + " passed=False\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![GINI Score](./log_metric_attention.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here, a custom function `passed_fn` determines the badge status automatically, displaying a green \"Satisfactory\" badge for the 0.65 GINI Score because it exceeds the `medium_risk` threshold of 0.6, enabling programmatic evaluation of metric performance based on 
predefined business rules:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "gini = 0.65\n", + "\n", + "thresholds = {\n", + " \"high_risk\": 0.5,\n", + " \"medium_risk\": 0.6,\n", + " \"low_risk\": 0.8,\n", + "}\n", + "\n", + "def passed_fn(value):\n", + " return value > thresholds[\"medium_risk\"]\n", + "\n", + "log_metric(\n", + " key=\"GINI Score\",\n", + " value=gini, \n", + " recorded_at=datetime(2025, 6, 10),\n", + " thresholds=thresholds,\n", + " passed=passed_fn(gini)\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![GINI Score](./log_metric_satisfactory_2.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation.\n", + "\n", + "\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for the changes to take effect." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ], + "id": "copyright-584966fafc334aec9585d8f880ddba0c" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb index a82760597..7c86798e7 100644 --- a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb +++ b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb @@ -1,962 +1,970 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "9a900020", - "metadata": {}, - "source": [ - "# Generate qualitative text with the ValidMind library\n", - "\n", - "This notebook shows how to generate qualitative documentation content directly from the ValidMind library using both `vm.run_text_generation()` and `vm.generate_documentation_text()`. Instead of switching to the UI to write text manually or trigger generation one section at a time, you can generate content for documentation text blocks programmatically from within a notebook and log it back to the corresponding sections of the model document.\n", - "\n", - "After building an example model and documenting its quantitative results, we’ll show how to generate text for individual content blocks, customize the output with prompts, control the context used for generation, and use a configuration-driven workflow to populate multiple qualitative sections across the document. 
By the end, you’ll have an end-to-end example of how quantitative test results and AI-generated qualitative content can work together to populate a full model document from Python, giving you a more automated documentation workflow directly in the library." - ] - }, - { - "cell_type": "markdown", - "id": "cd48db57", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - "- [Getting to know ValidMind](#toc3__) \n", - " - [Preview the documentation template](#toc3_1__) \n", - " - [View model documentation in the ValidMind Platform](#toc3_2__) \n", - "- [Build the example model](#toc4__) \n", - " - [Import the sample dataset](#toc4_1__) \n", - " - [Preprocessing the raw dataset](#toc4_2__) \n", - " - [Training an XGBoost classifier model](#toc4_3__) \n", - "- [Initialize the ValidMind inputs](#toc5__) \n", - "- [Document test results](#toc6__) \n", - "- [Document qualitative sections](#toc7__) \n", - " - [Generate text for a single content block](#toc7_1__) \n", - " - [Customize the prompt](#toc7_2__) \n", - " - [Pass section-specific context](#toc7_3__) \n", - " - [Append a new text block to a section](#toc7_4__) \n", - " - [Generate text across the document](#toc7_5__) \n", - "- [In summary](#toc8__) \n", - "- [Next steps](#toc9__) \n", - " - [Work with your model documentation](#toc9_1__) \n", - " - [Discover more learning resources](#toc9_2__) \n", - "- [Upgrade ValidMind](#toc10__) \n", - "\n", - ":::\n", - "\n", - 
"" - ] - }, - { - "cell_type": "markdown", - "id": "a67217b3", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "281cfb86", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "51c11b52", - "metadata": {}, - "source": [ - "\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
" - ] - }, - { - "cell_type": "markdown", - "id": "9103cd45", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Key concepts\n", - "\n", - "**Validation report**: A comprehensive and structured assessment of a model’s development and performance, focusing on verifying its integrity, appropriateness, and alignment with its intended use. It includes analyses of model assumptions, data quality, performance metrics, outcomes of testing procedures, and risk considerations. The validation report supports transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", - "\n", - "**Validation report template**: Serves as a standardized framework for conducting and documenting model validation activities. It outlines the required sections, recommended analyses, and expected validation tests, ensuring consistency and completeness across validation reports. The template helps guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ] - }, - { - "cell_type": "markdown", - "id": "23020a1b", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "6202d6dc", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "
Recommended Python versions\n", - "

\n", - "Python 3.8 <= x <= 3.14
\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "045b05a6", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "b3231d8e", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "56592217", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "43ed3d0c", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. 
(**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "9b9203be", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "690dc368", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " api_host=\"http://localhost:5000/api/v1/tracking\",\n", - " api_key=\"..\",\n", - " api_secret=\"..\",\n", - " document=\"documentation\", # requires library >=2.12.0\n", - " model=\"..\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "a68f6031", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", - "\n", - "- Import **Extreme Gradient Boosting** (XGBoost) 
with an alias so that we can reference its functions in later calls. XGBoost is a powerful machine learning library designed for speed and performance, especially in handling structured or tabular data.\n", - "- Enable **`matplotlib`**, a plotting library used for visualizing data. Ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3fa2d9de", - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "\n", - "import xgboost as xgb" - ] - }, - { - "cell_type": "markdown", - "id": "69a37995", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "40c9eb24", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "62842e84", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "6fab1c1c", - "metadata": {}, - "source": [ - "\n", - "\n", - "### View model documentation in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. 
In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", - "\n", - "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." - ] - }, - { - "cell_type": "markdown", - "id": "606d932b", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Build the example model" - ] - }, - { - "cell_type": "markdown", - "id": "3d7ad25a", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Import the sample dataset\n", - "\n", - "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", - "\n", - "In our below example, note that: \n", - "\n", - "- The target column, `Exited` has a value of `1` when a customer has churned and `0` otherwise.\n", - "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas Dataframe is a two-dimensional tabular data structure that makes use of rows and columns." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8ea8188e", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.classification import customer_churn\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", - ")\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "a5ceef72", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Preprocessing the raw dataset\n", - "\n", - "In this section, we preprocess the raw dataset so it is ready for model training and validation. 
This includes splitting the data into training, validation, and test subsets to support both model fitting and evaluation on unseen data, and then separating each subset into input features and target labels so the model can learn from customer attributes and predict whether a customer churned." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9d2bec58", - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", - "\n", - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]\n", - "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", - "y_val = validation_df[customer_churn.target_column]" - ] - }, - { - "cell_type": "markdown", - "id": "3b9edacf", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Training an XGBoost classifier model\n", - "\n", - "In this section, we train an XGBoost classifier to predict customer churn, using early stopping to halt training if performance does not improve after 10 rounds and reduce unnecessary fitting. We configure the model to evaluate performance with three complementary metrics: error for incorrect predictions, logloss for prediction confidence, and auc for class separation. The model is trained on the training split and evaluated against the validation split during fitting, while verbose=False keeps the training output concise." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "658447fc", - "metadata": {}, - "outputs": [], - "source": [ - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "c2a6b492", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Initialize the ValidMind inputs\n", - "\n", - "We begin by registering the datasets and trained model as ValidMind inputs so they can be referenced consistently throughout the documentation workflow. For the datasets, this means creating ValidMind Dataset objects for the raw, training, and testing data, each with a unique `input_id` for traceability. Where needed, we also provide supporting metadata such as the target column and class labels so tests can interpret the data correctly." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "081548ae", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the raw dataset\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " class_labels=customer_churn.class_labels,\n", - ")\n", - "\n", - "# Initialize the training dataset\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "# Initialize the testing dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=customer_churn.target_column\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "1ebfda19", - "metadata": {}, - "source": [ - "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6cc5aff8", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the model\n", - "vm_model = vm.init_model(\n", - " model,\n", - " input_id=\"model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "48d23cf8", - "metadata": {}, - 
"source": [ - "Finally, we assign predictions from the trained model to the training and testing datasets. The `assign_predictions()` method links predicted classes and probabilities to each dataset, and can also compute predictions automatically if they are not passed explicitly. This step is what allows ValidMind to run performance and diagnostic tests using the model outputs." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "922baa9d", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "7c9a174d", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Document test results\n", - "\n", - "In this section, we run the documentation tests defined by the applied template to populate the quantitative parts of the model documentation. The `vm.run_documentation_tests()` function discovers each test-driven block in the template, executes the corresponding tests, and uploads the resulting artifacts to the ValidMind Platform.\n", - "\n", - "To run the full suite successfully, ValidMind needs to know which model and dataset inputs should be used for each test. This can be done with a shared `inputs` argument when all tests use the same objects, or with a `config` dictionary when individual tests require specific inputs or parameters. In this example, we use the default test parameters and provide the input configuration needed for the demo model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "47f7e709", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = customer_churn.get_demo_test_config()\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "markdown", - "id": "3f22d37b", - "metadata": {}, - "source": [ - "Once the configuration is prepared, we pass it to `vm.run_documentation_tests()` and execute the full suite. The returned `full_suite` object contains the test results and represents the quantitative documentation that has been generated for the model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "999be7fe", - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "id": "5d531744", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Document qualitative sections\n", - "\n", - "In addition to documenting quantitative results through tests, ValidMind now supports programmatic generation of qualitative content for the text blocks in a model documentation template through `vm.run_text_generation()`. This function allows you to generate AI-assisted text for a specific content block directly from a notebook and then log it back to the corresponding section of the document. As a result, you can populate qualitative sections without switching to the UI to write text manually or trigger generation one section at a time.\n", - "\n", - "In the next sections, we’ll walk through the main ways to use this functionality. We’ll start by generating text for a single content block with the default behavior, then show how to customize the output with a prompt, how to control the context used for generation by selecting specific sections, and finally how to scale the same pattern across all text blocks in the document." 
- ] - }, - { - "cell_type": "markdown", - "id": "899c8553", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Generate text for a single content block\n", - "\n", - "First, we’ll use `vm.run_text_generation()` to generate qualitative text for a single documentation block. By providing a `content_id`, you can target the exact text placeholder you want to populate and let ValidMind generate content using the current document context. The helper `vm.get_content_ids()` is useful for inspecting which content blocks are available in the active template, making it easier to identify the IDs you can use when generating and logging text programmatically." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "85cc552f", - "metadata": {}, - "outputs": [], - "source": [ - "vm.get_content_ids()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "26fcddf9", - "metadata": {}, - "outputs": [], - "source": [ - "vm.run_text_generation(\n", - " content_id=\"dataset_summary_text\",\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "caff6490", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Customize the prompt\n", - "\n", - "Next, we’ll customize the generated output by passing a `prompt` to `vm.run_text_generation()`. This makes it possible to guide not just the subject of the generated text, but also its structure, tone, level of detail, and presentation format. In practice, this allows you to tailor the output for different documentation needs, such as producing a short narrative summary, a more structured section, or content written for a specific audience, while still relying on the same underlying document context for generation." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "52165b98", - "metadata": {}, - "outputs": [], - "source": [ - "prompt = \"\"\"\n", - "Use exactly this structure:\n", - "\n", - "

Dataset Overview

\n", - "

Explain in 1-2 sentences what the dataset contains and what it is used for.

\n", - "\n", - "

Dataset Summary

\n", - "

Summarize the dataset structure, target outcome, and the main types of input features in 2-3 sentences.

\n", - "\n", - "

Key Characteristics

\n", - "
    \n", - "
  • Include 2-3 concise points about the most important characteristics of the dataset.
  • \n", - "
\n", - "\n", - "

Data Quality and Considerations

\n", - "
    \n", - "
  • Include 2-3 concise points about important quality observations, limitations, or considerations relevant to the dataset.
  • \n", - "
\n", - "\n", - "

Overall Assessment

\n", - "

End with a short balanced conclusion on the dataset's suitability for model development and evaluation.

\n", - "\"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fbf10ad9", - "metadata": {}, - "outputs": [], - "source": [ - "vm.run_text_generation(\n", - " content_id=\"dataset_summary_text\",\n", - " prompt=prompt,\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "99a0740e", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Pass section-specific context\n", - "\n", - "Then, we’ll control the `context` used for generation by passing a selected set of content IDs to `vm.run_text_generation()`. Rather than relying on the full document, this lets you focus the model on the most relevant parts of the documentation for a given text block. In practice, that means you can generate more targeted qualitative content by choosing which existing test and text blocks should inform the output." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "43cf0e7d", - "metadata": {}, - "outputs": [], - "source": [ - "vm.get_content_ids(\"data_description\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1e1a919e", - "metadata": {}, - "outputs": [], - "source": [ - "vm.run_text_generation(\n", - " content_id=\"dataset_summary_text\",\n", - " context={\"content_ids\": vm.get_content_ids(\"data_description\")},\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "701a0323", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Append a new text block to a section\n", - "\n", - "Sometimes you may want to generate text for a `content_id` that is not already defined in the template. In that case, you can still generate the text with `vm.run_text_generation()` and then use `.log(section_id=...)` to tell ValidMind where that new text block should be placed in the document. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6a9ba924", - "metadata": {}, - "outputs": [], - "source": [ - "vm.run_text_generation(\n", - " content_id=\"intended_use\",\n", - " section_id=\"intended_use\",\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "6e032b79", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Generate text across the document\n", - "\n", - "At this stage, instead of generating one block at a time, we can populate multiple qualitative sections in a single pass.\n", - "\n", - "The [`vm.generate_documentation_text`](https://docs.validmind.ai/validmind/validmind.html#generate_documentation_text) function reads a configuration dictionary, generates content for each target block, logs the generated text to the ValidMind Platform, and returns a notebook summary grouped by section.\n", - "\n", - "- The function uses a `config` argument to describe which text blocks to generate and how each one should be handled.\n", - "- The `config` parameter is a dictionary with the following structure:\n", - "\n", - " ```python\n", - " config = {\n", - " \"\": {\n", - " \"section_id\": \"\",\n", - " \"prompt\": \"Optional custom prompt\",\n", - " \"context\": {\n", - " \"content_ids\": [\"\", \"\"]\n", - " }\n", - " },\n", - " ...\n", - " }\n", - " ```\n", - "\n", - " Each `` represents a documentation text block to populate. Use `section_id` when the block should be inserted into a specific section, `prompt` when you want to shape the output more explicitly, and `context.content_ids` when you want the generation step to focus on selected parts of the document. In this notebook, `text_config` comes from `customer_churn.get_demo_text_config()`, which provides the demo setup for the customer churn example." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a97bb129", - "metadata": {}, - "outputs": [], - "source": [ - "text_config = customer_churn.get_demo_text_config()\n", - "preview_test_config(text_config)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "aff42702", - "metadata": {}, - "outputs": [], - "source": [ - "results = vm.generate_documentation_text(config=text_config)" - ] - }, - { - "cell_type": "markdown", - "id": "03b6b875", - "metadata": {}, - "source": [ - "\n", - "\n", - "## In summary\n", - "\n", - "In this notebook, you learned how to:\n", - "\n", - "- [x] Build and document an example customer churn model with ValidMind\n", - "- [x] Run documentation tests to populate the quantitative sections of a model document\n", - "- [x] Generate qualitative text for a single documentation content block with `vm.run_text_generation()`\n", - "- [x] Customize generated output by passing a prompt\n", - "- [x] Control generation context by selecting specific sections of the document\n", - "- [x] Use a configuration-driven workflow to generate qualitative content across the document with `vm.generate_documentation_text()`" - ] - }, - { - "cell_type": "markdown", - "id": "3db3c328", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." - ] - }, - { - "cell_type": "markdown", - "id": "d7bd8df8", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. 
In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" - ] - }, - { - "cell_type": "markdown", - "id": "c0951457", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Discover more learning resources\n", - "\n", - "For a more in-depth introduction to using the ValidMind Library for development, check out our introductory development series and the accompanying interactive training:\n", - "\n", - "- **[ValidMind for development](https://docs.validmind.ai/developer/validmind-library.html#development)**\n", - "- **[Developer Fundamentals](https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html)**\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "id": "24532182", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2e796c43", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "713a6722", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "84a65def", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-18d82030e09942c4953248e9bf432249", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.11" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Generate qualitative text with the ValidMind library\n", + "\n", + "This notebook shows how to generate qualitative documentation content directly from the ValidMind library using both `vm.run_text_generation()` and `vm.generate_documentation_text()`. Instead of switching to the UI to write text manually or trigger generation one section at a time, you can generate content for documentation text blocks programmatically from within a notebook and log it back to the corresponding sections of the model document.\n", + "\n", + "After building an example model and documenting its quantitative results, we’ll show how to generate text for individual content blocks, customize the output with prompts, control the context used for generation, and use a configuration-driven workflow to populate multiple qualitative sections across the document. By the end, you’ll have an end-to-end example of how quantitative test results and AI-generated qualitative content can work together to populate a full model document from Python, giving you a more automated documentation workflow directly in the library." 
+ ], + "id": "9a900020" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + "- [Getting to know ValidMind](#toc3__) \n", + " - [Preview the documentation template](#toc3_1__) \n", + " - [View model documentation in the ValidMind Platform](#toc3_2__) \n", + "- [Build the example model](#toc4__) \n", + " - [Import the sample dataset](#toc4_1__) \n", + " - [Preprocessing the raw dataset](#toc4_2__) \n", + " - [Training an XGBoost classifier model](#toc4_3__) \n", + "- [Initialize the ValidMind inputs](#toc5__) \n", + "- [Document test results](#toc6__) \n", + "- [Document qualitative sections](#toc7__) \n", + " - [Generate text for a single content block](#toc7_1__) \n", + " - [Customize the prompt](#toc7_2__) \n", + " - [Pass section-specific context](#toc7_3__) \n", + " - [Append a new text block to a section](#toc7_4__) \n", + " - [Generate text across the document](#toc7_5__) \n", + "- [In summary](#toc8__) \n", + "- [Next steps](#toc9__) \n", + " - [Work with your model documentation](#toc9_1__) \n", + " - [Discover more learning resources](#toc9_2__) \n", + "- [Upgrade ValidMind](#toc10__) \n", + "\n", + ":::\n", + "\n", + "" + ], + "id": "cd48db57" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. 
\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "a67217b3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "281cfb86" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
" + ], + "id": "51c11b52" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [test_suites](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ], + "id": "9103cd45" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ], + "id": "23020a1b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "
Recommended Python versions\n", + "

\n", + "Python 3.8 <= x <= 3.14
\n", + "\n", + "To install the library:" + ], + "id": "6202d6dc" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "045b05a6" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "b3231d8e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "56592217" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. 
(**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "43ed3d0c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "9b9203be" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " api_host=\"http://localhost:5000/api/v1/tracking\",\n", + " api_key=\"..\",\n", + " api_secret=\"..\",\n", + " document=\"documentation\", # requires library >=2.12.0\n", + " model=\"..\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "690dc368" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", + "\n", + "- Import **Extreme Gradient Boosting** (XGBoost) 
with an alias so that we can reference its functions in later calls. XGBoost is a powerful machine learning library designed for speed and performance, especially in handling structured or tabular data.\n", + "- Enable **`matplotlib`**, a plotting library used for visualizing data. This ensures that any plots you generate render inline in the notebook output rather than opening in a separate window." + ], + "id": "a68f6031" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%matplotlib inline\n", + "\n", + "import xgboost as xgb" + ], + "execution_count": null, + "outputs": [], + "id": "3fa2d9de" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Getting to know ValidMind" + ], + "id": "69a37995" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ], + "id": "40c9eb24" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "62842e84" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### View model documentation in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. 
In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", + "\n", + "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." + ], + "id": "6fab1c1c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Build the example model" + ], + "id": "606d932b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Import the sample dataset\n", + "\n", + "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", + "\n", + "In the example below, note that:\n", + "\n", + "- The target column, `Exited`, has a value of `1` when a customer has churned and `0` otherwise.\n", + "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas DataFrame is a two-dimensional tabular data structure that makes use of rows and columns." + ], + "id": "3d7ad25a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.classification import customer_churn\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", + ")\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [], + "id": "8ea8188e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Preprocessing the raw dataset\n", + "\n", + "In this section, we preprocess the raw dataset so it is ready for model training and validation. 
This includes splitting the data into training, validation, and test subsets to support both model fitting and evaluation on unseen data, and then separating each subset into input features and target labels so the model can learn from customer attributes and predict whether a customer churned." + ], + "id": "a5ceef72" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", + "\n", + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]\n", + "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", + "y_val = validation_df[customer_churn.target_column]" + ], + "execution_count": null, + "outputs": [], + "id": "9d2bec58" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Training an XGBoost classifier model\n", + "\n", + "In this section, we train an XGBoost classifier to predict customer churn, using early stopping to halt training if performance does not improve after 10 rounds and reduce unnecessary fitting. We configure the model to evaluate performance with three complementary metrics: error for incorrect predictions, logloss for prediction confidence, and auc for class separation. The model is trained on the training split and evaluated against the validation split during fitting, while verbose=False keeps the training output concise." 
+ ], + "id": "3b9edacf" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "658447fc" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Initialize the ValidMind inputs\n", + "\n", + "We begin by registering the datasets and trained model as ValidMind inputs so they can be referenced consistently throughout the documentation workflow. For the datasets, this means creating ValidMind Dataset objects for the raw, training, and testing data, each with a unique `input_id` for traceability. Where needed, we also provide supporting metadata such as the target column and class labels so tests can interpret the data correctly." 
+ ], + "id": "c2a6b492" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the raw dataset\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " class_labels=customer_churn.class_labels,\n", + ")\n", + "\n", + "# Initialize the training dataset\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "# Initialize the testing dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=customer_churn.target_column\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "081548ae" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ], + "id": "1ebfda19" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the model\n", + "vm_model = vm.init_model(\n", + " model,\n", + " input_id=\"model\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "6cc5aff8" + }, + { + "cell_type": "markdown", + "metadata": {}, + 
"source": [ + "Finally, we assign predictions from the trained model to the training and testing datasets. The `assign_predictions()` method links predicted classes and probabilities to each dataset, and can also compute predictions automatically if they are not passed explicitly. This step is what allows ValidMind to run performance and diagnostic tests using the model outputs." + ], + "id": "48d23cf8" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model,\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "922baa9d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Document test results\n", + "\n", + "In this section, we run the documentation tests defined by the applied template to populate the quantitative parts of the model documentation. The `vm.run_documentation_tests()` function discovers each test-driven block in the template, executes the corresponding tests, and uploads the resulting artifacts to the ValidMind Platform.\n", + "\n", + "To run the full suite successfully, ValidMind needs to know which model and dataset inputs should be used for each test. This can be done with a shared `inputs` argument when all tests use the same objects, or with a `config` dictionary when individual tests require specific inputs or parameters. In this example, we use the default test parameters and provide the input configuration needed for the demo model." 
+ ], + "id": "7c9a174d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = customer_churn.get_demo_test_config()\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [], + "id": "47f7e709" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once the configuration is prepared, we pass it to `vm.run_documentation_tests()` and execute the full suite. The returned `full_suite` object contains the test results and represents the quantitative documentation that has been generated for the model." + ], + "id": "3f22d37b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.run_documentation_tests(config=test_config)" + ], + "execution_count": null, + "outputs": [], + "id": "999be7fe" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Document qualitative sections\n", + "\n", + "In addition to documenting quantitative results through tests, ValidMind now supports programmatic generation of qualitative content for the text blocks in a model documentation template through `vm.run_text_generation()`. This function allows you to generate AI-assisted text for a specific content block directly from a notebook and then log it back to the corresponding section of the document. As a result, you can populate qualitative sections without switching to the UI to write text manually or trigger generation one section at a time.\n", + "\n", + "In the next sections, we’ll walk through the main ways to use this functionality. We’ll start by generating text for a single content block with the default behavior, then show how to customize the output with a prompt, how to control the context used for generation by selecting specific sections, and finally how to scale the same pattern across all text blocks in the document." 
+ ], + "id": "5d531744" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Generate text for a single content block\n", + "\n", + "First, we’ll use `vm.run_text_generation()` to generate qualitative text for a single documentation block. By providing a `content_id`, you can target the exact text placeholder you want to populate and let ValidMind generate content using the current document context. The helper `vm.get_content_ids()` is useful for inspecting which content blocks are available in the active template, making it easier to identify the IDs you can use when generating and logging text programmatically." + ], + "id": "899c8553" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.get_content_ids()" + ], + "execution_count": null, + "outputs": [], + "id": "85cc552f" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.run_text_generation(\n", + " content_id=\"dataset_summary_text\",\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "26fcddf9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Customize the prompt\n", + "\n", + "Next, we’ll customize the generated output by passing a `prompt` to `vm.run_text_generation()`. This makes it possible to guide not just the subject of the generated text, but also its structure, tone, level of detail, and presentation format. In practice, this allows you to tailor the output for different documentation needs, such as producing a short narrative summary, a more structured section, or content written for a specific audience, while still relying on the same underlying document context for generation." + ], + "id": "caff6490" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "prompt = \"\"\"\n", + "Use exactly this structure:\n", + "\n", + "

Dataset Overview

\n", + "

Explain in 1-2 sentences what the dataset contains and what it is used for.

\n", + "\n", + "

Dataset Summary

\n", + "

Summarize the dataset structure, target outcome, and the main types of input features in 2-3 sentences.

\n", + "\n", + "

Key Characteristics

\n", + "
    \n", + "
  • Include 2-3 concise points about the most important characteristics of the dataset.
  • \n", + "
\n", + "\n", + "

Data Quality and Considerations

\n", + "
    \n", + "
  • Include 2-3 concise points about important quality observations, limitations, or considerations relevant to the dataset.
  • \n", + "
\n", + "\n", + "

Overall Assessment

\n", + "

End with a short balanced conclusion on the dataset's suitability for model development and evaluation.

\n", + "\"\"\"" + ], + "execution_count": null, + "outputs": [], + "id": "52165b98" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.run_text_generation(\n", + " content_id=\"dataset_summary_text\",\n", + " prompt=prompt,\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "fbf10ad9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Pass section-specific context\n", + "\n", + "Then, we’ll control the `context` used for generation by passing a selected set of content IDs to `vm.run_text_generation()`. Rather than relying on the full document, this lets you focus the model on the most relevant parts of the documentation for a given text block. In practice, that means you can generate more targeted qualitative content by choosing which existing test and text blocks should inform the output." + ], + "id": "99a0740e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.get_content_ids(\"data_description\")" + ], + "execution_count": null, + "outputs": [], + "id": "43cf0e7d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.run_text_generation(\n", + " content_id=\"dataset_summary_text\",\n", + " context={\"content_ids\": vm.get_content_ids(\"data_description\")},\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "1e1a919e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Append a new text block to a section\n", + "\n", + "Sometimes you may want to generate text for a `content_id` that is not already defined in the template. In that case, you can still generate the text with `vm.run_text_generation()` and then use `.log(section_id=...)` to tell ValidMind where that new text block should be placed in the document. 
" + ], + "id": "701a0323" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.run_text_generation(\n", + " content_id=\"intended_use\",\n", + " section_id=\"intended_use\",\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "6a9ba924" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Generate text across the document\n", + "\n", + "At this stage, instead of generating one block at a time, we can populate multiple qualitative sections in a single pass.\n", + "\n", + "The [`vm.generate_documentation_text`](https://docs.validmind.ai/validmind/validmind.html#generate_documentation_text) function reads a configuration dictionary, generates content for each target block, logs the generated text to the ValidMind Platform, and returns a notebook summary grouped by section.\n", + "\n", + "- The function uses a `config` argument to describe which text blocks to generate and how each one should be handled.\n", + "- The `config` parameter is a dictionary with the following structure:\n", + "\n", + " ```python\n", + " config = {\n", + " \"\": {\n", + " \"section_id\": \"\",\n", + " \"prompt\": \"Optional custom prompt\",\n", + " \"context\": {\n", + " \"content_ids\": [\"\", \"\"]\n", + " }\n", + " },\n", + " ...\n", + " }\n", + " ```\n", + "\n", + " Each `` represents a documentation text block to populate. Use `section_id` when the block should be inserted into a specific section, `prompt` when you want to shape the output more explicitly, and `context.content_ids` when you want the generation step to focus on selected parts of the document. In this notebook, `text_config` comes from `customer_churn.get_demo_text_config()`, which provides the demo setup for the customer churn example." 
+ ], + "id": "6e032b79" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "text_config = customer_churn.get_demo_text_config()\n", + "preview_test_config(text_config)" + ], + "execution_count": null, + "outputs": [], + "id": "a97bb129" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "results = vm.generate_documentation_text(config=text_config)" + ], + "execution_count": null, + "outputs": [], + "id": "aff42702" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## In summary\n", + "\n", + "In this notebook, you learned how to:\n", + "\n", + "- [x] Build and document an example customer churn model with ValidMind\n", + "- [x] Run documentation tests to populate the quantitative sections of a model document\n", + "- [x] Generate qualitative text for a single documentation content block with `vm.run_text_generation()`\n", + "- [x] Customize generated output by passing a prompt\n", + "- [x] Control generation context by selecting specific sections of the document\n", + "- [x] Use a configuration-driven workflow to generate qualitative content across the document with `vm.generate_documentation_text()`" + ], + "id": "03b6b875" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." + ], + "id": "3db3c328" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. 
In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" + ], + "id": "d7bd8df8" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Discover more learning resources\n", + "\n", + "For a more in-depth introduction to using the ValidMind Library for development, check out our introductory development series and the accompanying interactive training:\n", + "\n", + "- **[ValidMind for development](https://docs.validmind.ai/developer/validmind-library.html#development)**\n", + "- **[Developer Fundamentals](https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html)**\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ], + "id": "c0951457" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "24532182" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "2e796c43" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "713a6722" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for the changes to take effect." + ], + "id": "84a65def" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ], + "id": "copyright-18d82030e09942c4953248e9bf432249" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.11" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb index cd8af2d27..694296d08 100644 --- a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb +++ b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb @@ -1,1107 +1,1113 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Implement custom tests\n", - "\n", - "Custom tests extend the functionality of ValidMind, allowing you to document any model or use case with added flexibility.\n", - "\n", - "ValidMind provides a comprehensive set of tests out-of-the-box to evaluate and document your models and datasets. We recognize there will be cases where the default tests do not support a model or dataset, or specific documentation is needed. In these cases, you can create and use your own custom code to accomplish what you need. To streamline custom code integration, we support the creation of custom test functions.\n", - "\n", - "This interactive notebook provides a step-by-step guide for implementing and registering custom tests with ValidMind, running them individually, viewing the results on the ValidMind Platform, and incorporating them into your model documentation template." 
- ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Implement custom tests\n", + "\n", + "Custom tests extend the functionality of ValidMind, allowing you to document any model or use case with added flexibility.\n", + "\n", + "ValidMind provides a comprehensive set of tests out-of-the-box to evaluate and document your models and datasets. We recognize there will be cases where the default tests do not support a model or dataset, or specific documentation is needed. In these cases, you can create and use your own custom code to accomplish what you need. To streamline custom code integration, we support the creation of custom test functions.\n", + "\n", + "This interactive notebook provides a step-by-step guide for implementing and registering custom tests with ValidMind, running them individually, viewing the results on the ValidMind Platform, and incorporating them into your model documentation template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + "- [Implement a Custom Test](#toc3__) \n", + "- [Run the Custom Test](#toc4__) \n", + " - [Setup the Model and Dataset](#toc4_1__) \n", + " - [Run the Custom Test](#toc4_2__) \n", + "- [Adding Custom Test to Model Documentation](#toc5__) \n", + "- [Some More Custom Tests](#toc6__) \n", + " - [Custom Test: Table of Model Hyperparameters](#toc6_1__) \n", + " - [Custom Test: External API Call](#toc6_2__) \n", + " - [Custom Test: Passing 
Parameters](#toc6_3__) \n", + " - [Custom Test: Multiple Tables and Plots in a Single Test](#toc6_4__) \n", + " - [Custom Test: Images](#toc6_5__) \n", + " - [Custom Test: Description](#toc6_6__) \n", + "- [Conclusion](#toc7__) \n", + "- [Next steps](#toc8__) \n", + " - [Work with your model documentation](#toc8_1__) \n", + " - [Discover more learning resources](#toc8_2__) \n", + "- [Upgrade ValidMind](#toc9__) \n", + "\n", + ":::\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
\n", + "\n", + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed account of a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into sections and sub-sections, and functions as a test suite specifying which tests should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Implement a Custom Test\n", + "\n", + "Let's start off by creating a simple custom test that creates a Confusion Matrix for a binary classification model. We will use the `sklearn.metrics.confusion_matrix` function to calculate the confusion matrix and then display it as a heatmap using scikit-learn's `ConfusionMatrixDisplay`, which plots with `matplotlib`. 
(This is already a built-in test in ValidMind, but we will use it as an example to demonstrate how to create custom tests.)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import matplotlib.pyplot as plt\n", + "from sklearn import metrics\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", + "def confusion_matrix(dataset, model):\n", + " \"\"\"The confusion matrix is a table that is often used to describe the performance of a classification model on a set of data for which the true values are known.\n", + "\n", + " The confusion matrix is a 2x2 table that contains 4 values:\n", + "\n", + " - True Positive (TP): the number of correct positive predictions\n", + " - True Negative (TN): the number of correct negative predictions\n", + " - False Positive (FP): the number of incorrect positive predictions\n", + " - False Negative (FN): the number of incorrect negative predictions\n", + "\n", + " The confusion matrix can be used to assess the holistic performance of a classification model by showing, in a single figure, the counts from which accuracy, precision, recall, and F1 score are derived.\n", + " \"\"\"\n", + " y_true = dataset.y\n", + " y_pred = dataset.y_pred(model)\n", + "\n", + " confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", + "\n", + " cm_display = metrics.ConfusionMatrixDisplay(\n", + " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", + " )\n", + " cm_display.plot()\n", + "\n", + " plt.close() # close the plot to avoid displaying it\n", + "\n", + " return cm_display.figure_ # return the figure object itself" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "That's our custom test defined and ready to go. Let's take a look at what's going on here:\n", + "\n", + "- The function `confusion_matrix` takes two arguments, `dataset` and `model`. 
These are `VMDataset` and `VMModel` objects, respectively.\n", + "- The function docstring provides a description of what the test does. This will be displayed along with the result in this notebook as well as in the ValidMind Platform.\n", + "- The function body calculates the confusion matrix using the `sklearn.metrics.confusion_matrix` function and then plots it using `sklearn.metrics.ConfusionMatrixDisplay`.\n", + "- The function then returns the `ConfusionMatrixDisplay.figure_` object. This is important because the ValidMind Library expects the output of the custom test to be a plot or a table.\n", + "- The `@vm.test` decorator creates a wrapper around the function that allows it to be run by the ValidMind Library. It also registers the test so it can be found by the ID `my_custom_tests.ConfusionMatrix` (see the section below on how test IDs work in ValidMind and why this format is important)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Run the Custom Test\n", + "\n", + "Now that we have defined and registered our custom test, let's see how we can run it and properly use it in the ValidMind Platform." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Setup the Model and Dataset\n", + "\n", + "First, let's set up an example model and dataset to run our custom test against. Since this is a confusion matrix for binary classification, we will use the Customer Churn dataset that ValidMind provides and train a simple XGBoost model."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "from validmind.datasets.classification import customer_churn\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", + "\n", + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]\n", + "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", + "y_val = validation_df[customer_churn.target_column]\n", + "\n", + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Easy enough! Now we have a model and dataset setup and trained. One last thing to do is bring the dataset and model into the ValidMind Library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# for now, we'll just use the test dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " target_column=customer_churn.target_column,\n", + " input_id=\"test_dataset\",\n", + ")\n", + "\n", + "vm_model = vm.init_model(model, input_id=\"model\")\n", + "\n", + "# link the model to the dataset\n", + "vm_test_ds.assign_predictions(model=vm_model)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Run the Custom Test\n", + "\n", + "Now that we have our model and dataset setup, we have everything we need to run our custom test. 
We can do this by importing the `run_test` function from the `validmind.tests` module and passing in the test ID of our custom test along with the model and dataset we want to run it against.\n", + "\n", + ">Notice how the `inputs` dictionary is used to map an `input_id` which we set above to the `model` and `dataset` keys that are expected by our custom test function. This is how the ValidMind Library knows which inputs to pass to different tests and is key when using many different datasets and models." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.tests import run_test\n", + "\n", + "result = run_test(\n", + " \"my_custom_tests.ConfusionMatrix\",\n", + " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You'll notice that the docstring becomes a markdown description of the test. The figure is then displayed as the test result. What you see above is how it will look in the ValidMind Platform as well. Let's go ahead and log the result to see how that works." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Adding Custom Test to Model Documentation\n", + "\n", + "To do this, go to the documentation page of the model you registered above and navigate to the `Model Development` -> `Model Evaluation` section. Then hover between any existing content block to reveal the `+` button as shown in the screenshot below.\n", + "\n", + "![screenshot showing insert button for test-driven blocks](./insert-test-driven-block.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now click on the `+` button and select the `Test-Driven Block` option. 
This will open a dialog where you can select `My Custom Tests Confusion Matrix` from the list of available tests. You can preview the result and then click `Insert Block` to add it to the documentation.\n", + "\n", + "![screenshot showing how to insert a test-driven block](./insert-test-driven-block-custom.png)\n", + "\n", + "The test should match the result you see above. It is now part of your documentation and will be run every time you call `vm.run_documentation_tests()` for your model. Before doing that, reload the library so it picks up the updated template:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.reload()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If you preview the template, it should show the custom test in the `Model Development`->`Model Evaluation` section:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Just so we can run all of the tests in the template, let's initialize the training and raw datasets.\n", + "\n", + "(Refer to [**Quickstart for documentation**](../../../quickstart/quickstart_documentation.ipynb) and the ValidMind docs for more information on what we are doing here.)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " class_labels=customer_churn.class_labels,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "vm_train_ds.assign_predictions(model=vm_model)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To run all the tests in the template, you can use 
the `vm.run_documentation_tests()` function with the demo config from our `customer_churn` module, which maps each test to the inputs we initialized above. We will have to add a section to the config for our new test to tell it which inputs it should receive. This is done by adding a new element to the config dictionary where the key is the ID of the test and the value is a dictionary with the following structure:\n", + "```python\n", + "{\n", + " \"inputs\": {\n", + " \"dataset\": \"test_dataset\",\n", + " \"model\": \"model\",\n", + " }\n", + "}\n", + "```" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = customer_churn.get_demo_test_config()\n", + "test_config[\"my_custom_tests.ConfusionMatrix\"] = {\n", + " \"inputs\": {\n", + " \"dataset\": \"test_dataset\",\n", + " \"model\": \"model\",\n", + " }\n", + "}\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.run_documentation_tests(config=test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Some More Custom Tests\n", + "\n", + "Now that you understand the entire process of creating custom tests and using them in your documentation, let's create a few more to see different ways you can use custom tests."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Custom Test: Table of Model Hyperparameters\n", + "\n", + "This custom test will display a table of the hyperparameters used in the model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.Hyperparameters\")\n", + "def hyperparameters(model):\n", + " \"\"\"The hyperparameters of a machine learning model are the settings that control the learning process.\n", + " These settings are specified before the learning process begins and can have a significant impact on the\n", + " performance of the model.\n", + "\n", + " The hyperparameters of a model can be used to tune the model to achieve the best possible performance\n", + " on a given dataset. By examining the hyperparameters of a model, you can gain insight into how the model\n", + " was trained and how it might be improved.\n", + " \"\"\"\n", + " hyperparameters = model.model.get_xgb_params() # dictionary of hyperparameters\n", + "\n", + " # turn the dictionary into a table where each row contains a hyperparameter and its value\n", + " # (the `if v` filter skips hyperparameters whose value is unset or otherwise falsy)\n", + " return [{\"Hyperparam\": k, \"Value\": v} for k, v in hyperparameters.items() if v]\n", + "\n", + "\n", + "result = run_test(\"my_custom_tests.Hyperparameters\", inputs={\"model\": \"model\"})\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since the test has been run and logged, you can add it to your documentation using the same process as above. It should look like this:\n", + "\n", + "![screenshot showing hyperparameters test](./hyperparameters-custom-metric.png)\n", + "\n", + "For our simple toy model, there aren't really any meaningful hyperparameters, but you can see how this could be useful for more complex models that have gone through hyperparameter tuning."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Custom Test: External API Call\n", + "\n", + "This custom test will make an external API call to get a list of fake users and display them as tables. This demonstrates how you might integrate external data sources into your model documentation in a programmatic way. You could, for instance, set up a pipeline that runs a test like this every day to keep your model documentation in sync with an external system." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import requests\n", + "import random\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.ExternalAPI\")\n", + "def external_api():\n", + " \"\"\"This test calls an external API to get a list of fake users. It then creates\n", + " tables with the relevant data so they can be displayed in the documentation.\n", + "\n", + " The purpose of this test is to demonstrate how to call an external API and use the\n", + " data in a test. 
A test like this could even be set up to run in a scheduled\n", + " pipeline to keep your documentation in sync with an external data source.\n", + " \"\"\"\n", + " url = \"https://jsonplaceholder.typicode.com/users\"\n", + " response = requests.get(url, timeout=10) # avoid hanging if the API is unreachable\n", + " response.raise_for_status() # fail loudly on HTTP errors\n", + " data = response.json()\n", + "\n", + " # build one table per group of users, keyed by the table title\n", + " return {\n", + " \"Model Owners/Stakeholders\": [\n", + " {\n", + " \"Name\": user[\"name\"],\n", + " \"Role\": random.choice([\"Owner\", \"Stakeholder\"]),\n", + " \"Email\": user[\"email\"],\n", + " \"Phone\": user[\"phone\"],\n", + " \"Slack Handle\": f\"@{user['name'].lower().replace(' ', '.')}\",\n", + " }\n", + " for user in data[:3]\n", + " ],\n", + " \"Model Developers\": [\n", + " {\n", + " \"Name\": user[\"name\"],\n", + " \"Role\": \"Developer\",\n", + " \"Email\": user[\"email\"],\n", + " }\n", + " for user in data[3:7]\n", + " ],\n", + " \"Model Validators\": [\n", + " {\n", + " \"Name\": user[\"name\"],\n", + " \"Role\": \"Validator\",\n", + " \"Email\": user[\"email\"],\n", + " }\n", + " for user in data[7:]\n", + " ],\n", + " }\n", + "\n", + "\n", + "result = run_test(\"my_custom_tests.ExternalAPI\")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Again, you can add this to your documentation to see how it looks:\n", + "\n", + "![screenshot showing external API test](./external-data-custom-test.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Custom Test: Passing Parameters\n", + "\n", + "Custom test functions, as stated earlier, can take both inputs and params. When you define your function, there is no need to distinguish between the two; the ValidMind Library will handle that for you. 
You simply need to add both to the function as arguments and the library will pass in the correct values.\n", + "\n", + "So for instance, if you wanted to parameterize the first custom test we created, the confusion matrix, you could do so like this:\n", + "\n", + "```python\n", + "def confusion_matrix(dataset: VMDataset, model: VMModel, my_param: str = \"Default Value\"):\n", + " pass\n", + "```\n", + "\n", + "And then when you run the test, you can pass in the parameter like this:\n", + "\n", + "```python\n", + "run_test(\n", + " \"my_custom_tests.ConfusionMatrix\",\n", + " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", + " params={\"my_param\": \"My Value\"},\n", + ")\n", + "```\n", + "\n", + "Or if you are running the entire documentation template, you would update the config like this:\n", + "\n", + "```python\n", + "test_config[\"my_custom_tests.ConfusionMatrix\"] = {\n", + " \"inputs\": {\n", + " \"dataset\": \"test_dataset\",\n", + " \"model\": \"model\",\n", + " },\n", + " \"params\": {\n", + " \"my_param\": \"My Value\",\n", + " },\n", + "}\n", + "```\n", + "\n", + "Let's go ahead and create a toy test that takes parameters and uses them in the result:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import plotly.express as px\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.ParameterExample\")\n", + "def parameter_example(\n", + " plot_title=\"Default Plot Title\", x_col=\"sepal_width\", y_col=\"sepal_length\"\n", + "):\n", + " \"\"\"This test takes three parameters and creates a scatter plot based on them.\n", + "\n", + " The purpose of this test is to demonstrate how to create a test that takes\n", + " parameters and uses them to generate a plot. 
This can be useful for creating\n", + " tests that are more flexible and can be used in a variety of scenarios.\n", + " \"\"\"\n", + " # return px.scatter(px.data.iris(), x=x_col, y=y_col, color=\"species\")\n", + " return px.scatter(\n", + " px.data.iris(), x=x_col, y=y_col, color=\"species\", title=plot_title\n", + " )\n", + "\n", + "\n", + "result = run_test(\n", + " \"my_custom_tests.ParameterExample\",\n", + " params={\n", + " \"plot_title\": \"My Cool Plot\",\n", + " \"x_col\": \"sepal_width\",\n", + " \"y_col\": \"sepal_length\",\n", + " },\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Play around with this and see how you can use parameters, default values and other features to make your custom tests more flexible and useful.\n", + "\n", + "Here's how this one looks in the documentation:\n", + "![screenshot showing parameterized test](./parameterized-custom-metric.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Custom Test: Multiple Tables and Plots in a Single Test\n", + "\n", + "Custom test functions, as stated earlier, can return more than just one table or plot. In fact, any number of tables and plots can be returned. 
Let's see an example of this:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import plotly.express as px\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.ComplexOutput\")\n", + "def complex_output():\n", + " \"\"\"This test demonstrates how to return many tables and figures in a single test\"\"\"\n", + " # create a couple tables\n", + " table = [{\"A\": 1, \"B\": 2}, {\"A\": 3, \"B\": 4}]\n", + " table2 = [{\"C\": 5, \"D\": 6}, {\"C\": 7, \"D\": 8}]\n", + "\n", + " # create a few figures showing some random data\n", + " fig1 = px.line(x=np.arange(10), y=np.random.rand(10), title=\"Random Line Plot\")\n", + " fig2 = px.bar(x=[\"A\", \"B\", \"C\"], y=np.random.rand(3), title=\"Random Bar Plot\")\n", + " fig3 = px.scatter(\n", + " x=np.random.rand(10), y=np.random.rand(10), title=\"Random Scatter Plot\"\n", + " )\n", + "\n", + " return (\n", + " {\n", + " \"My Cool Table\": table,\n", + " \"Another Table\": table2,\n", + " },\n", + " {\n", + " # Figures support the same dict-of-titles convention as tables.\n", + " # These titles flow into the document media registry as\n", + " # \"Figure N. \" alongside table captions.\n", + " \"Random Line Plot\": fig1,\n", + " \"Random Bar Plot\": fig2,\n", + " \"Random Scatter Plot\": fig3,\n", + " },\n", + " )\n", + "\n", + "\n", + "result = run_test(\"my_custom_tests.ComplexOutput\")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Notice how you can return the tables as a dictionary where the key is the title of the table and the value is the table itself. The same convention works for **figures** — wrap them in a dict whose keys are the titles you want shown in the document media registry (e.g. *Figure 7. Random Line Plot*). 
You could also just return the figures by themselves but this way you can give them a title to more easily identify them in the result.\n", + "\n", + "![screenshot showing multiple tables and plots](./multiple-tables-plots-custom-metric.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_5__'></a>\n", + "\n", + "### Custom Test: Images\n", + "\n", + "If you are using a plotting library that isn't supported by ValidMind (i.e. not `matplotlib` or `plotly`), you can still return the image directly as a bytes-like object. This could also be used to bring any type of image into your documentation in a programmatic way. For instance, you may want to include a diagram of your model architecture or a screenshot of a dashboard that your model is integrated with. As long as you can produce the image with Python or open it from a file, you can include it in your documentation." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import io\n", + "import matplotlib.pyplot as plt\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.Image\")\n", + "def image():\n", + " \"\"\"This test demonstrates how to return an image in a test\"\"\"\n", + "\n", + " # create a simple plot\n", + " fig, ax = plt.subplots()\n", + " ax.plot([1, 2, 3, 4])\n", + " ax.set_title(\"Simple Line Plot\")\n", + "\n", + " # save the plot as a PNG image (in-memory buffer)\n", + " img_data = io.BytesIO()\n", + " fig.savefig(img_data, format=\"png\")\n", + " img_data.seek(0)\n", + "\n", + " plt.close() # close the plot to avoid displaying it\n", + "\n", + " return img_data.read()\n", + "\n", + "\n", + "result = run_test(\"my_custom_tests.Image\")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Adding this custom test to your documentation will display the image:\n", + "\n", + "![screenshot showing image custom test](./image-in-custom-metric.png)" + ] + }, + { + 
"cell_type": "markdown", + "metadata": {}, + "source": [ + "If you want to log an existing image file as a test result, you can do so by passing the path to the image as a parameter to the custom test and then opening the file in the test function. Here's an example:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.MyPNGCorrelationMatrix\")\n", + "def Image(path: str):\n", + " \"\"\"Opens a PNG image file and logs it as a test result to ValidMind\"\"\"\n", + " if not path.endswith(\".png\"):\n", + " raise ValueError(\"Image must be a PNG file\")\n", + "\n", + " # return raw image bytes\n", + " with open(path, \"rb\") as f:\n", + " return f.read()\n", + " \n", + "run_test(\n", + " \"my_custom_tests.MyPNGCorrelationMatrix\",\n", + " params={\"path\": \"./pearson-correlation-matrix.png\"},\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The image is displayed in the test result:\n", + "\n", + "![screenshot showing image from file](./pearson-correlation-matrix-test-output.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_6__'></a>\n", + "\n", + "### Custom Test: Description\n", + "\n", + "If you want to write your own description for a custom test result instead of having one generated by an LLM, you can do so by returning a string from your test."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "@vm.test(\"my_custom_tests.MyCustomTest\")\n", + "def my_custom_test(dataset, model):\n", + " \"\"\"\n", + " This is a custom test that computes the confusion matrix for a binary classification model and returns a string as the test description.\n", + " \"\"\"\n", + " y_true = dataset.y\n", + " y_pred = dataset.y_pred(model)\n", + "\n", + " confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", + "\n", + " cm_display = metrics.ConfusionMatrixDisplay(\n", + " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", + " )\n", + " cm_display.plot()\n", + "\n", + " plt.close() # close the plot to avoid displaying it\n", + "\n", + " return cm_display.figure_, \"Test Description - Confusion Matrix\", pd.DataFrame({\"Value\": [1, 2, 3]}) # return the figure, a description string, and a table\n", + "\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Notice that the test result description has been customized. The same result description will be displayed in the UI." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.MyCustomTest\",\n", + " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Conclusion\n", + "\n", + "In this notebook, we have demonstrated how to create custom tests in ValidMind. We have shown how to define custom test functions, register them with the ValidMind Library, run them against models and datasets, and add them to model documentation templates. We have also shown how to return tables and plots from custom tests and how to use them in the ValidMind Platform. 
We hope this tutorial has been helpful in understanding how to create and use custom tests in ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc8_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc8_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade command for the changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-bcdac57ebb8d440f86ba120ee6511db3" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.5" + } }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - "- [Implement a Custom Test](#toc3__) \n", - "- [Run the Custom Test](#toc4__) \n", - " - [Setup the Model and Dataset](#toc4_1__) \n", - " - [Run the Custom Test](#toc4_2__) \n", - "- [Adding Custom Test to Model Documentation](#toc5__) \n", - "- [Some More Custom Tests](#toc6__) \n", - " - [Custom Test: Table of Model Hyperparameters](#toc6_1__) \n", - " - [Custom Test: External API Call](#toc6_2__) \n", - " - [Custom Test: Passing Parameters](#toc6_3__) \n", - " - [Custom Test: Multiple Tables and Plots in a Single Test](#toc6_4__) \n", - " - [Custom Test: Images](#toc6_5__) \n", - " - [Custom Test: Description](#toc6_6__) \n", - "- [Conclusion](#toc7__) \n", - "- [Next steps](#toc8__) \n", - " - [Work with your model 
documentation](#toc8_1__) \n", - " - [Discover more learning resources](#toc8_2__) \n", - "- [Upgrade ValidMind](#toc9__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model\u2019s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. 
These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom test can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Implement a Custom Test\n", - "\n", - "Let's start off by creating a simple custom test that creates a Confusion Matrix for a binary classification model. We will use the `sklearn.metrics.confusion_matrix` function to calculate the confusion matrix and then display it as a heatmap using `plotly`. 
(This is already a built-in test in ValidMind, but we will use it as an example to demonstrate how to create custom tests.)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import matplotlib.pyplot as plt\n", - "from sklearn import metrics\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", - "def confusion_matrix(dataset, model):\n", - " \"\"\"The confusion matrix is a table that is often used to describe the performance of a classification model on a set of data for which the true values are known.\n", - "\n", - " The confusion matrix is a 2x2 table that contains 4 values:\n", - "\n", - " - True Positive (TP): the number of correct positive predictions\n", - " - True Negative (TN): the number of correct negative predictions\n", - " - False Positive (FP): the number of incorrect positive predictions\n", - " - False Negative (FN): the number of incorrect negative predictions\n", - "\n", - " The confusion matrix can be used to assess the holistic performance of a classification model by showing the accuracy, precision, recall, and F1 score of the model on a single figure.\n", - " \"\"\"\n", - " y_true = dataset.y\n", - " y_pred = dataset.y_pred(model)\n", - "\n", - " confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", - "\n", - " cm_display = metrics.ConfusionMatrixDisplay(\n", - " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", - " )\n", - " cm_display.plot()\n", - "\n", - " plt.close() # close the plot to avoid displaying it\n", - "\n", - " return cm_display.figure_ # return the figure object itself" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Thats our custom test defined and ready to go... Let's take a look at whats going on here:\n", - "\n", - "- The function `confusion_matrix` takes two arguments `dataset` and `model`. 
This is a VMDataset and VMModel object respectively.\n", - "- The function docstring provides a description of what the test does. This will be displayed along with the result in this notebook as well as in the ValidMind Platform.\n", - "- The function body calculates the confusion matrix using the `sklearn.metrics.confusion_matrix` function and then plots it using `sklearn.metric.ConfusionMatrixDisplay`.\n", - "- The function then returns the `ConfusionMatrixDisplay.figure_` object - this is important as the ValidMind Library expects the output of the custom test to be a plot or a table.\n", - "- The `@vm.test` decorator is doing the work of creating a wrapper around the function that will allow it to be run by the ValidMind Library. It also registers the test so it can be found by the ID `my_custom_tests.ConfusionMatrix` (see the section below on how test IDs work in ValidMind and why this format is important)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Run the Custom Test\n", - "\n", - "Now that we have defined and registered our custom test, lets see how we can run it and properly use it in the ValidMind Platform." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Setup the Model and Dataset\n", - "\n", - "First let's setup a an example model and dataset to run our custom metic against. Since this is a Confusion Matrix, we will use the Customer Churn dataset that ValidMind provides and train a simple XGBoost model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "from validmind.datasets.classification import customer_churn\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", - "\n", - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]\n", - "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", - "y_val = validation_df[customer_churn.target_column]\n", - "\n", - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Easy enough! Now we have a model and dataset setup and trained. One last thing to do is bring the dataset and model into the ValidMind Library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# for now, we'll just use the test dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " target_column=customer_churn.target_column,\n", - " input_id=\"test_dataset\",\n", - ")\n", - "\n", - "vm_model = vm.init_model(model, input_id=\"model\")\n", - "\n", - "# link the model to the dataset\n", - "vm_test_ds.assign_predictions(model=vm_model)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Run the Custom Test\n", - "\n", - "Now that we have our model and dataset setup, we have everything we need to run our custom test. 
We can do this by importing the `run_test` function from the `validmind.tests` module and passing in the test ID of our custom test along with the model and dataset we want to run it against.\n", - "\n", - ">Notice how the `inputs` dictionary is used to map an `input_id` which we set above to the `model` and `dataset` keys that are expected by our custom test function. This is how the ValidMind Library knows which inputs to pass to different tests and is key when using many different datasets and models." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import run_test\n", - "\n", - "result = run_test(\n", - " \"my_custom_tests.ConfusionMatrix\",\n", - " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You'll notice that the docstring becomes a markdown description of the test. The figure is then displayed as the test result. What you see above is how it will look in the ValidMind Platform as well. Let's go ahead and log the result to see how that works." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Adding Custom Test to Model Documentation\n", - "\n", - "To do this, go to the documentation page of the model you registered above and navigate to the `Model Development` -> `Model Evaluation` section. Then hover between any existing content block to reveal the `+` button as shown in the screenshot below.\n", - "\n", - "![screenshot showing insert button for test-driven blocks](./insert-test-driven-block.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now click on the `+` button and select the `Test-Driven Block` option. 
This will open a dialog where you can select `My Custom Tests Confusion Matrix` from the list of available tests. You can preview the result and then click `Insert Block` to add it to the documentation.\n", - "\n", - "![screenshot showing how to insert a test-driven block](./insert-test-driven-block-custom.png)\n", - "\n", - "The test should match the result you see above. It is now part of your documentation and will now be run everytime you run `vm.run_documentation_tests()` for your model. Let's do that now." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.reload()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If you preview the template, it should show the custom test in the `Model Development`->`Model Evaluation` section:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Just so we can run all of the tests in the template, let's initialize the train and raw dataset.\n", - "\n", - "(Refer to [**Quickstart for documentation**](../../../quickstart/quickstart_documentation.ipynb) and the ValidMind docs for more information on what we are doing here)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " class_labels=customer_churn.class_labels,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "vm_train_ds.assign_predictions(model=vm_model)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To run all the tests in the template, you can use 
the `vm.run_documentation_tests()` and pass the inputs we initialized above and the demo config from our customer_churn module. We will have to add a section to the config for our new test to tell it which inputs it should receive. This is done by simply adding a new element in the config dictionary where the key is the ID of the test and the value is a dictionary with the following structure:\n", - "```python\n", - "{\n", - " \"inputs\": {\n", - " \"model\": \"test_dataset\",\n", - " \"dataset\": \"model\",\n", - " }\n", - "}\n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = customer_churn.get_demo_test_config()\n", - "test_config[\"my_custom_tests.ConfusionMatrix\"] = {\n", - " \"inputs\": {\n", - " \"dataset\": \"test_dataset\",\n", - " \"model\": \"model\",\n", - " }\n", - "}\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Some More Custom Tests\n", - "\n", - "Now that you understand the entire process of creating custom tests and using them in your documentation, let's create a few more to see different ways you can utilize custom tests." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Custom Test: Table of Model Hyperparameters\n", - "\n", - "This custom test will display a table of the hyperparameters used in the model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.Hyperparameters\")\n", - "def hyperparameters(model):\n", - " \"\"\"The hyperparameters of a machine learning model are the settings that control the learning process.\n", - " These settings are specified before the learning process begins and can have a significant impact on the\n", - " performance of the model.\n", - "\n", - " The hyperparameters of a model can be used to tune the model to achieve the best possible performance\n", - " on a given dataset. By examining the hyperparameters of a model, you can gain insight into how the model\n", - " was trained and how it might be improved.\n", - " \"\"\"\n", - " hyperparameters = model.model.get_xgb_params() # dictionary of hyperparameters\n", - "\n", - " # turn the dictionary into a table where each row contains a hyperparameter and its value\n", - " return [{\"Hyperparam\": k, \"Value\": v} for k, v in hyperparameters.items() if v]\n", - "\n", - "\n", - "result = run_test(\"my_custom_tests.Hyperparameters\", inputs={\"model\": \"model\"})\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Since the test has been run and logged, you can add it to your documentation using the same process as above. It should look like this:\n", - "\n", - "![screenshot showing hyperparameters test](./hyperparameters-custom-metric.png)\n", - "\n", - "For our simple toy model, there are aren't really any proper hyperparameters but you can see how this could be useful for more complex models that have gone through hyperparameter tuning." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Custom Test: External API Call\n", - "\n", - "This custom test will make an external API call to get the current BTC price and display it as a table. This demonstrates how you might integrate external data sources into your model documentation in a programmatic way. You could, for instance, setup a pipeline that runs a test like this every day to keep your model documentation in sync with an external system." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import requests\n", - "import random\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.ExternalAPI\")\n", - "def external_api():\n", - " \"\"\"This test calls an external API to get a list of fake users. It then creates\n", - " a table with the relevant data so it can be displayed in the documentation.\n", - "\n", - " The purpose of this test is to demonstrate how to call an external API and use the\n", - " data in a test. 
A test like this could even be setup to run in a scheduled\n", - " pipeline to keep your documentation in-sync with an external data source.\n", - " \"\"\"\n", - " url = \"https://jsonplaceholder.typicode.com/users\"\n", - " response = requests.get(url)\n", - " data = response.json()\n", - "\n", - " # extract the time and the current BTC price in USD\n", - " return {\n", - " \"Model Owners/Stakeholders\": [\n", - " {\n", - " \"Name\": user[\"name\"],\n", - " \"Role\": random.choice([\"Owner\", \"Stakeholder\"]),\n", - " \"Email\": user[\"email\"],\n", - " \"Phone\": user[\"phone\"],\n", - " \"Slack Handle\": f\"@{user['name'].lower().replace(' ', '.')}\",\n", - " }\n", - " for user in data[:3]\n", - " ],\n", - " \"Model Developers\": [\n", - " {\n", - " \"Name\": user[\"name\"],\n", - " \"Role\": \"Developer\",\n", - " \"Email\": user[\"email\"],\n", - " }\n", - " for user in data[3:7]\n", - " ],\n", - " \"Model Validators\": [\n", - " {\n", - " \"Name\": user[\"name\"],\n", - " \"Role\": \"Validator\",\n", - " \"Email\": user[\"email\"],\n", - " }\n", - " for user in data[7:]\n", - " ],\n", - " }\n", - "\n", - "\n", - "result = run_test(\"my_custom_tests.ExternalAPI\")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Again, you can add this to your documentation to see how it looks:\n", - "\n", - "![screenshot showing BTC price metric](./external-data-custom-test.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Custom Test: Passing Parameters\n", - "\n", - "Custom test functions, as stated earlier, can take both inputs and params. When you define your function there is no need to distinguish between the two, the ValidMind Library will handle that for you. 
You simply need to add both to the function as arguments and the library will pass in the correct values.\n", - "\n", - "So for instance, if you wanted to parameterize the first custom test we created, the confusion matrix, you could do so like this:\n", - "\n", - "```python\n", - "def confusion_matrix(dataset: VMDataset, model: VMModel, my_param: str = \"Default Value\"):\n", - " pass\n", - "```\n", - "\n", - "And then when you run the test, you can pass in the parameter like this:\n", - "\n", - "```python\n", - "vm.run_test(\n", - " \"my_custom_tests.ConfusionMatrix\",\n", - " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", - " params={\"my_param\": \"My Value\"},\n", - ")\n", - "```\n", - "\n", - "Or if you are running the entire documentation template, you would update the config like this:\n", - "\n", - "```python\n", - "test_config[\"my_custom_tests.ConfusionMatrix\"] = {\n", - " \"inputs\": {\n", - " \"dataset\": \"test_dataset\",\n", - " \"model\": \"model\",\n", - " },\n", - " \"params\": {\n", - " \"my_param\": \"My Value\",\n", - " },\n", - "}\n", - "```\n", - "\n", - "Let's go ahead and create a toy test that takes a parameter and uses it in the result:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import plotly.express as px\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.ParameterExample\")\n", - "def parameter_example(\n", - " plot_title=\"Default Plot Title\", x_col=\"sepal_width\", y_col=\"sepal_length\"\n", - "):\n", - " \"\"\"This test takes two parameters and creates a scatter plot based on them.\n", - "\n", - " The purpose of this test is to demonstrate how to create a test that takes\n", - " parameters and uses them to generate a plot. 
This can be useful for creating\n", - " tests that are more flexible and can be used in a variety of scenarios.\n", - " \"\"\"\n", - " # return px.scatter(px.data.iris(), x=x_col, y=y_col, color=\"species\")\n", - " return px.scatter(\n", - " px.data.iris(), x=x_col, y=y_col, color=\"species\", title=plot_title\n", - " )\n", - "\n", - "\n", - "result = run_test(\n", - " \"my_custom_tests.ParameterExample\",\n", - " params={\n", - " \"plot_title\": \"My Cool Plot\",\n", - " \"x_col\": \"sepal_width\",\n", - " \"y_col\": \"sepal_length\",\n", - " },\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Play around with this and see how you can use parameters, default values and other features to make your custom tests more flexible and useful.\n", - "\n", - "Here's how this one looks in the documentation:\n", - "![screenshot showing parameterized test](./parameterized-custom-metric.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4__'></a>\n", - "\n", - "### Custom Test: Multiple Tables and Plots in a Single Test\n", - "\n", - "Custom test functions, as stated earlier, can return more than just one table or plot. In fact, any number of tables and plots can be returned. 
Let's see an example of this:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import plotly.express as px\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.ComplexOutput\")\n", - "def complex_output():\n", - " \"\"\"This test demonstrates how to return many tables and figures in a single test\"\"\"\n", - " # create a couple tables\n", - " table = [{\"A\": 1, \"B\": 2}, {\"A\": 3, \"B\": 4}]\n", - " table2 = [{\"C\": 5, \"D\": 6}, {\"C\": 7, \"D\": 8}]\n", - "\n", - " # create a few figures showing some random data\n", - " fig1 = px.line(x=np.arange(10), y=np.random.rand(10), title=\"Random Line Plot\")\n", - " fig2 = px.bar(x=[\"A\", \"B\", \"C\"], y=np.random.rand(3), title=\"Random Bar Plot\")\n", - " fig3 = px.scatter(\n", - " x=np.random.rand(10), y=np.random.rand(10), title=\"Random Scatter Plot\"\n", - " )\n", - "\n", - " return (\n", - " {\n", - " \"My Cool Table\": table,\n", - " \"Another Table\": table2,\n", - " },\n", - " {\n", - " # Figures support the same dict-of-titles convention as tables.\n", - " # These titles flow into the document media registry as\n", - " # \"Figure N. <title>\" alongside table captions.\n", - " \"Random Line Plot\": fig1,\n", - " \"Random Bar Plot\": fig2,\n", - " \"Random Scatter Plot\": fig3,\n", - " },\n", - " )\n", - "\n", - "\n", - "result = run_test(\"my_custom_tests.ComplexOutput\")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Notice how you can return the tables as a dictionary where the key is the title of the table and the value is the table itself. The same convention works for **figures** \u2014 wrap them in a dict whose keys are the titles you want shown in the document media registry (e.g. *Figure 7. Random Line Plot*). 
You could also just return the figures by themselves but this way you can give them a title to more easily identify them in the result.\n", - "\n", - "![screenshot showing multiple tables and plots](./multiple-tables-plots-custom-metric.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_5__'></a>\n", - "\n", - "### Custom Test: Images\n", - "\n", - "If you are using a plotting library that isn't supported by ValidMind (i.e. not `matplotlib` or `plotly`), you can still return the image directly as a bytes-like object. This could also be used to bring any type of image into your documentation in a programmatic way. For instance, you may want to include a diagram of your model architecture or a screenshot of a dashboard that your model is integrated with. As long as you can produce the image with Python or open it from a file, you can include it in your documentation." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import io\n", - "import matplotlib.pyplot as plt\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.Image\")\n", - "def image():\n", - " \"\"\"This test demonstrates how to return an image in a test\"\"\"\n", - "\n", - " # create a simple plot\n", - " fig, ax = plt.subplots()\n", - " ax.plot([1, 2, 3, 4])\n", - " ax.set_title(\"Simple Line Plot\")\n", - "\n", - " # save the plot as a PNG image (in-memory buffer)\n", - " img_data = io.BytesIO()\n", - " fig.savefig(img_data, format=\"png\")\n", - " img_data.seek(0)\n", - "\n", - " plt.close() # close the plot to avoid displaying it\n", - "\n", - " return img_data.read()\n", - "\n", - "\n", - "result = run_test(\"my_custom_tests.Image\")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Adding this custom test to your documentation will display the image:\n", - "\n", - "![screenshot showing image custom test](./image-in-custom-metric.png)" - ] - }, - { - 
"cell_type": "markdown", - "metadata": {}, - "source": [ - "If you want to log an image as a test result, you can do so by passing the path to the image as a parameter to the custom test and then opening the file in the test function. Here's an example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.MyPNGCorrelationMatrix\")\n", - "def Image(path: str):\n", - " \"\"\"Opens a png image file and logs it as a test result to ValidMind\"\"\"\n", - " if not path.endswith(\".png\"):\n", - " raise ValueError(\"Image must be a PNG file\")\n", - "\n", - " # return raw image bytes\n", - " with open(path, \"rb\") as f:\n", - " return f.read()\n", - " \n", - "run_test(\n", - " \"my_custom_tests.MyPNGCorrelationMatrix\",\n", - " params={\"path\": \"./pearson-correlation-matrix.png\"},\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The image is displayed in the test result:\n", - "\n", - "![screenshot showing image from file](./pearson-correlation-matrix-test-output.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_6__'></a>\n", - "\n", - "### Custom Test: Description\n", - "\n", - "If you want to write a custom test description for your custom test instead of it is interpreted through llm, you can do so by returning string in your test." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "@vm.test(\"my_custom_tests.MyCustomTest\")\n", - "def my_custom_test(dataset, model):\n", - " \"\"\"\n", - " This is a custom computed test that computes confusion matrix for a binary classification model and return a string as a test description.\n", - " \"\"\"\n", - " y_true = dataset.y\n", - " y_pred = dataset.y_pred(model)\n", - "\n", - " confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", - "\n", - " cm_display = metrics.ConfusionMatrixDisplay(\n", - " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", - " )\n", - " cm_display.plot()\n", - "\n", - " plt.close() # close the plot to avoid displaying it\n", - "\n", - " return cm_display.figure_, \"Test Description - Confusion Matrix\", pd.DataFrame({\"Value\": [1, 2, 3]}) # return the figure object itself\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can see here test result description has been customized here. The same result description will be displayed in the UI." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.MyCustomTest\",\n", - " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Conclusion\n", - "\n", - "In this notebook, we have demonstrated how to create custom tests in ValidMind. We have shown how to define custom test functions, register them with the ValidMind Library, run them against models and datasets, and add them to model documentation templates. We have also shown how to return tables and plots from custom tests and how to use them in the ValidMind Platform. 
We hope this tutorial has been helpful in understanding how to create and use custom tests in ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way \u2014 use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc8_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc8_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you\u2019ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-bcdac57ebb8d440f86ba120ee6511db3", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright \u00a9 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.5" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} + "nbformat": 4, + "nbformat_minor": 4 +} \ No newline at end of file diff --git a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb index e3a7a3b94..61591131e 100644 --- a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb @@ -1,931 +1,932 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Explore test suites\n", - "\n", - "Explore ValidMind test suites, pre-built collections of related tests used to evaluate specific aspects of your model. Retrieve available test suites and details for tests within a suite to understand their functionality, allowing you to select the appropriate test suites for your use cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Install the ValidMind Library](#toc2__) \n", - "- [List available test suites](#toc3__) \n", - "- [View test suite details](#toc4__) \n", - " - [View test details](#toc4_1__) \n", - "- [Next steps](#toc5__) \n", - " - [Discover more learning resources](#toc5_1__) \n", - "- [Upgrade ValidMind](#toc6__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. 
For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. 
Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## List available test suites\n", - "After we import the ValidMind Library, we'll call [test_suites.list_suites()](https://docs.validmind.ai/validmind/validmind/test_suites.html#list_suites) to retrieve a structured list of all available test suites, that includes each suite's name, description, and associated tests:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Explore test suites\n", + "\n", + 
"Explore ValidMind test suites, pre-built collections of related tests used to evaluate specific aspects of your model. Retrieve available test suites and details for tests within a suite to understand their functionality, allowing you to select the appropriate test suites for your use cases." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Install the ValidMind Library](#toc2__) \n", + "- [List available test suites](#toc3__) \n", + "- [View test suite details](#toc4__) \n", + " - [View test details](#toc4_1__) \n", + "- [Next steps](#toc5__) \n", + " - [Discover more learning resources](#toc5_1__) \n", + "- [Upgrade ValidMind](#toc6__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. 
If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_9e889 th {\n", - " text-align: left;\n", - "}\n", - "#T_9e889_row0_col0, #T_9e889_row0_col1, #T_9e889_row0_col2, #T_9e889_row0_col3, #T_9e889_row1_col0, #T_9e889_row1_col1, #T_9e889_row1_col2, #T_9e889_row1_col3, #T_9e889_row2_col0, #T_9e889_row2_col1, #T_9e889_row2_col2, #T_9e889_row2_col3, #T_9e889_row3_col0, #T_9e889_row3_col1, #T_9e889_row3_col2, #T_9e889_row3_col3, #T_9e889_row4_col0, #T_9e889_row4_col1, #T_9e889_row4_col2, #T_9e889_row4_col3, #T_9e889_row5_col0, #T_9e889_row5_col1, #T_9e889_row5_col2, #T_9e889_row5_col3, #T_9e889_row6_col0, #T_9e889_row6_col1, #T_9e889_row6_col2, #T_9e889_row6_col3, #T_9e889_row7_col0, #T_9e889_row7_col1, #T_9e889_row7_col2, #T_9e889_row7_col3, #T_9e889_row8_col0, #T_9e889_row8_col1, #T_9e889_row8_col2, #T_9e889_row8_col3, #T_9e889_row9_col0, #T_9e889_row9_col1, #T_9e889_row9_col2, #T_9e889_row9_col3, #T_9e889_row10_col0, #T_9e889_row10_col1, #T_9e889_row10_col2, #T_9e889_row10_col3, #T_9e889_row11_col0, #T_9e889_row11_col1, #T_9e889_row11_col2, #T_9e889_row11_col3, #T_9e889_row12_col0, #T_9e889_row12_col1, #T_9e889_row12_col2, #T_9e889_row12_col3, #T_9e889_row13_col0, #T_9e889_row13_col1, #T_9e889_row13_col2, #T_9e889_row13_col3, #T_9e889_row14_col0, #T_9e889_row14_col1, #T_9e889_row14_col2, #T_9e889_row14_col3, #T_9e889_row15_col0, #T_9e889_row15_col1, #T_9e889_row15_col2, #T_9e889_row15_col3, #T_9e889_row16_col0, #T_9e889_row16_col1, #T_9e889_row16_col2, #T_9e889_row16_col3, #T_9e889_row17_col0, #T_9e889_row17_col1, #T_9e889_row17_col2, #T_9e889_row17_col3, #T_9e889_row18_col0, #T_9e889_row18_col1, #T_9e889_row18_col2, #T_9e889_row18_col3, #T_9e889_row19_col0, #T_9e889_row19_col1, #T_9e889_row19_col2, #T_9e889_row19_col3, #T_9e889_row20_col0, #T_9e889_row20_col1, #T_9e889_row20_col2, #T_9e889_row20_col3, #T_9e889_row21_col0, #T_9e889_row21_col1, #T_9e889_row21_col2, #T_9e889_row21_col3, #T_9e889_row22_col0, #T_9e889_row22_col1, 
#T_9e889_row22_col2, #T_9e889_row22_col3, #T_9e889_row23_col0, #T_9e889_row23_col1, #T_9e889_row23_col2, #T_9e889_row23_col3, #T_9e889_row24_col0, #T_9e889_row24_col1, #T_9e889_row24_col2, #T_9e889_row24_col3, #T_9e889_row25_col0, #T_9e889_row25_col1, #T_9e889_row25_col2, #T_9e889_row25_col3, #T_9e889_row26_col0, #T_9e889_row26_col1, #T_9e889_row26_col2, #T_9e889_row26_col3, #T_9e889_row27_col0, #T_9e889_row27_col1, #T_9e889_row27_col2, #T_9e889_row27_col3, #T_9e889_row28_col0, #T_9e889_row28_col1, #T_9e889_row28_col2, #T_9e889_row28_col3, #T_9e889_row29_col0, #T_9e889_row29_col1, #T_9e889_row29_col2, #T_9e889_row29_col3 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_9e889\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_9e889_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_9e889_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", - " <th id=\"T_9e889_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_9e889_level0_col3\" class=\"col_heading level0 col3\" >Tests</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_9e889_row0_col0\" class=\"data row0 col0\" >classifier_model_diagnosis</td>\n", - " <td id=\"T_9e889_row0_col1\" class=\"data row0 col1\" >ClassifierDiagnosis</td>\n", - " <td id=\"T_9e889_row0_col2\" class=\"data row0 col2\" >Test suite for sklearn classifier model diagnosis tests</td>\n", - " <td id=\"T_9e889_row0_col3\" class=\"data row0 col3\" >validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row1_col0\" class=\"data row1 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_9e889_row1_col1\" class=\"data row1 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_9e889_row1_col2\" class=\"data row1 col2\" >Full test suite for binary classification 
models.</td>\n", - " <td id=\"T_9e889_row1_col3\" class=\"data row1 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row2_col0\" class=\"data row2 col0\" >classifier_metrics</td>\n", - " <td id=\"T_9e889_row2_col1\" class=\"data row2 col1\" >ClassifierMetrics</td>\n", - " <td id=\"T_9e889_row2_col2\" class=\"data row2 col2\" >Test suite for sklearn classifier metrics</td>\n", - " <td id=\"T_9e889_row2_col3\" class=\"data row2 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, 
validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row3_col0\" class=\"data row3 col0\" >classifier_model_validation</td>\n", - " <td id=\"T_9e889_row3_col1\" class=\"data row3 col1\" >ClassifierModelValidation</td>\n", - " <td id=\"T_9e889_row3_col2\" class=\"data row3 col2\" >Test suite for binary classification models.</td>\n", - " <td id=\"T_9e889_row3_col3\" class=\"data row3 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row4_col0\" class=\"data row4 col0\" >classifier_validation</td>\n", - " <td id=\"T_9e889_row4_col1\" class=\"data row4 col1\" >ClassifierPerformance</td>\n", - " <td id=\"T_9e889_row4_col2\" class=\"data row4 col2\" >Test suite for sklearn classifier models</td>\n", - " <td 
id=\"T_9e889_row4_col3\" class=\"data row4 col3\" >validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row5_col0\" class=\"data row5 col0\" >cluster_full_suite</td>\n", - " <td id=\"T_9e889_row5_col1\" class=\"data row5 col1\" >ClusterFullSuite</td>\n", - " <td id=\"T_9e889_row5_col2\" class=\"data row5 col2\" >Full test suite for clustering models.</td>\n", - " <td id=\"T_9e889_row5_col3\" class=\"data row5 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.HomogeneityScore, validmind.model_validation.sklearn.CompletenessScore, validmind.model_validation.sklearn.VMeasure, validmind.model_validation.sklearn.AdjustedRandIndex, validmind.model_validation.sklearn.AdjustedMutualInformation, validmind.model_validation.sklearn.FowlkesMallowsScore, validmind.model_validation.sklearn.ClusterPerformanceMetrics, validmind.model_validation.sklearn.ClusterCosineSimilarity, validmind.model_validation.sklearn.SilhouettePlot, validmind.model_validation.ClusterSizeDistribution, validmind.model_validation.sklearn.HyperParametersTuning, validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row6_col0\" class=\"data row6 col0\" >cluster_metrics</td>\n", - " <td id=\"T_9e889_row6_col1\" class=\"data row6 col1\" >ClusterMetrics</td>\n", - " <td id=\"T_9e889_row6_col2\" class=\"data row6 col2\" >Test suite for sklearn clustering metrics</td>\n", - " <td id=\"T_9e889_row6_col3\" class=\"data row6 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.HomogeneityScore, 
validmind.model_validation.sklearn.CompletenessScore, validmind.model_validation.sklearn.VMeasure, validmind.model_validation.sklearn.AdjustedRandIndex, validmind.model_validation.sklearn.AdjustedMutualInformation, validmind.model_validation.sklearn.FowlkesMallowsScore, validmind.model_validation.sklearn.ClusterPerformanceMetrics, validmind.model_validation.sklearn.ClusterCosineSimilarity, validmind.model_validation.sklearn.SilhouettePlot</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row7_col0\" class=\"data row7 col0\" >cluster_performance</td>\n", - " <td id=\"T_9e889_row7_col1\" class=\"data row7 col1\" >ClusterPerformance</td>\n", - " <td id=\"T_9e889_row7_col2\" class=\"data row7 col2\" >Test suite for sklearn cluster performance</td>\n", - " <td id=\"T_9e889_row7_col3\" class=\"data row7 col3\" >validmind.model_validation.ClusterSizeDistribution</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row8_col0\" class=\"data row8 col0\" >embeddings_full_suite</td>\n", - " <td id=\"T_9e889_row8_col1\" class=\"data row8 col1\" >EmbeddingsFullSuite</td>\n", - " <td id=\"T_9e889_row8_col2\" class=\"data row8 col2\" >Full test suite for embeddings models.</td>\n", - " <td id=\"T_9e889_row8_col3\" class=\"data row8 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.embeddings.DescriptiveAnalytics, validmind.model_validation.embeddings.CosineSimilarityDistribution, validmind.model_validation.embeddings.ClusterDistribution, validmind.model_validation.embeddings.EmbeddingsVisualization2D, validmind.model_validation.embeddings.StabilityAnalysisRandomNoise, validmind.model_validation.embeddings.StabilityAnalysisSynonyms, validmind.model_validation.embeddings.StabilityAnalysisKeyword, validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row9_col0\" class=\"data row9 col0\" >embeddings_metrics</td>\n", - " <td 
id=\"T_9e889_row9_col1\" class=\"data row9 col1\" >EmbeddingsMetrics</td>\n", - " <td id=\"T_9e889_row9_col2\" class=\"data row9 col2\" >Test suite for embeddings metrics</td>\n", - " <td id=\"T_9e889_row9_col3\" class=\"data row9 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.embeddings.DescriptiveAnalytics, validmind.model_validation.embeddings.CosineSimilarityDistribution, validmind.model_validation.embeddings.ClusterDistribution, validmind.model_validation.embeddings.EmbeddingsVisualization2D</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row10_col0\" class=\"data row10 col0\" >embeddings_model_performance</td>\n", - " <td id=\"T_9e889_row10_col1\" class=\"data row10 col1\" >EmbeddingsPerformance</td>\n", - " <td id=\"T_9e889_row10_col2\" class=\"data row10 col2\" >Test suite for embeddings model performance</td>\n", - " <td id=\"T_9e889_row10_col3\" class=\"data row10 col3\" >validmind.model_validation.embeddings.StabilityAnalysisRandomNoise, validmind.model_validation.embeddings.StabilityAnalysisSynonyms, validmind.model_validation.embeddings.StabilityAnalysisKeyword, validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row11_col0\" class=\"data row11 col0\" >hyper_parameters_optimization</td>\n", - " <td id=\"T_9e889_row11_col1\" class=\"data row11 col1\" >KmeansParametersOptimization</td>\n", - " <td id=\"T_9e889_row11_col2\" class=\"data row11 col2\" >Test suite for sklearn hyperparameters optimization</td>\n", - " <td id=\"T_9e889_row11_col3\" class=\"data row11 col3\" >validmind.model_validation.sklearn.HyperParametersTuning, validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row12_col0\" class=\"data row12 col0\" >llm_classifier_full_suite</td>\n", - " <td id=\"T_9e889_row12_col1\" class=\"data row12 col1\" 
>LLMClassifierFullSuite</td>\n", - " <td id=\"T_9e889_row12_col2\" class=\"data row12 col2\" >Full test suite for LLM classification models.</td>\n", - " <td id=\"T_9e889_row12_col3\" class=\"data row12 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis, validmind.prompt_validation.Bias, validmind.prompt_validation.Clarity, validmind.prompt_validation.Conciseness, validmind.prompt_validation.Delimitation, validmind.prompt_validation.NegativeInstruction, validmind.prompt_validation.Robustness, validmind.prompt_validation.Specificity</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row13_col0\" class=\"data row13 col0\" >prompt_validation</td>\n", - " <td id=\"T_9e889_row13_col1\" class=\"data row13 col1\" >PromptValidation</td>\n", - " <td id=\"T_9e889_row13_col2\" class=\"data row13 col2\" >Test suite for prompt validation</td>\n", - " <td 
id=\"T_9e889_row13_col3\" class=\"data row13 col3\" >validmind.prompt_validation.Bias, validmind.prompt_validation.Clarity, validmind.prompt_validation.Conciseness, validmind.prompt_validation.Delimitation, validmind.prompt_validation.NegativeInstruction, validmind.prompt_validation.Robustness, validmind.prompt_validation.Specificity</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row14_col0\" class=\"data row14 col0\" >nlp_classifier_full_suite</td>\n", - " <td id=\"T_9e889_row14_col1\" class=\"data row14 col1\" >NLPClassifierFullSuite</td>\n", - " <td id=\"T_9e889_row14_col2\" class=\"data row14 col2\" >Full test suite for NLP classification models.</td>\n", - " <td id=\"T_9e889_row14_col3\" class=\"data row14 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row15_col0\" class=\"data 
row15 col0\" >regression_metrics</td>\n", - " <td id=\"T_9e889_row15_col1\" class=\"data row15 col1\" >RegressionMetrics</td>\n", - " <td id=\"T_9e889_row15_col2\" class=\"data row15 col2\" >Test suite for performance metrics of regression metrics</td>\n", - " <td id=\"T_9e889_row15_col3\" class=\"data row15 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row16_col0\" class=\"data row16 col0\" >regression_model_description</td>\n", - " <td id=\"T_9e889_row16_col1\" class=\"data row16 col1\" >RegressionModelDescription</td>\n", - " <td id=\"T_9e889_row16_col2\" class=\"data row16 col2\" >Test suite for performance metric of regression model of statsmodels library</td>\n", - " <td id=\"T_9e889_row16_col3\" class=\"data row16 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row17_col0\" class=\"data row17 col0\" >regression_models_evaluation</td>\n", - " <td id=\"T_9e889_row17_col1\" class=\"data row17 col1\" >RegressionModelsEvaluation</td>\n", - " <td id=\"T_9e889_row17_col2\" class=\"data row17 col2\" >Test suite for metrics comparison of regression model of statsmodels library</td>\n", - " <td id=\"T_9e889_row17_col3\" class=\"data row17 col3\" >validmind.model_validation.statsmodels.RegressionModelCoeffs, validmind.model_validation.sklearn.RegressionModelsPerformanceComparison</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row18_col0\" class=\"data row18 col0\" >regression_full_suite</td>\n", - " <td id=\"T_9e889_row18_col1\" class=\"data row18 col1\" >RegressionFullSuite</td>\n", - " <td id=\"T_9e889_row18_col2\" class=\"data row18 col2\" >Full test suite for regression models.</td>\n", - " <td id=\"T_9e889_row18_col3\" class=\"data row18 col3\" >validmind.data_validation.DatasetDescription, 
validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues, validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.RegressionErrors, validmind.model_validation.sklearn.RegressionR2Square</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row19_col0\" class=\"data row19 col0\" >regression_performance</td>\n", - " <td id=\"T_9e889_row19_col1\" class=\"data row19 col1\" >RegressionPerformance</td>\n", - " <td id=\"T_9e889_row19_col2\" class=\"data row19 col2\" >Test suite for regression model performance</td>\n", - " <td id=\"T_9e889_row19_col3\" class=\"data row19 col3\" >validmind.model_validation.sklearn.RegressionErrors, validmind.model_validation.sklearn.RegressionR2Square</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row20_col0\" class=\"data row20 col0\" >summarization_metrics</td>\n", - " <td id=\"T_9e889_row20_col1\" class=\"data row20 col1\" >SummarizationMetrics</td>\n", - " <td id=\"T_9e889_row20_col2\" class=\"data row20 col2\" >Test suite for Summarization metrics</td>\n", - " <td id=\"T_9e889_row20_col3\" class=\"data row20 col3\" >validmind.model_validation.TokenDisparity, validmind.model_validation.BleuScore, validmind.model_validation.BertScore, validmind.model_validation.ContextualRecall</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row21_col0\" class=\"data row21 col0\" >tabular_dataset</td>\n", - " <td id=\"T_9e889_row21_col1\" class=\"data row21 col1\" >TabularDataset</td>\n", - " <td id=\"T_9e889_row21_col2\" class=\"data 
row21 col2\" >Test suite for tabular datasets.</td>\n", - " <td id=\"T_9e889_row21_col3\" class=\"data row21 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row22_col0\" class=\"data row22 col0\" >tabular_dataset_description</td>\n", - " <td id=\"T_9e889_row22_col1\" class=\"data row22 col1\" >TabularDatasetDescription</td>\n", - " <td id=\"T_9e889_row22_col2\" class=\"data row22 col2\" >Test suite to extract metadata and descriptive\n", - "statistics from a tabular dataset</td>\n", - " <td id=\"T_9e889_row22_col3\" class=\"data row22 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row23_col0\" class=\"data row23 col0\" >tabular_data_quality</td>\n", - " <td id=\"T_9e889_row23_col1\" class=\"data row23 col1\" >TabularDataQuality</td>\n", - " <td id=\"T_9e889_row23_col2\" class=\"data row23 col2\" >Test suite for data quality on tabular datasets</td>\n", - " <td id=\"T_9e889_row23_col3\" class=\"data row23 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row24_col0\" 
class=\"data row24 col0\" >text_data_quality</td>\n", - " <td id=\"T_9e889_row24_col1\" class=\"data row24 col1\" >TextDataQuality</td>\n", - " <td id=\"T_9e889_row24_col2\" class=\"data row24 col2\" >Test suite for data quality on text data</td>\n", - " <td id=\"T_9e889_row24_col3\" class=\"data row24 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row25_col0\" class=\"data row25 col0\" >time_series_data_quality</td>\n", - " <td id=\"T_9e889_row25_col1\" class=\"data row25 col1\" >TimeSeriesDataQuality</td>\n", - " <td id=\"T_9e889_row25_col2\" class=\"data row25 col2\" >Test suite for data quality on time series datasets</td>\n", - " <td id=\"T_9e889_row25_col3\" class=\"data row25 col3\" >validmind.data_validation.TimeSeriesOutliers, validmind.data_validation.TimeSeriesMissingValues, validmind.data_validation.TimeSeriesFrequency</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row26_col0\" class=\"data row26 col0\" >time_series_dataset</td>\n", - " <td id=\"T_9e889_row26_col1\" class=\"data row26 col1\" >TimeSeriesDataset</td>\n", - " <td id=\"T_9e889_row26_col2\" class=\"data row26 col2\" >Test suite for time series datasets.</td>\n", - " <td id=\"T_9e889_row26_col3\" class=\"data row26 col3\" >validmind.data_validation.TimeSeriesOutliers, validmind.data_validation.TimeSeriesMissingValues, validmind.data_validation.TimeSeriesFrequency, validmind.data_validation.TimeSeriesLinePlot, validmind.data_validation.TimeSeriesHistogram, validmind.data_validation.ACFandPACFPlot, validmind.data_validation.SeasonalDecompose, validmind.data_validation.AutoSeasonality, validmind.data_validation.AutoStationarity, validmind.data_validation.RollingStatsPlot, validmind.data_validation.AutoAR, 
validmind.data_validation.AutoMA, validmind.data_validation.ScatterPlot, validmind.data_validation.LaggedCorrelationHeatmap, validmind.data_validation.EngleGrangerCoint, validmind.data_validation.SpreadPlot</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row27_col0\" class=\"data row27 col0\" >time_series_model_validation</td>\n", - " <td id=\"T_9e889_row27_col1\" class=\"data row27 col1\" >TimeSeriesModelValidation</td>\n", - " <td id=\"T_9e889_row27_col2\" class=\"data row27 col2\" >Test suite for time series model validation.</td>\n", - " <td id=\"T_9e889_row27_col3\" class=\"data row27 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.statsmodels.RegressionModelCoeffs, validmind.model_validation.sklearn.RegressionModelsPerformanceComparison</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row28_col0\" class=\"data row28 col0\" >time_series_multivariate</td>\n", - " <td id=\"T_9e889_row28_col1\" class=\"data row28 col1\" >TimeSeriesMultivariate</td>\n", - " <td id=\"T_9e889_row28_col2\" class=\"data row28 col2\" >This test suite provides a preliminary understanding of the features\n", - "and relationship in multivariate dataset. It presents various\n", - "multivariate visualizations that can help identify patterns, trends,\n", - "and relationships between pairs of variables. The visualizations are\n", - "designed to explore the relationships between multiple features\n", - "simultaneously. 
They allow you to quickly identify any patterns or\n", - "trends in the data, as well as any potential outliers or anomalies.\n", - "The individual feature distribution can also be explored to provide\n", - "insight into the range and frequency of values observed in the data.\n", - "This multivariate analysis test suite aims to provide an overview of\n", - "the data structure and guide further exploration and modeling.</td>\n", - " <td id=\"T_9e889_row28_col3\" class=\"data row28 col3\" >validmind.data_validation.ScatterPlot, validmind.data_validation.LaggedCorrelationHeatmap, validmind.data_validation.EngleGrangerCoint, validmind.data_validation.SpreadPlot</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row29_col0\" class=\"data row29 col0\" >time_series_univariate</td>\n", - " <td id=\"T_9e889_row29_col1\" class=\"data row29 col1\" >TimeSeriesUnivariate</td>\n", - " <td id=\"T_9e889_row29_col2\" class=\"data row29 col2\" >This test suite provides a preliminary understanding of the target variable(s)\n", - "used in the time series dataset. It visualizations that present the raw time\n", - "series data and a histogram of the target variable(s).\n", - "\n", - "The raw time series data provides a visual inspection of the target variable's\n", - "behavior over time. This helps to identify any patterns or trends in the data,\n", - "as well as any potential outliers or anomalies. 
The histogram of the target\n", - "variable displays the distribution of values, providing insight into the range\n", - "and frequency of values observed in the data.</td>\n", - " <td id=\"T_9e889_row29_col3\" class=\"data row29 col3\" >validmind.data_validation.TimeSeriesLinePlot, validmind.data_validation.TimeSeriesHistogram, validmind.data_validation.ACFandPACFPlot, validmind.data_validation.SeasonalDecompose, validmind.data_validation.AutoSeasonality, validmind.data_validation.AutoStationarity, validmind.data_validation.RollingStatsPlot, validmind.data_validation.AutoAR, validmind.data_validation.AutoMA</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x16a11ae00>" + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## List available test suites\n", + "After we import the ValidMind Library, we'll call [test_suites.list_suites()](https://docs.validmind.ai/validmind/validmind/test_suites.html#list_suites) to retrieve a structured list of all available test suites, that includes each suite's name, description, and associated tests:" ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import validmind as vm\n", - "\n", - 
"vm.test_suites.list_suites()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## View test suite details\n", - "\n", - "Use the [test_suites.describe_suite()](https://docs.validmind.ai/validmind/validmind/test_suites.html#describe_suite) function to retrieve information about a test suite, including its name, description, and the list of tests it contains. \n", - "\n", - "You can call `test_suites.describe_suite()` with just the test suite ID to get basic details, or pass an additional `verbose` parameter for a more comprehensive output: \n", - "\n", - "- **Test ID** - The identifier of the test suite you want to inspect.\n", - "- **Verbose** - A Boolean flag. Set `verbose=True` to return a full breakdown of the test suite." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_7cb1b th {\n", - " text-align: left;\n", - "}\n", - "#T_7cb1b_row0_col0, #T_7cb1b_row0_col1, #T_7cb1b_row0_col2, #T_7cb1b_row0_col3, #T_7cb1b_row0_col4, #T_7cb1b_row1_col0, #T_7cb1b_row1_col1, #T_7cb1b_row1_col2, #T_7cb1b_row1_col3, #T_7cb1b_row1_col4, #T_7cb1b_row2_col0, #T_7cb1b_row2_col1, #T_7cb1b_row2_col2, #T_7cb1b_row2_col3, #T_7cb1b_row2_col4, #T_7cb1b_row3_col0, #T_7cb1b_row3_col1, #T_7cb1b_row3_col2, #T_7cb1b_row3_col3, #T_7cb1b_row3_col4, #T_7cb1b_row4_col0, #T_7cb1b_row4_col1, #T_7cb1b_row4_col2, #T_7cb1b_row4_col3, #T_7cb1b_row4_col4, #T_7cb1b_row5_col0, #T_7cb1b_row5_col1, #T_7cb1b_row5_col2, #T_7cb1b_row5_col3, #T_7cb1b_row5_col4, #T_7cb1b_row6_col0, #T_7cb1b_row6_col1, #T_7cb1b_row6_col2, #T_7cb1b_row6_col3, #T_7cb1b_row6_col4, #T_7cb1b_row7_col0, #T_7cb1b_row7_col1, #T_7cb1b_row7_col2, #T_7cb1b_row7_col3, #T_7cb1b_row7_col4, #T_7cb1b_row8_col0, #T_7cb1b_row8_col1, #T_7cb1b_row8_col2, #T_7cb1b_row8_col3, #T_7cb1b_row8_col4, #T_7cb1b_row9_col0, #T_7cb1b_row9_col1, #T_7cb1b_row9_col2, 
#T_7cb1b_row9_col3, #T_7cb1b_row9_col4, #T_7cb1b_row10_col0, #T_7cb1b_row10_col1, #T_7cb1b_row10_col2, #T_7cb1b_row10_col3, #T_7cb1b_row10_col4, #T_7cb1b_row11_col0, #T_7cb1b_row11_col1, #T_7cb1b_row11_col2, #T_7cb1b_row11_col3, #T_7cb1b_row11_col4, #T_7cb1b_row12_col0, #T_7cb1b_row12_col1, #T_7cb1b_row12_col2, #T_7cb1b_row12_col3, #T_7cb1b_row12_col4, #T_7cb1b_row13_col0, #T_7cb1b_row13_col1, #T_7cb1b_row13_col2, #T_7cb1b_row13_col3, #T_7cb1b_row13_col4, #T_7cb1b_row14_col0, #T_7cb1b_row14_col1, #T_7cb1b_row14_col2, #T_7cb1b_row14_col3, #T_7cb1b_row14_col4, #T_7cb1b_row15_col0, #T_7cb1b_row15_col1, #T_7cb1b_row15_col2, #T_7cb1b_row15_col3, #T_7cb1b_row15_col4, #T_7cb1b_row16_col0, #T_7cb1b_row16_col1, #T_7cb1b_row16_col2, #T_7cb1b_row16_col3, #T_7cb1b_row16_col4, #T_7cb1b_row17_col0, #T_7cb1b_row17_col1, #T_7cb1b_row17_col2, #T_7cb1b_row17_col3, #T_7cb1b_row17_col4, #T_7cb1b_row18_col0, #T_7cb1b_row18_col1, #T_7cb1b_row18_col2, #T_7cb1b_row18_col3, #T_7cb1b_row18_col4, #T_7cb1b_row19_col0, #T_7cb1b_row19_col1, #T_7cb1b_row19_col2, #T_7cb1b_row19_col3, #T_7cb1b_row19_col4, #T_7cb1b_row20_col0, #T_7cb1b_row20_col1, #T_7cb1b_row20_col2, #T_7cb1b_row20_col3, #T_7cb1b_row20_col4, #T_7cb1b_row21_col0, #T_7cb1b_row21_col1, #T_7cb1b_row21_col2, #T_7cb1b_row21_col3, #T_7cb1b_row21_col4, #T_7cb1b_row22_col0, #T_7cb1b_row22_col1, #T_7cb1b_row22_col2, #T_7cb1b_row22_col3, #T_7cb1b_row22_col4, #T_7cb1b_row23_col0, #T_7cb1b_row23_col1, #T_7cb1b_row23_col2, #T_7cb1b_row23_col3, #T_7cb1b_row23_col4, #T_7cb1b_row24_col0, #T_7cb1b_row24_col1, #T_7cb1b_row24_col2, #T_7cb1b_row24_col3, #T_7cb1b_row24_col4, #T_7cb1b_row25_col0, #T_7cb1b_row25_col1, #T_7cb1b_row25_col2, #T_7cb1b_row25_col3, #T_7cb1b_row25_col4, #T_7cb1b_row26_col0, #T_7cb1b_row26_col1, #T_7cb1b_row26_col2, #T_7cb1b_row26_col3, #T_7cb1b_row26_col4, #T_7cb1b_row27_col0, #T_7cb1b_row27_col1, #T_7cb1b_row27_col2, #T_7cb1b_row27_col3, #T_7cb1b_row27_col4 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table 
id=\"T_7cb1b\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_7cb1b_level0_col0\" class=\"col_heading level0 col0\" >Test Suite ID</th>\n", - " <th id=\"T_7cb1b_level0_col1\" class=\"col_heading level0 col1\" >Test Suite Name</th>\n", - " <th id=\"T_7cb1b_level0_col2\" class=\"col_heading level0 col2\" >Test Suite Section</th>\n", - " <th id=\"T_7cb1b_level0_col3\" class=\"col_heading level0 col3\" >Test ID</th>\n", - " <th id=\"T_7cb1b_level0_col4\" class=\"col_heading level0 col4\" >Test Name</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row0_col0\" class=\"data row0 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row0_col1\" class=\"data row0 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row0_col2\" class=\"data row0 col2\" >tabular_dataset_description</td>\n", - " <td id=\"T_7cb1b_row0_col3\" class=\"data row0 col3\" >validmind.data_validation.DatasetDescription</td>\n", - " <td id=\"T_7cb1b_row0_col4\" class=\"data row0 col4\" >Dataset Description</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row1_col0\" class=\"data row1 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row1_col1\" class=\"data row1 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row1_col2\" class=\"data row1 col2\" >tabular_dataset_description</td>\n", - " <td id=\"T_7cb1b_row1_col3\" class=\"data row1 col3\" >validmind.data_validation.DescriptiveStatistics</td>\n", - " <td id=\"T_7cb1b_row1_col4\" class=\"data row1 col4\" >Descriptive Statistics</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row2_col0\" class=\"data row2 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row2_col1\" class=\"data row2 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row2_col2\" class=\"data row2 col2\" >tabular_dataset_description</td>\n", - " <td id=\"T_7cb1b_row2_col3\" class=\"data row2 col3\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", - " <td 
id=\"T_7cb1b_row2_col4\" class=\"data row2 col4\" >Pearson Correlation Matrix</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row3_col0\" class=\"data row3 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row3_col1\" class=\"data row3 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row3_col2\" class=\"data row3 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row3_col3\" class=\"data row3 col3\" >validmind.data_validation.ClassImbalance</td>\n", - " <td id=\"T_7cb1b_row3_col4\" class=\"data row3 col4\" >Class Imbalance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row4_col0\" class=\"data row4 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row4_col1\" class=\"data row4 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row4_col2\" class=\"data row4 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row4_col3\" class=\"data row4 col3\" >validmind.data_validation.Duplicates</td>\n", - " <td id=\"T_7cb1b_row4_col4\" class=\"data row4 col4\" >Duplicates</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row5_col0\" class=\"data row5 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row5_col1\" class=\"data row5 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row5_col2\" class=\"data row5 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row5_col3\" class=\"data row5 col3\" >validmind.data_validation.HighCardinality</td>\n", - " <td id=\"T_7cb1b_row5_col4\" class=\"data row5 col4\" >High Cardinality</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row6_col0\" class=\"data row6 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row6_col1\" class=\"data row6 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row6_col2\" class=\"data row6 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row6_col3\" class=\"data row6 col3\" >validmind.data_validation.HighPearsonCorrelation</td>\n", - " <td id=\"T_7cb1b_row6_col4\" class=\"data 
row6 col4\" >High Pearson Correlation</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row7_col0\" class=\"data row7 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row7_col1\" class=\"data row7 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row7_col2\" class=\"data row7 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row7_col3\" class=\"data row7 col3\" >validmind.data_validation.MissingValues</td>\n", - " <td id=\"T_7cb1b_row7_col4\" class=\"data row7 col4\" >Missing Values</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row8_col0\" class=\"data row8 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row8_col1\" class=\"data row8 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row8_col2\" class=\"data row8 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row8_col3\" class=\"data row8 col3\" >validmind.data_validation.Skewness</td>\n", - " <td id=\"T_7cb1b_row8_col4\" class=\"data row8 col4\" >Skewness</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row9_col0\" class=\"data row9 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row9_col1\" class=\"data row9 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row9_col2\" class=\"data row9 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row9_col3\" class=\"data row9 col3\" >validmind.data_validation.UniqueRows</td>\n", - " <td id=\"T_7cb1b_row9_col4\" class=\"data row9 col4\" >Unique Rows</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row10_col0\" class=\"data row10 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row10_col1\" class=\"data row10 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row10_col2\" class=\"data row10 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row10_col3\" class=\"data row10 col3\" >validmind.data_validation.TooManyZeroValues</td>\n", - " <td id=\"T_7cb1b_row10_col4\" class=\"data row10 col4\" >Too Many Zero Values</td>\n", - " 
</tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row11_col0\" class=\"data row11 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row11_col1\" class=\"data row11 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row11_col2\" class=\"data row11 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row11_col3\" class=\"data row11 col3\" >validmind.model_validation.ModelMetadata</td>\n", - " <td id=\"T_7cb1b_row11_col4\" class=\"data row11 col4\" >Model Metadata</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row12_col0\" class=\"data row12 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row12_col1\" class=\"data row12 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row12_col2\" class=\"data row12 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row12_col3\" class=\"data row12 col3\" >validmind.data_validation.DatasetSplit</td>\n", - " <td id=\"T_7cb1b_row12_col4\" class=\"data row12 col4\" >Dataset Split</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row13_col0\" class=\"data row13 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row13_col1\" class=\"data row13 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row13_col2\" class=\"data row13 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row13_col3\" class=\"data row13 col3\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td id=\"T_7cb1b_row13_col4\" class=\"data row13 col4\" >Confusion Matrix</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row14_col0\" class=\"data row14 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row14_col1\" class=\"data row14 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row14_col2\" class=\"data row14 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row14_col3\" class=\"data row14 col3\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", - " <td id=\"T_7cb1b_row14_col4\" class=\"data row14 col4\" >Classifier 
Performance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row15_col0\" class=\"data row15 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row15_col1\" class=\"data row15 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row15_col2\" class=\"data row15 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row15_col3\" class=\"data row15 col3\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", - " <td id=\"T_7cb1b_row15_col4\" class=\"data row15 col4\" >Permutation Feature Importance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row16_col0\" class=\"data row16 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row16_col1\" class=\"data row16 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row16_col2\" class=\"data row16 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row16_col3\" class=\"data row16 col3\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_7cb1b_row16_col4\" class=\"data row16 col4\" >Precision Recall Curve</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row17_col0\" class=\"data row17 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row17_col1\" class=\"data row17 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row17_col2\" class=\"data row17 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row17_col3\" class=\"data row17 col3\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_7cb1b_row17_col4\" class=\"data row17 col4\" >ROC Curve</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row18_col0\" class=\"data row18 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row18_col1\" class=\"data row18 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row18_col2\" class=\"data row18 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row18_col3\" class=\"data row18 col3\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", - " <td 
id=\"T_7cb1b_row18_col4\" class=\"data row18 col4\" >Population Stability Index</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row19_col0\" class=\"data row19 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row19_col1\" class=\"data row19 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row19_col2\" class=\"data row19 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row19_col3\" class=\"data row19 col3\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", - " <td id=\"T_7cb1b_row19_col4\" class=\"data row19 col4\" >SHAP Global Importance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row20_col0\" class=\"data row20 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row20_col1\" class=\"data row20 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row20_col2\" class=\"data row20 col2\" >classifier_validation</td>\n", - " <td id=\"T_7cb1b_row20_col3\" class=\"data row20 col3\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", - " <td id=\"T_7cb1b_row20_col4\" class=\"data row20 col4\" >Minimum Accuracy</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row21_col0\" class=\"data row21 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row21_col1\" class=\"data row21 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row21_col2\" class=\"data row21 col2\" >classifier_validation</td>\n", - " <td id=\"T_7cb1b_row21_col3\" class=\"data row21 col3\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", - " <td id=\"T_7cb1b_row21_col4\" class=\"data row21 col4\" >Minimum F1 Score</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row22_col0\" class=\"data row22 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row22_col1\" class=\"data row22 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row22_col2\" class=\"data row22 col2\" >classifier_validation</td>\n", - " <td id=\"T_7cb1b_row22_col3\" class=\"data row22 col3\" 
>validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", - " <td id=\"T_7cb1b_row22_col4\" class=\"data row22 col4\" >Minimum ROCAUC Score</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row23_col0\" class=\"data row23 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row23_col1\" class=\"data row23 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row23_col2\" class=\"data row23 col2\" >classifier_validation</td>\n", - " <td id=\"T_7cb1b_row23_col3\" class=\"data row23 col3\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_7cb1b_row23_col4\" class=\"data row23 col4\" >Training Test Degradation</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row24_col0\" class=\"data row24 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row24_col1\" class=\"data row24 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row24_col2\" class=\"data row24 col2\" >classifier_validation</td>\n", - " <td id=\"T_7cb1b_row24_col3\" class=\"data row24 col3\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", - " <td id=\"T_7cb1b_row24_col4\" class=\"data row24 col4\" >Models Performance Comparison</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row25_col0\" class=\"data row25 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row25_col1\" class=\"data row25 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row25_col2\" class=\"data row25 col2\" >classifier_model_diagnosis</td>\n", - " <td id=\"T_7cb1b_row25_col3\" class=\"data row25 col3\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", - " <td id=\"T_7cb1b_row25_col4\" class=\"data row25 col4\" >Overfit Diagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row26_col0\" class=\"data row26 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row26_col1\" class=\"data row26 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row26_col2\" class=\"data row26 col2\" 
>classifier_model_diagnosis</td>\n", - " <td id=\"T_7cb1b_row26_col3\" class=\"data row26 col3\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", - " <td id=\"T_7cb1b_row26_col4\" class=\"data row26 col4\" >Weakspots Diagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row27_col0\" class=\"data row27 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row27_col1\" class=\"data row27 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row27_col2\" class=\"data row27 col2\" >classifier_model_diagnosis</td>\n", - " <td id=\"T_7cb1b_row27_col3\" class=\"data row27 col3\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " <td id=\"T_7cb1b_row27_col4\" class=\"data row27 col4\" >Robustness Diagnosis</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "code", + "metadata": {}, + "source": [ + "import validmind as vm\n", + "\n", + "vm.test_suites.list_suites()" ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x16a167fa0>" + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_9e889 th {\n", + " text-align: left;\n", + "}\n", + "#T_9e889_row0_col0, #T_9e889_row0_col1, #T_9e889_row0_col2, #T_9e889_row0_col3, #T_9e889_row1_col0, #T_9e889_row1_col1, #T_9e889_row1_col2, #T_9e889_row1_col3, #T_9e889_row2_col0, #T_9e889_row2_col1, #T_9e889_row2_col2, #T_9e889_row2_col3, #T_9e889_row3_col0, #T_9e889_row3_col1, #T_9e889_row3_col2, #T_9e889_row3_col3, #T_9e889_row4_col0, #T_9e889_row4_col1, #T_9e889_row4_col2, #T_9e889_row4_col3, #T_9e889_row5_col0, #T_9e889_row5_col1, #T_9e889_row5_col2, #T_9e889_row5_col3, #T_9e889_row6_col0, #T_9e889_row6_col1, #T_9e889_row6_col2, #T_9e889_row6_col3, #T_9e889_row7_col0, #T_9e889_row7_col1, #T_9e889_row7_col2, #T_9e889_row7_col3, #T_9e889_row8_col0, #T_9e889_row8_col1, #T_9e889_row8_col2, #T_9e889_row8_col3, #T_9e889_row9_col0, #T_9e889_row9_col1, #T_9e889_row9_col2, 
#T_9e889_row9_col3, #T_9e889_row10_col0, #T_9e889_row10_col1, #T_9e889_row10_col2, #T_9e889_row10_col3, #T_9e889_row11_col0, #T_9e889_row11_col1, #T_9e889_row11_col2, #T_9e889_row11_col3, #T_9e889_row12_col0, #T_9e889_row12_col1, #T_9e889_row12_col2, #T_9e889_row12_col3, #T_9e889_row13_col0, #T_9e889_row13_col1, #T_9e889_row13_col2, #T_9e889_row13_col3, #T_9e889_row14_col0, #T_9e889_row14_col1, #T_9e889_row14_col2, #T_9e889_row14_col3, #T_9e889_row15_col0, #T_9e889_row15_col1, #T_9e889_row15_col2, #T_9e889_row15_col3, #T_9e889_row16_col0, #T_9e889_row16_col1, #T_9e889_row16_col2, #T_9e889_row16_col3, #T_9e889_row17_col0, #T_9e889_row17_col1, #T_9e889_row17_col2, #T_9e889_row17_col3, #T_9e889_row18_col0, #T_9e889_row18_col1, #T_9e889_row18_col2, #T_9e889_row18_col3, #T_9e889_row19_col0, #T_9e889_row19_col1, #T_9e889_row19_col2, #T_9e889_row19_col3, #T_9e889_row20_col0, #T_9e889_row20_col1, #T_9e889_row20_col2, #T_9e889_row20_col3, #T_9e889_row21_col0, #T_9e889_row21_col1, #T_9e889_row21_col2, #T_9e889_row21_col3, #T_9e889_row22_col0, #T_9e889_row22_col1, #T_9e889_row22_col2, #T_9e889_row22_col3, #T_9e889_row23_col0, #T_9e889_row23_col1, #T_9e889_row23_col2, #T_9e889_row23_col3, #T_9e889_row24_col0, #T_9e889_row24_col1, #T_9e889_row24_col2, #T_9e889_row24_col3, #T_9e889_row25_col0, #T_9e889_row25_col1, #T_9e889_row25_col2, #T_9e889_row25_col3, #T_9e889_row26_col0, #T_9e889_row26_col1, #T_9e889_row26_col2, #T_9e889_row26_col3, #T_9e889_row27_col0, #T_9e889_row27_col1, #T_9e889_row27_col2, #T_9e889_row27_col3, #T_9e889_row28_col0, #T_9e889_row28_col1, #T_9e889_row28_col2, #T_9e889_row28_col3, #T_9e889_row29_col0, #T_9e889_row29_col1, #T_9e889_row29_col2, #T_9e889_row29_col3 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_9e889\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_9e889_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_9e889_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th 
id=\"T_9e889_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_9e889_level0_col3\" class=\"col_heading level0 col3\" >Tests</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_9e889_row0_col0\" class=\"data row0 col0\" >classifier_model_diagnosis</td>\n", + " <td id=\"T_9e889_row0_col1\" class=\"data row0 col1\" >ClassifierDiagnosis</td>\n", + " <td id=\"T_9e889_row0_col2\" class=\"data row0 col2\" >Test suite for sklearn classifier model diagnosis tests</td>\n", + " <td id=\"T_9e889_row0_col3\" class=\"data row0 col3\" >validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row1_col0\" class=\"data row1 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_9e889_row1_col1\" class=\"data row1 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_9e889_row1_col2\" class=\"data row1 col2\" >Full test suite for binary classification models.</td>\n", + " <td id=\"T_9e889_row1_col3\" class=\"data row1 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, 
validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row2_col0\" class=\"data row2 col0\" >classifier_metrics</td>\n", + " <td id=\"T_9e889_row2_col1\" class=\"data row2 col1\" >ClassifierMetrics</td>\n", + " <td id=\"T_9e889_row2_col2\" class=\"data row2 col2\" >Test suite for sklearn classifier metrics</td>\n", + " <td id=\"T_9e889_row2_col3\" class=\"data row2 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row3_col0\" class=\"data row3 col0\" >classifier_model_validation</td>\n", + " <td id=\"T_9e889_row3_col1\" class=\"data row3 col1\" >ClassifierModelValidation</td>\n", + " <td id=\"T_9e889_row3_col2\" class=\"data row3 col2\" >Test suite for binary classification models.</td>\n", + " <td id=\"T_9e889_row3_col3\" class=\"data row3 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, 
validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row4_col0\" class=\"data row4 col0\" >classifier_validation</td>\n", + " <td id=\"T_9e889_row4_col1\" class=\"data row4 col1\" >ClassifierPerformance</td>\n", + " <td id=\"T_9e889_row4_col2\" class=\"data row4 col2\" >Test suite for sklearn classifier models</td>\n", + " <td id=\"T_9e889_row4_col3\" class=\"data row4 col3\" >validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row5_col0\" class=\"data row5 col0\" >cluster_full_suite</td>\n", + " <td id=\"T_9e889_row5_col1\" class=\"data row5 col1\" >ClusterFullSuite</td>\n", + " <td id=\"T_9e889_row5_col2\" class=\"data row5 col2\" >Full test suite for clustering models.</td>\n", + " <td id=\"T_9e889_row5_col3\" class=\"data row5 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.HomogeneityScore, 
validmind.model_validation.sklearn.CompletenessScore, validmind.model_validation.sklearn.VMeasure, validmind.model_validation.sklearn.AdjustedRandIndex, validmind.model_validation.sklearn.AdjustedMutualInformation, validmind.model_validation.sklearn.FowlkesMallowsScore, validmind.model_validation.sklearn.ClusterPerformanceMetrics, validmind.model_validation.sklearn.ClusterCosineSimilarity, validmind.model_validation.sklearn.SilhouettePlot, validmind.model_validation.ClusterSizeDistribution, validmind.model_validation.sklearn.HyperParametersTuning, validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row6_col0\" class=\"data row6 col0\" >cluster_metrics</td>\n", + " <td id=\"T_9e889_row6_col1\" class=\"data row6 col1\" >ClusterMetrics</td>\n", + " <td id=\"T_9e889_row6_col2\" class=\"data row6 col2\" >Test suite for sklearn clustering metrics</td>\n", + " <td id=\"T_9e889_row6_col3\" class=\"data row6 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.HomogeneityScore, validmind.model_validation.sklearn.CompletenessScore, validmind.model_validation.sklearn.VMeasure, validmind.model_validation.sklearn.AdjustedRandIndex, validmind.model_validation.sklearn.AdjustedMutualInformation, validmind.model_validation.sklearn.FowlkesMallowsScore, validmind.model_validation.sklearn.ClusterPerformanceMetrics, validmind.model_validation.sklearn.ClusterCosineSimilarity, validmind.model_validation.sklearn.SilhouettePlot</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row7_col0\" class=\"data row7 col0\" >cluster_performance</td>\n", + " <td id=\"T_9e889_row7_col1\" class=\"data row7 col1\" >ClusterPerformance</td>\n", + " <td id=\"T_9e889_row7_col2\" class=\"data row7 col2\" >Test suite for sklearn cluster performance</td>\n", + " <td id=\"T_9e889_row7_col3\" class=\"data row7 col3\" 
>validmind.model_validation.ClusterSizeDistribution</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row8_col0\" class=\"data row8 col0\" >embeddings_full_suite</td>\n", + " <td id=\"T_9e889_row8_col1\" class=\"data row8 col1\" >EmbeddingsFullSuite</td>\n", + " <td id=\"T_9e889_row8_col2\" class=\"data row8 col2\" >Full test suite for embeddings models.</td>\n", + " <td id=\"T_9e889_row8_col3\" class=\"data row8 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.embeddings.DescriptiveAnalytics, validmind.model_validation.embeddings.CosineSimilarityDistribution, validmind.model_validation.embeddings.ClusterDistribution, validmind.model_validation.embeddings.EmbeddingsVisualization2D, validmind.model_validation.embeddings.StabilityAnalysisRandomNoise, validmind.model_validation.embeddings.StabilityAnalysisSynonyms, validmind.model_validation.embeddings.StabilityAnalysisKeyword, validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row9_col0\" class=\"data row9 col0\" >embeddings_metrics</td>\n", + " <td id=\"T_9e889_row9_col1\" class=\"data row9 col1\" >EmbeddingsMetrics</td>\n", + " <td id=\"T_9e889_row9_col2\" class=\"data row9 col2\" >Test suite for embeddings metrics</td>\n", + " <td id=\"T_9e889_row9_col3\" class=\"data row9 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.embeddings.DescriptiveAnalytics, validmind.model_validation.embeddings.CosineSimilarityDistribution, validmind.model_validation.embeddings.ClusterDistribution, validmind.model_validation.embeddings.EmbeddingsVisualization2D</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row10_col0\" class=\"data row10 col0\" >embeddings_model_performance</td>\n", + " <td id=\"T_9e889_row10_col1\" class=\"data row10 col1\" >EmbeddingsPerformance</td>\n", + " <td id=\"T_9e889_row10_col2\" 
class=\"data row10 col2\" >Test suite for embeddings model performance</td>\n", + " <td id=\"T_9e889_row10_col3\" class=\"data row10 col3\" >validmind.model_validation.embeddings.StabilityAnalysisRandomNoise, validmind.model_validation.embeddings.StabilityAnalysisSynonyms, validmind.model_validation.embeddings.StabilityAnalysisKeyword, validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row11_col0\" class=\"data row11 col0\" >hyper_parameters_optimization</td>\n", + " <td id=\"T_9e889_row11_col1\" class=\"data row11 col1\" >KmeansParametersOptimization</td>\n", + " <td id=\"T_9e889_row11_col2\" class=\"data row11 col2\" >Test suite for sklearn hyperparameters optimization</td>\n", + " <td id=\"T_9e889_row11_col3\" class=\"data row11 col3\" >validmind.model_validation.sklearn.HyperParametersTuning, validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row12_col0\" class=\"data row12 col0\" >llm_classifier_full_suite</td>\n", + " <td id=\"T_9e889_row12_col1\" class=\"data row12 col1\" >LLMClassifierFullSuite</td>\n", + " <td id=\"T_9e889_row12_col2\" class=\"data row12 col2\" >Full test suite for LLM classification models.</td>\n", + " <td id=\"T_9e889_row12_col3\" class=\"data row12 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, 
validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis, validmind.prompt_validation.Bias, validmind.prompt_validation.Clarity, validmind.prompt_validation.Conciseness, validmind.prompt_validation.Delimitation, validmind.prompt_validation.NegativeInstruction, validmind.prompt_validation.Robustness, validmind.prompt_validation.Specificity</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row13_col0\" class=\"data row13 col0\" >prompt_validation</td>\n", + " <td id=\"T_9e889_row13_col1\" class=\"data row13 col1\" >PromptValidation</td>\n", + " <td id=\"T_9e889_row13_col2\" class=\"data row13 col2\" >Test suite for prompt validation</td>\n", + " <td id=\"T_9e889_row13_col3\" class=\"data row13 col3\" >validmind.prompt_validation.Bias, validmind.prompt_validation.Clarity, validmind.prompt_validation.Conciseness, validmind.prompt_validation.Delimitation, validmind.prompt_validation.NegativeInstruction, validmind.prompt_validation.Robustness, validmind.prompt_validation.Specificity</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row14_col0\" class=\"data row14 col0\" >nlp_classifier_full_suite</td>\n", + " <td id=\"T_9e889_row14_col1\" class=\"data row14 col1\" >NLPClassifierFullSuite</td>\n", + " <td id=\"T_9e889_row14_col2\" class=\"data row14 col2\" >Full test suite for NLP classification models.</td>\n", + " <td id=\"T_9e889_row14_col3\" class=\"data row14 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, 
validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row15_col0\" class=\"data row15 col0\" >regression_metrics</td>\n", + " <td id=\"T_9e889_row15_col1\" class=\"data row15 col1\" >RegressionMetrics</td>\n", + " <td id=\"T_9e889_row15_col2\" class=\"data row15 col2\" >Test suite for performance metrics of regression metrics</td>\n", + " <td id=\"T_9e889_row15_col3\" class=\"data row15 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row16_col0\" class=\"data row16 col0\" >regression_model_description</td>\n", + " <td id=\"T_9e889_row16_col1\" class=\"data row16 col1\" >RegressionModelDescription</td>\n", + " <td id=\"T_9e889_row16_col2\" class=\"data row16 col2\" >Test suite for performance metric of regression model of 
statsmodels library</td>\n", + " <td id=\"T_9e889_row16_col3\" class=\"data row16 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row17_col0\" class=\"data row17 col0\" >regression_models_evaluation</td>\n", + " <td id=\"T_9e889_row17_col1\" class=\"data row17 col1\" >RegressionModelsEvaluation</td>\n", + " <td id=\"T_9e889_row17_col2\" class=\"data row17 col2\" >Test suite for metrics comparison of regression model of statsmodels library</td>\n", + " <td id=\"T_9e889_row17_col3\" class=\"data row17 col3\" >validmind.model_validation.statsmodels.RegressionModelCoeffs, validmind.model_validation.sklearn.RegressionModelsPerformanceComparison</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row18_col0\" class=\"data row18 col0\" >regression_full_suite</td>\n", + " <td id=\"T_9e889_row18_col1\" class=\"data row18 col1\" >RegressionFullSuite</td>\n", + " <td id=\"T_9e889_row18_col2\" class=\"data row18 col2\" >Full test suite for regression models.</td>\n", + " <td id=\"T_9e889_row18_col3\" class=\"data row18 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues, validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.RegressionErrors, validmind.model_validation.sklearn.RegressionR2Square</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row19_col0\" class=\"data row19 col0\" >regression_performance</td>\n", + 
" <td id=\"T_9e889_row19_col1\" class=\"data row19 col1\" >RegressionPerformance</td>\n", + " <td id=\"T_9e889_row19_col2\" class=\"data row19 col2\" >Test suite for regression model performance</td>\n", + " <td id=\"T_9e889_row19_col3\" class=\"data row19 col3\" >validmind.model_validation.sklearn.RegressionErrors, validmind.model_validation.sklearn.RegressionR2Square</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row20_col0\" class=\"data row20 col0\" >summarization_metrics</td>\n", + " <td id=\"T_9e889_row20_col1\" class=\"data row20 col1\" >SummarizationMetrics</td>\n", + " <td id=\"T_9e889_row20_col2\" class=\"data row20 col2\" >Test suite for Summarization metrics</td>\n", + " <td id=\"T_9e889_row20_col3\" class=\"data row20 col3\" >validmind.model_validation.TokenDisparity, validmind.model_validation.BleuScore, validmind.model_validation.BertScore, validmind.model_validation.ContextualRecall</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row21_col0\" class=\"data row21 col0\" >tabular_dataset</td>\n", + " <td id=\"T_9e889_row21_col1\" class=\"data row21 col1\" >TabularDataset</td>\n", + " <td id=\"T_9e889_row21_col2\" class=\"data row21 col2\" >Test suite for tabular datasets.</td>\n", + " <td id=\"T_9e889_row21_col3\" class=\"data row21 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row22_col0\" class=\"data row22 col0\" >tabular_dataset_description</td>\n", + " <td id=\"T_9e889_row22_col1\" class=\"data row22 col1\" 
>TabularDatasetDescription</td>\n", + " <td id=\"T_9e889_row22_col2\" class=\"data row22 col2\" >Test suite to extract metadata and descriptive\n", + "statistics from a tabular dataset</td>\n", + " <td id=\"T_9e889_row22_col3\" class=\"data row22 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row23_col0\" class=\"data row23 col0\" >tabular_data_quality</td>\n", + " <td id=\"T_9e889_row23_col1\" class=\"data row23 col1\" >TabularDataQuality</td>\n", + " <td id=\"T_9e889_row23_col2\" class=\"data row23 col2\" >Test suite for data quality on tabular datasets</td>\n", + " <td id=\"T_9e889_row23_col3\" class=\"data row23 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row24_col0\" class=\"data row24 col0\" >text_data_quality</td>\n", + " <td id=\"T_9e889_row24_col1\" class=\"data row24 col1\" >TextDataQuality</td>\n", + " <td id=\"T_9e889_row24_col2\" class=\"data row24 col2\" >Test suite for data quality on text data</td>\n", + " <td id=\"T_9e889_row24_col3\" class=\"data row24 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row25_col0\" class=\"data row25 col0\" >time_series_data_quality</td>\n", + " <td id=\"T_9e889_row25_col1\" class=\"data row25 col1\" >TimeSeriesDataQuality</td>\n", + " <td 
id=\"T_9e889_row25_col2\" class=\"data row25 col2\" >Test suite for data quality on time series datasets</td>\n", + " <td id=\"T_9e889_row25_col3\" class=\"data row25 col3\" >validmind.data_validation.TimeSeriesOutliers, validmind.data_validation.TimeSeriesMissingValues, validmind.data_validation.TimeSeriesFrequency</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row26_col0\" class=\"data row26 col0\" >time_series_dataset</td>\n", + " <td id=\"T_9e889_row26_col1\" class=\"data row26 col1\" >TimeSeriesDataset</td>\n", + " <td id=\"T_9e889_row26_col2\" class=\"data row26 col2\" >Test suite for time series datasets.</td>\n", + " <td id=\"T_9e889_row26_col3\" class=\"data row26 col3\" >validmind.data_validation.TimeSeriesOutliers, validmind.data_validation.TimeSeriesMissingValues, validmind.data_validation.TimeSeriesFrequency, validmind.data_validation.TimeSeriesLinePlot, validmind.data_validation.TimeSeriesHistogram, validmind.data_validation.ACFandPACFPlot, validmind.data_validation.SeasonalDecompose, validmind.data_validation.AutoSeasonality, validmind.data_validation.AutoStationarity, validmind.data_validation.RollingStatsPlot, validmind.data_validation.AutoAR, validmind.data_validation.AutoMA, validmind.data_validation.ScatterPlot, validmind.data_validation.LaggedCorrelationHeatmap, validmind.data_validation.EngleGrangerCoint, validmind.data_validation.SpreadPlot</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row27_col0\" class=\"data row27 col0\" >time_series_model_validation</td>\n", + " <td id=\"T_9e889_row27_col1\" class=\"data row27 col1\" >TimeSeriesModelValidation</td>\n", + " <td id=\"T_9e889_row27_col2\" class=\"data row27 col2\" >Test suite for time series model validation.</td>\n", + " <td id=\"T_9e889_row27_col3\" class=\"data row27 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.statsmodels.RegressionModelCoeffs, 
validmind.model_validation.sklearn.RegressionModelsPerformanceComparison</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row28_col0\" class=\"data row28 col0\" >time_series_multivariate</td>\n", + " <td id=\"T_9e889_row28_col1\" class=\"data row28 col1\" >TimeSeriesMultivariate</td>\n", + " <td id=\"T_9e889_row28_col2\" class=\"data row28 col2\" >This test suite provides a preliminary understanding of the features\n", + "and relationship in multivariate dataset. It presents various\n", + "multivariate visualizations that can help identify patterns, trends,\n", + "and relationships between pairs of variables. The visualizations are\n", + "designed to explore the relationships between multiple features\n", + "simultaneously. They allow you to quickly identify any patterns or\n", + "trends in the data, as well as any potential outliers or anomalies.\n", + "The individual feature distribution can also be explored to provide\n", + "insight into the range and frequency of values observed in the data.\n", + "This multivariate analysis test suite aims to provide an overview of\n", + "the data structure and guide further exploration and modeling.</td>\n", + " <td id=\"T_9e889_row28_col3\" class=\"data row28 col3\" >validmind.data_validation.ScatterPlot, validmind.data_validation.LaggedCorrelationHeatmap, validmind.data_validation.EngleGrangerCoint, validmind.data_validation.SpreadPlot</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row29_col0\" class=\"data row29 col0\" >time_series_univariate</td>\n", + " <td id=\"T_9e889_row29_col1\" class=\"data row29 col1\" >TimeSeriesUnivariate</td>\n", + " <td id=\"T_9e889_row29_col2\" class=\"data row29 col2\" >This test suite provides a preliminary understanding of the target variable(s)\n", + "used in the time series dataset. 
It visualizations that present the raw time\n", + "series data and a histogram of the target variable(s).\n", + "\n", + "The raw time series data provides a visual inspection of the target variable's\n", + "behavior over time. This helps to identify any patterns or trends in the data,\n", + "as well as any potential outliers or anomalies. The histogram of the target\n", + "variable displays the distribution of values, providing insight into the range\n", + "and frequency of values observed in the data.</td>\n", + " <td id=\"T_9e889_row29_col3\" class=\"data row29 col3\" >validmind.data_validation.TimeSeriesLinePlot, validmind.data_validation.TimeSeriesHistogram, validmind.data_validation.ACFandPACFPlot, validmind.data_validation.SeasonalDecompose, validmind.data_validation.AutoSeasonality, validmind.data_validation.AutoStationarity, validmind.data_validation.RollingStatsPlot, validmind.data_validation.AutoAR, validmind.data_validation.AutoMA</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x16a11ae00>" + ] + } + } ] - }, - "execution_count": 3, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "vm.test_suites.describe_suite(\"classifier_full_suite\", verbose=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### View test details\n", - "\n", - "To inspect a specific test in a suite, pass the name of the test to [tests.describe_test()](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to get detailed information about the test such as its purpose, strengths and limitations:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, { - "data": { - "text/html": [ - "\n", - " <div class=\"vm-accordion\" id=\"accordion-c38a3af7\">\n", - " \n", - " <div class=\"vm-accordion-item\">\n", - " <div class=\"vm-accordion-header\"\n", - " 
onclick=\"toggleAccordionItem('accordion-c38a3af7-item-0')\"\n", - " style=\"cursor: pointer; padding: 10px; background-color: #f8f9fa; border: 1px solid #dee2e6; font-weight: bold;\">\n", - " <span class=\"vm-accordion-toggle\" id=\"accordion-c38a3af7-item-0-toggle\">▶</span>\n", - " Test: Descriptive Statistics ('validmind.data_validation.DescriptiveStatistics')\n", - " </div>\n", - " <div class=\"vm-accordion-content\"\n", - " id=\"accordion-c38a3af7-item-0\"\n", - " style=\"display: none; padding: 15px; border: 1px solid #dee2e6; border-top: none;\">\n", - " \n", - "<div>\n", - " <h2>Descriptive Statistics</h2>\n", - " <div style=\"border: 1px solid #ddd; border-radius: 4px; padding: 10px; margin: 10px 0;\">\n", - " <p>Performs a detailed descriptive statistical analysis of both numerical and categorical data within a model's\n", - "dataset.</p>\n", - "<h3>Purpose</h3>\n", - "<p>The purpose of the Descriptive Statistics metric is to provide a comprehensive summary of both numerical and\n", - "categorical data within a dataset. This involves statistics such as count, mean, standard deviation, minimum and\n", - "maximum values for numerical data. For categorical data, it calculates the count, number of unique values, most\n", - "common value and its frequency, and the proportion of the most frequent value relative to the total. The goal is to\n", - "visualize the overall distribution of the variables in the dataset, aiding in understanding the model's behavior\n", - "and predicting its performance.</p>\n", - "<h3>Test Mechanism</h3>\n", - "<p>The testing mechanism utilizes two in-built functions of pandas dataframes: <code>describe()</code> for numerical fields and\n", - "<code>value_counts()</code> for categorical fields. The <code>describe()</code> function pulls out several summary statistics, while\n", - "<code>value_counts()</code> accounts for unique values. 
The resulting data is formatted into two distinct tables, one for\n", - "numerical and another for categorical variable summaries. These tables provide a clear summary of the main\n", - "characteristics of the variables, which can be instrumental in assessing the model's performance.</p>\n", - "<h3>Signs of High Risk</h3>\n", - "<ul>\n", - "<li>Skewed data or significant outliers can represent high risk. For numerical data, this may be reflected via a\n", - "significant difference between the mean and median (50% percentile).</li>\n", - "<li>For categorical data, a lack of diversity (low count of unique values), or overdominance of a single category\n", - "(high frequency of the top value) can indicate high risk.</li>\n", - "</ul>\n", - "<h3>Strengths</h3>\n", - "<ul>\n", - "<li>Provides a comprehensive summary of the dataset, shedding light on the distribution and characteristics of the\n", - "variables under consideration.</li>\n", - "<li>It is a versatile and robust method, applicable to both numerical and categorical data.</li>\n", - "<li>Helps highlight crucial anomalies such as outliers, extreme skewness, or lack of diversity, which are vital in\n", - "understanding model behavior during testing and validation.</li>\n", - "</ul>\n", - "<h3>Limitations</h3>\n", - "<ul>\n", - "<li>While this metric offers a high-level overview of the data, it may fail to detect subtle correlations or complex\n", - "patterns.</li>\n", - "<li>Does not offer any insights on the relationship between variables.</li>\n", - "<li>Alone, descriptive statistics cannot be used to infer properties about future unseen data.</li>\n", - "<li>Should be used in conjunction with other statistical tests to provide a comprehensive understanding of the\n", - "model's data.</li>\n", - "</ul>\n", - "\n", - " </div>\n", - "</div>\n", - "\n", - "<h4 class=\"vm_required_context\">\n", - " Required Inputs: <span style=\"font-size: 13px\"><i>dataset</i></span>\n", - "</h4>\n", - "\n", - "<div 
style=\"display: none;\">\n", - " <h4>Parameters:</h4>\n", - " <table class=\"vm_params_table\" style=\"display: none;\">\n", - " <tr>\n", - " <th>Parameter</th>\n", - " <th>Default Value</th>\n", - " </tr>\n", - " \n", - " </table>\n", - "</div>\n", - "\n", - "<div class=\"unset\">\n", - " <h3>How to Run:</h3>\n", - "\n", - " <button\n", - " onclick=\"(() => {e = document.getElementById('expandable_instructions_7e3e1a19-00f2-4e0b-95b6-720bc7e3ba8b'); e.style.display === 'none' ? e.style.display = 'block' : e.style.display = 'none'})()\"\n", - " >Show/Hide Instructions</button>\n", - "\n", - " <div id=\"expandable_instructions_7e3e1a19-00f2-4e0b-95b6-720bc7e3ba8b\" style=\"display: block;\">\n", - " <h4>Code:</h4>\n", - " <pre>\n", - " <code class='language-python'>\n", - "import validmind as vm\n", - "\n", - "# inputs dictionary maps your inputs to the expected input names\n", - "# keys are the expected input names and values are the actual inputs\n", - "# values may be string input_ids or the actual VMDataset or VMModel objects\n", - "inputs = {\n", - " \"dataset\": \"my_vm_dataset\"\n", - "}\n", - "params = {}\n", - "\n", - "# to run and view the result of this test, run the following code:\n", - "result = vm.tests.run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics\", inputs=inputs, params=params\n", - ")\n", - "\n", - "# To see the result of the test, ensure that you have called `vm.init()` and then run:\n", - "result.log()</code>\n", - " </pre>\n", - " </div>\n", - "</div>\n", - "\n", - "<style>\n", - "h5.vm_required_context {\n", - " margin-top: 25px;\n", - "}\n", - "table.vm_params_table {\n", - " margin-top: 20px;\n", - " width: 350px;\n", - " border-collapse: collapse;\n", - " border-color: --jp-border-color0;\n", - "}\n", - "table.vm_params_table td, table.vm_params_table th {\n", - " text-align: right;\n", - "}\n", - "table.vm_params_table td:first-child, table.vm_params_table th:first-child {\n", - " text-align: left;\n", - "}\n", - 
"table.vm_params_table th {\n", - " background-color: --jp-content-color0;\n", - " font-weight: bold;\n", - " font-size: 14px !important;\n", - "}\n", - "table.vm_params_table tr:nth-child(even) {\n", - " background-color: --jp-layout-color1;\n", - "}\n", - "table.vm_params_table tr:nth-child(odd) {\n", - " background-color: --jp-layout-color2;\n", - "}\n", - "table.vm_params_table tr:hover {\n", - " background-color: --jp-layout-color3;\n", - "}\n", - "table.vm_params_table td, table.vm_params_table th {\n", - " padding: 5px;\n", - " border: .8px solid --jp-border-color0;\n", - "}\n", - "</style>\n", - "\n", - " </div>\n", - " </div>\n", - " \n", - " </div>\n", - "\n", - " <script>\n", - " function toggleAccordionItem(itemId) {\n", - " const content = document.getElementById(itemId);\n", - " const toggle = document.getElementById(itemId + '-toggle');\n", - "\n", - " if (content.style.display === 'none' || content.style.display === '') {\n", - " content.style.display = 'block';\n", - " toggle.innerHTML = '▼';\n", - " } else {\n", - " content.style.display = 'none';\n", - " toggle.innerHTML = '▶';\n", - " }\n", - " }\n", - " </script>\n", - " " + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## View test suite details\n", + "\n", + "Use the [test_suites.describe_suite()](https://docs.validmind.ai/validmind/validmind/test_suites.html#describe_suite) function to retrieve information about a test suite, including its name, description, and the list of tests it contains. \n", + "\n", + "You can call `test_suites.describe_suite()` with just the test suite ID to get basic details, or pass an additional `verbose` parameter for a more comprehensive output: \n", + "\n", + "- **Test suite ID** - The identifier of the test suite you want to inspect.\n", + "- **Verbose** - A Boolean flag. Set `verbose=True` to return a full breakdown of the test suite."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.test_suites.describe_suite(\"classifier_full_suite\", verbose=True)" ], - "text/plain": [ - "<IPython.core.display.HTML object>" + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_7cb1b th {\n", + " text-align: left;\n", + "}\n", + "#T_7cb1b_row0_col0, #T_7cb1b_row0_col1, #T_7cb1b_row0_col2, #T_7cb1b_row0_col3, #T_7cb1b_row0_col4, #T_7cb1b_row1_col0, #T_7cb1b_row1_col1, #T_7cb1b_row1_col2, #T_7cb1b_row1_col3, #T_7cb1b_row1_col4, #T_7cb1b_row2_col0, #T_7cb1b_row2_col1, #T_7cb1b_row2_col2, #T_7cb1b_row2_col3, #T_7cb1b_row2_col4, #T_7cb1b_row3_col0, #T_7cb1b_row3_col1, #T_7cb1b_row3_col2, #T_7cb1b_row3_col3, #T_7cb1b_row3_col4, #T_7cb1b_row4_col0, #T_7cb1b_row4_col1, #T_7cb1b_row4_col2, #T_7cb1b_row4_col3, #T_7cb1b_row4_col4, #T_7cb1b_row5_col0, #T_7cb1b_row5_col1, #T_7cb1b_row5_col2, #T_7cb1b_row5_col3, #T_7cb1b_row5_col4, #T_7cb1b_row6_col0, #T_7cb1b_row6_col1, #T_7cb1b_row6_col2, #T_7cb1b_row6_col3, #T_7cb1b_row6_col4, #T_7cb1b_row7_col0, #T_7cb1b_row7_col1, #T_7cb1b_row7_col2, #T_7cb1b_row7_col3, #T_7cb1b_row7_col4, #T_7cb1b_row8_col0, #T_7cb1b_row8_col1, #T_7cb1b_row8_col2, #T_7cb1b_row8_col3, #T_7cb1b_row8_col4, #T_7cb1b_row9_col0, #T_7cb1b_row9_col1, #T_7cb1b_row9_col2, #T_7cb1b_row9_col3, #T_7cb1b_row9_col4, #T_7cb1b_row10_col0, #T_7cb1b_row10_col1, #T_7cb1b_row10_col2, #T_7cb1b_row10_col3, #T_7cb1b_row10_col4, #T_7cb1b_row11_col0, #T_7cb1b_row11_col1, #T_7cb1b_row11_col2, #T_7cb1b_row11_col3, #T_7cb1b_row11_col4, #T_7cb1b_row12_col0, #T_7cb1b_row12_col1, #T_7cb1b_row12_col2, #T_7cb1b_row12_col3, #T_7cb1b_row12_col4, #T_7cb1b_row13_col0, #T_7cb1b_row13_col1, #T_7cb1b_row13_col2, #T_7cb1b_row13_col3, #T_7cb1b_row13_col4, #T_7cb1b_row14_col0, #T_7cb1b_row14_col1, #T_7cb1b_row14_col2, #T_7cb1b_row14_col3, #T_7cb1b_row14_col4, #T_7cb1b_row15_col0, #T_7cb1b_row15_col1, #T_7cb1b_row15_col2, 
#T_7cb1b_row15_col3, #T_7cb1b_row15_col4, #T_7cb1b_row16_col0, #T_7cb1b_row16_col1, #T_7cb1b_row16_col2, #T_7cb1b_row16_col3, #T_7cb1b_row16_col4, #T_7cb1b_row17_col0, #T_7cb1b_row17_col1, #T_7cb1b_row17_col2, #T_7cb1b_row17_col3, #T_7cb1b_row17_col4, #T_7cb1b_row18_col0, #T_7cb1b_row18_col1, #T_7cb1b_row18_col2, #T_7cb1b_row18_col3, #T_7cb1b_row18_col4, #T_7cb1b_row19_col0, #T_7cb1b_row19_col1, #T_7cb1b_row19_col2, #T_7cb1b_row19_col3, #T_7cb1b_row19_col4, #T_7cb1b_row20_col0, #T_7cb1b_row20_col1, #T_7cb1b_row20_col2, #T_7cb1b_row20_col3, #T_7cb1b_row20_col4, #T_7cb1b_row21_col0, #T_7cb1b_row21_col1, #T_7cb1b_row21_col2, #T_7cb1b_row21_col3, #T_7cb1b_row21_col4, #T_7cb1b_row22_col0, #T_7cb1b_row22_col1, #T_7cb1b_row22_col2, #T_7cb1b_row22_col3, #T_7cb1b_row22_col4, #T_7cb1b_row23_col0, #T_7cb1b_row23_col1, #T_7cb1b_row23_col2, #T_7cb1b_row23_col3, #T_7cb1b_row23_col4, #T_7cb1b_row24_col0, #T_7cb1b_row24_col1, #T_7cb1b_row24_col2, #T_7cb1b_row24_col3, #T_7cb1b_row24_col4, #T_7cb1b_row25_col0, #T_7cb1b_row25_col1, #T_7cb1b_row25_col2, #T_7cb1b_row25_col3, #T_7cb1b_row25_col4, #T_7cb1b_row26_col0, #T_7cb1b_row26_col1, #T_7cb1b_row26_col2, #T_7cb1b_row26_col3, #T_7cb1b_row26_col4, #T_7cb1b_row27_col0, #T_7cb1b_row27_col1, #T_7cb1b_row27_col2, #T_7cb1b_row27_col3, #T_7cb1b_row27_col4 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_7cb1b\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_7cb1b_level0_col0\" class=\"col_heading level0 col0\" >Test Suite ID</th>\n", + " <th id=\"T_7cb1b_level0_col1\" class=\"col_heading level0 col1\" >Test Suite Name</th>\n", + " <th id=\"T_7cb1b_level0_col2\" class=\"col_heading level0 col2\" >Test Suite Section</th>\n", + " <th id=\"T_7cb1b_level0_col3\" class=\"col_heading level0 col3\" >Test ID</th>\n", + " <th id=\"T_7cb1b_level0_col4\" class=\"col_heading level0 col4\" >Test Name</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row0_col0\" class=\"data row0 col0\" 
>classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row0_col1\" class=\"data row0 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row0_col2\" class=\"data row0 col2\" >tabular_dataset_description</td>\n", + " <td id=\"T_7cb1b_row0_col3\" class=\"data row0 col3\" >validmind.data_validation.DatasetDescription</td>\n", + " <td id=\"T_7cb1b_row0_col4\" class=\"data row0 col4\" >Dataset Description</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row1_col0\" class=\"data row1 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row1_col1\" class=\"data row1 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row1_col2\" class=\"data row1 col2\" >tabular_dataset_description</td>\n", + " <td id=\"T_7cb1b_row1_col3\" class=\"data row1 col3\" >validmind.data_validation.DescriptiveStatistics</td>\n", + " <td id=\"T_7cb1b_row1_col4\" class=\"data row1 col4\" >Descriptive Statistics</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row2_col0\" class=\"data row2 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row2_col1\" class=\"data row2 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row2_col2\" class=\"data row2 col2\" >tabular_dataset_description</td>\n", + " <td id=\"T_7cb1b_row2_col3\" class=\"data row2 col3\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", + " <td id=\"T_7cb1b_row2_col4\" class=\"data row2 col4\" >Pearson Correlation Matrix</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row3_col0\" class=\"data row3 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row3_col1\" class=\"data row3 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row3_col2\" class=\"data row3 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row3_col3\" class=\"data row3 col3\" >validmind.data_validation.ClassImbalance</td>\n", + " <td id=\"T_7cb1b_row3_col4\" class=\"data row3 col4\" >Class Imbalance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row4_col0\" class=\"data 
row4 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row4_col1\" class=\"data row4 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row4_col2\" class=\"data row4 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row4_col3\" class=\"data row4 col3\" >validmind.data_validation.Duplicates</td>\n", + " <td id=\"T_7cb1b_row4_col4\" class=\"data row4 col4\" >Duplicates</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row5_col0\" class=\"data row5 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row5_col1\" class=\"data row5 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row5_col2\" class=\"data row5 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row5_col3\" class=\"data row5 col3\" >validmind.data_validation.HighCardinality</td>\n", + " <td id=\"T_7cb1b_row5_col4\" class=\"data row5 col4\" >High Cardinality</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row6_col0\" class=\"data row6 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row6_col1\" class=\"data row6 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row6_col2\" class=\"data row6 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row6_col3\" class=\"data row6 col3\" >validmind.data_validation.HighPearsonCorrelation</td>\n", + " <td id=\"T_7cb1b_row6_col4\" class=\"data row6 col4\" >High Pearson Correlation</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row7_col0\" class=\"data row7 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row7_col1\" class=\"data row7 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row7_col2\" class=\"data row7 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row7_col3\" class=\"data row7 col3\" >validmind.data_validation.MissingValues</td>\n", + " <td id=\"T_7cb1b_row7_col4\" class=\"data row7 col4\" >Missing Values</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row8_col0\" class=\"data row8 col0\" >classifier_full_suite</td>\n", 
+ " <td id=\"T_7cb1b_row8_col1\" class=\"data row8 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row8_col2\" class=\"data row8 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row8_col3\" class=\"data row8 col3\" >validmind.data_validation.Skewness</td>\n", + " <td id=\"T_7cb1b_row8_col4\" class=\"data row8 col4\" >Skewness</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row9_col0\" class=\"data row9 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row9_col1\" class=\"data row9 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row9_col2\" class=\"data row9 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row9_col3\" class=\"data row9 col3\" >validmind.data_validation.UniqueRows</td>\n", + " <td id=\"T_7cb1b_row9_col4\" class=\"data row9 col4\" >Unique Rows</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row10_col0\" class=\"data row10 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row10_col1\" class=\"data row10 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row10_col2\" class=\"data row10 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row10_col3\" class=\"data row10 col3\" >validmind.data_validation.TooManyZeroValues</td>\n", + " <td id=\"T_7cb1b_row10_col4\" class=\"data row10 col4\" >Too Many Zero Values</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row11_col0\" class=\"data row11 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row11_col1\" class=\"data row11 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row11_col2\" class=\"data row11 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row11_col3\" class=\"data row11 col3\" >validmind.model_validation.ModelMetadata</td>\n", + " <td id=\"T_7cb1b_row11_col4\" class=\"data row11 col4\" >Model Metadata</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row12_col0\" class=\"data row12 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row12_col1\" 
class=\"data row12 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row12_col2\" class=\"data row12 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row12_col3\" class=\"data row12 col3\" >validmind.data_validation.DatasetSplit</td>\n", + " <td id=\"T_7cb1b_row12_col4\" class=\"data row12 col4\" >Dataset Split</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row13_col0\" class=\"data row13 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row13_col1\" class=\"data row13 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row13_col2\" class=\"data row13 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row13_col3\" class=\"data row13 col3\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_7cb1b_row13_col4\" class=\"data row13 col4\" >Confusion Matrix</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row14_col0\" class=\"data row14 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row14_col1\" class=\"data row14 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row14_col2\" class=\"data row14 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row14_col3\" class=\"data row14 col3\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", + " <td id=\"T_7cb1b_row14_col4\" class=\"data row14 col4\" >Classifier Performance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row15_col0\" class=\"data row15 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row15_col1\" class=\"data row15 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row15_col2\" class=\"data row15 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row15_col3\" class=\"data row15 col3\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", + " <td id=\"T_7cb1b_row15_col4\" class=\"data row15 col4\" >Permutation Feature Importance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row16_col0\" class=\"data row16 col0\" 
>classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row16_col1\" class=\"data row16 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row16_col2\" class=\"data row16 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row16_col3\" class=\"data row16 col3\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_7cb1b_row16_col4\" class=\"data row16 col4\" >Precision Recall Curve</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row17_col0\" class=\"data row17 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row17_col1\" class=\"data row17 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row17_col2\" class=\"data row17 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row17_col3\" class=\"data row17 col3\" >validmind.model_validation.sklearn.ROCCurve</td>\n", + " <td id=\"T_7cb1b_row17_col4\" class=\"data row17 col4\" >ROC Curve</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row18_col0\" class=\"data row18 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row18_col1\" class=\"data row18 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row18_col2\" class=\"data row18 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row18_col3\" class=\"data row18 col3\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", + " <td id=\"T_7cb1b_row18_col4\" class=\"data row18 col4\" >Population Stability Index</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row19_col0\" class=\"data row19 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row19_col1\" class=\"data row19 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row19_col2\" class=\"data row19 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row19_col3\" class=\"data row19 col3\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", + " <td id=\"T_7cb1b_row19_col4\" class=\"data row19 col4\" >SHAP Global Importance</td>\n", + " </tr>\n", + " <tr>\n", + " <td 
id=\"T_7cb1b_row20_col0\" class=\"data row20 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row20_col1\" class=\"data row20 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row20_col2\" class=\"data row20 col2\" >classifier_validation</td>\n", + " <td id=\"T_7cb1b_row20_col3\" class=\"data row20 col3\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", + " <td id=\"T_7cb1b_row20_col4\" class=\"data row20 col4\" >Minimum Accuracy</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row21_col0\" class=\"data row21 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row21_col1\" class=\"data row21 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row21_col2\" class=\"data row21 col2\" >classifier_validation</td>\n", + " <td id=\"T_7cb1b_row21_col3\" class=\"data row21 col3\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", + " <td id=\"T_7cb1b_row21_col4\" class=\"data row21 col4\" >Minimum F1 Score</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row22_col0\" class=\"data row22 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row22_col1\" class=\"data row22 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row22_col2\" class=\"data row22 col2\" >classifier_validation</td>\n", + " <td id=\"T_7cb1b_row22_col3\" class=\"data row22 col3\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", + " <td id=\"T_7cb1b_row22_col4\" class=\"data row22 col4\" >Minimum ROCAUC Score</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row23_col0\" class=\"data row23 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row23_col1\" class=\"data row23 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row23_col2\" class=\"data row23 col2\" >classifier_validation</td>\n", + " <td id=\"T_7cb1b_row23_col3\" class=\"data row23 col3\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_7cb1b_row23_col4\" class=\"data row23 col4\" >Training 
Test Degradation</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row24_col0\" class=\"data row24 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row24_col1\" class=\"data row24 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row24_col2\" class=\"data row24 col2\" >classifier_validation</td>\n", + " <td id=\"T_7cb1b_row24_col3\" class=\"data row24 col3\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", + " <td id=\"T_7cb1b_row24_col4\" class=\"data row24 col4\" >Models Performance Comparison</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row25_col0\" class=\"data row25 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row25_col1\" class=\"data row25 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row25_col2\" class=\"data row25 col2\" >classifier_model_diagnosis</td>\n", + " <td id=\"T_7cb1b_row25_col3\" class=\"data row25 col3\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", + " <td id=\"T_7cb1b_row25_col4\" class=\"data row25 col4\" >Overfit Diagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row26_col0\" class=\"data row26 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row26_col1\" class=\"data row26 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row26_col2\" class=\"data row26 col2\" >classifier_model_diagnosis</td>\n", + " <td id=\"T_7cb1b_row26_col3\" class=\"data row26 col3\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", + " <td id=\"T_7cb1b_row26_col4\" class=\"data row26 col4\" >Weakspots Diagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row27_col0\" class=\"data row27 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row27_col1\" class=\"data row27 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row27_col2\" class=\"data row27 col2\" >classifier_model_diagnosis</td>\n", + " <td id=\"T_7cb1b_row27_col3\" class=\"data row27 col3\" 
>validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " <td id=\"T_7cb1b_row27_col4\" class=\"data row27 col4\" >Robustness Diagnosis</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x16a167fa0>" + ] + } + } ] - }, - "metadata": {}, - "output_type": "display_data" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### View test details\n", + "\n", + "To inspect a specific test in a suite, pass the name of the test to [tests.describe_test()](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to get detailed information about the test such as its purpose, strengths and limitations:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.describe_test(\"validmind.data_validation.DescriptiveStatistics\")" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/html": [ + "\n", + " <div class=\"vm-accordion\" id=\"accordion-c38a3af7\">\n", + " \n", + " <div class=\"vm-accordion-item\">\n", + " <div class=\"vm-accordion-header\"\n", + " onclick=\"toggleAccordionItem('accordion-c38a3af7-item-0')\"\n", + " style=\"cursor: pointer; padding: 10px; background-color: #f8f9fa; border: 1px solid #dee2e6; font-weight: bold;\">\n", + " <span class=\"vm-accordion-toggle\" id=\"accordion-c38a3af7-item-0-toggle\">▶</span>\n", + " Test: Descriptive Statistics ('validmind.data_validation.DescriptiveStatistics')\n", + " </div>\n", + " <div class=\"vm-accordion-content\"\n", + " id=\"accordion-c38a3af7-item-0\"\n", + " style=\"display: none; padding: 15px; border: 1px solid #dee2e6; border-top: none;\">\n", + " \n", + "<div>\n", + " <h2>Descriptive Statistics</h2>\n", + " <div style=\"border: 1px solid #ddd; border-radius: 4px; padding: 10px; margin: 10px 0;\">\n", + " <p>Performs a detailed descriptive statistical analysis of both numerical and categorical 
data within a model's\n", + "dataset.</p>\n", + "<h3>Purpose</h3>\n", + "<p>The purpose of the Descriptive Statistics metric is to provide a comprehensive summary of both numerical and\n", + "categorical data within a dataset. This involves statistics such as count, mean, standard deviation, minimum and\n", + "maximum values for numerical data. For categorical data, it calculates the count, number of unique values, most\n", + "common value and its frequency, and the proportion of the most frequent value relative to the total. The goal is to\n", + "visualize the overall distribution of the variables in the dataset, aiding in understanding the model's behavior\n", + "and predicting its performance.</p>\n", + "<h3>Test Mechanism</h3>\n", + "<p>The testing mechanism utilizes two in-built functions of pandas dataframes: <code>describe()</code> for numerical fields and\n", + "<code>value_counts()</code> for categorical fields. The <code>describe()</code> function pulls out several summary statistics, while\n", + "<code>value_counts()</code> accounts for unique values. The resulting data is formatted into two distinct tables, one for\n", + "numerical and another for categorical variable summaries. These tables provide a clear summary of the main\n", + "characteristics of the variables, which can be instrumental in assessing the model's performance.</p>\n", + "<h3>Signs of High Risk</h3>\n", + "<ul>\n", + "<li>Skewed data or significant outliers can represent high risk. 
For numerical data, this may be reflected via a\n", + "significant difference between the mean and median (50% percentile).</li>\n", + "<li>For categorical data, a lack of diversity (low count of unique values), or overdominance of a single category\n", + "(high frequency of the top value) can indicate high risk.</li>\n", + "</ul>\n", + "<h3>Strengths</h3>\n", + "<ul>\n", + "<li>Provides a comprehensive summary of the dataset, shedding light on the distribution and characteristics of the\n", + "variables under consideration.</li>\n", + "<li>It is a versatile and robust method, applicable to both numerical and categorical data.</li>\n", + "<li>Helps highlight crucial anomalies such as outliers, extreme skewness, or lack of diversity, which are vital in\n", + "understanding model behavior during testing and validation.</li>\n", + "</ul>\n", + "<h3>Limitations</h3>\n", + "<ul>\n", + "<li>While this metric offers a high-level overview of the data, it may fail to detect subtle correlations or complex\n", + "patterns.</li>\n", + "<li>Does not offer any insights on the relationship between variables.</li>\n", + "<li>Alone, descriptive statistics cannot be used to infer properties about future unseen data.</li>\n", + "<li>Should be used in conjunction with other statistical tests to provide a comprehensive understanding of the\n", + "model's data.</li>\n", + "</ul>\n", + "\n", + " </div>\n", + "</div>\n", + "\n", + "<h4 class=\"vm_required_context\">\n", + " Required Inputs: <span style=\"font-size: 13px\"><i>dataset</i></span>\n", + "</h4>\n", + "\n", + "<div style=\"display: none;\">\n", + " <h4>Parameters:</h4>\n", + " <table class=\"vm_params_table\" style=\"display: none;\">\n", + " <tr>\n", + " <th>Parameter</th>\n", + " <th>Default Value</th>\n", + " </tr>\n", + " \n", + " </table>\n", + "</div>\n", + "\n", + "<div class=\"unset\">\n", + " <h3>How to Run:</h3>\n", + "\n", + " <button\n", + " onclick=\"(() => {e = 
document.getElementById('expandable_instructions_7e3e1a19-00f2-4e0b-95b6-720bc7e3ba8b'); e.style.display === 'none' ? e.style.display = 'block' : e.style.display = 'none'})()\"\n", + " >Show/Hide Instructions</button>\n", + "\n", + " <div id=\"expandable_instructions_7e3e1a19-00f2-4e0b-95b6-720bc7e3ba8b\" style=\"display: block;\">\n", + " <h4>Code:</h4>\n", + " <pre>\n", + " <code class='language-python'>\n", + "import validmind as vm\n", + "\n", + "# inputs dictionary maps your inputs to the expected input names\n", + "# keys are the expected input names and values are the actual inputs\n", + "# values may be string input_ids or the actual VMDataset or VMModel objects\n", + "inputs = {\n", + " \"dataset\": \"my_vm_dataset\"\n", + "}\n", + "params = {}\n", + "\n", + "# to run and view the result of this test, run the following code:\n", + "result = vm.tests.run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics\", inputs=inputs, params=params\n", + ")\n", + "\n", + "# To see the result of the test, ensure that you have called `vm.init()` and then run:\n", + "result.log()</code>\n", + " </pre>\n", + " </div>\n", + "</div>\n", + "\n", + "<style>\n", + "h5.vm_required_context {\n", + " margin-top: 25px;\n", + "}\n", + "table.vm_params_table {\n", + " margin-top: 20px;\n", + " width: 350px;\n", + " border-collapse: collapse;\n", + " border-color: --jp-border-color0;\n", + "}\n", + "table.vm_params_table td, table.vm_params_table th {\n", + " text-align: right;\n", + "}\n", + "table.vm_params_table td:first-child, table.vm_params_table th:first-child {\n", + " text-align: left;\n", + "}\n", + "table.vm_params_table th {\n", + " background-color: --jp-content-color0;\n", + " font-weight: bold;\n", + " font-size: 14px !important;\n", + "}\n", + "table.vm_params_table tr:nth-child(even) {\n", + " background-color: --jp-layout-color1;\n", + "}\n", + "table.vm_params_table tr:nth-child(odd) {\n", + " background-color: --jp-layout-color2;\n", + "}\n", + 
"table.vm_params_table tr:hover {\n", + " background-color: --jp-layout-color3;\n", + "}\n", + "table.vm_params_table td, table.vm_params_table th {\n", + " padding: 5px;\n", + " border: .8px solid --jp-border-color0;\n", + "}\n", + "</style>\n", + "\n", + " </div>\n", + " </div>\n", + " \n", + " </div>\n", + "\n", + " <script>\n", + " function toggleAccordionItem(itemId) {\n", + " const content = document.getElementById(itemId);\n", + " const toggle = document.getElementById(itemId + '-toggle');\n", + "\n", + " if (content.style.display === 'none' || content.style.display === '') {\n", + " content.style.display = 'block';\n", + " toggle.innerHTML = '▼';\n", + " } else {\n", + " content.style.display = 'none';\n", + " toggle.innerHTML = '▶';\n", + " }\n", + " }\n", + " </script>\n", + " " + ], + "text/plain": [ + "<IPython.core.display.HTML object>" + ] + } + } + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "Now that you’ve learned how to identify ValidMind test suites relevant to your use cases, we encourage you to explore our interactive notebooks to discover additional tests, learn how to run them, and effectively document your records (models).\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>\n", + "\n", + "<a id='toc5_1__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to 
help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-daee3ccea95b41b4b4bc81230a4a55f5" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" } - ], - "source": [ - "vm.tests.describe_test(\"validmind.data_validation.DescriptiveStatistics\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "Now that you’ve learned how to identify ValidMind test suites relevant to your use cases, we encourage you to explore our interactive notebooks to discover additional tests, learn how to run them, and effectively document your records (models).\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>\n", - "\n", - "<a id='toc5_1__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test 
suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-daee3ccea95b41b4b4bc81230a4a55f5", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} + "nbformat": 4, + "nbformat_minor": 2 +} \ No newline at end of file diff --git a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb index 015777bfe..8df3dd849 100644 --- a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb @@ -1,4463 +1,4451 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Explore tests\n", - "\n", - "Explore the individual out-the-box tests available in the ValidMind Library, and identify which tests to run to evaluate different aspects of your model. Browse available tests, view their descriptions, and filter by tags or task type to find tests relevant to your use case." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Install the ValidMind Library](#toc2__) \n", - "- [List all available tests](#toc3__) \n", - "- [Understand tags and task types](#toc4__) \n", - "- [Filter tests by tags and task types](#toc5__) \n", - "- [Store test sets for use](#toc6__) \n", - "- [Next steps](#toc7__) \n", - " - [Discover more learning resources](#toc7_1__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. 
\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. 
Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## List all available tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Start by importing the functions from the [validmind.tests](https://docs.validmind.ai/validmind/validmind/tests.html) module for listing tests, listing tasks, listing tags, and listing tasks and tags to access these functions in the rest of this notebook:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import (\n", - " list_tests,\n", - 
" list_tasks,\n", - " list_tags,\n", - " list_tasks_and_tags,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Use [list_tests()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to retrieve all available ValidMind tests, which returns a DataFrame with the following columns:\n", - "\n", - "- **ID** – A unique identifier for each test.\n", - "- **Name** – The test’s name.\n", - "- **Description** – A short summary of what the test evaluates.\n", - "- **Tags** – Keywords that describe what the test does or applies to.\n", - "- **Tasks** – The type of modeling task the test supports." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + "cells": [ { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_0502a th {\n", - " text-align: left;\n", - "}\n", - "#T_0502a_row0_col0, #T_0502a_row0_col1, #T_0502a_row0_col2, #T_0502a_row0_col3, #T_0502a_row0_col4, #T_0502a_row0_col5, #T_0502a_row0_col6, #T_0502a_row0_col7, #T_0502a_row0_col8, #T_0502a_row1_col0, #T_0502a_row1_col1, #T_0502a_row1_col2, #T_0502a_row1_col3, #T_0502a_row1_col4, #T_0502a_row1_col5, #T_0502a_row1_col6, #T_0502a_row1_col7, #T_0502a_row1_col8, #T_0502a_row2_col0, #T_0502a_row2_col1, #T_0502a_row2_col2, #T_0502a_row2_col3, #T_0502a_row2_col4, #T_0502a_row2_col5, #T_0502a_row2_col6, #T_0502a_row2_col7, #T_0502a_row2_col8, #T_0502a_row3_col0, #T_0502a_row3_col1, #T_0502a_row3_col2, #T_0502a_row3_col3, #T_0502a_row3_col4, #T_0502a_row3_col5, #T_0502a_row3_col6, #T_0502a_row3_col7, #T_0502a_row3_col8, #T_0502a_row4_col0, #T_0502a_row4_col1, #T_0502a_row4_col2, #T_0502a_row4_col3, #T_0502a_row4_col4, #T_0502a_row4_col5, #T_0502a_row4_col6, #T_0502a_row4_col7, #T_0502a_row4_col8, #T_0502a_row5_col0, #T_0502a_row5_col1, #T_0502a_row5_col2, #T_0502a_row5_col3, #T_0502a_row5_col4, #T_0502a_row5_col5, #T_0502a_row5_col6, #T_0502a_row5_col7, #T_0502a_row5_col8, #T_0502a_row6_col0, #T_0502a_row6_col1, 
#T_0502a_row6_col2, #T_0502a_row6_col3, #T_0502a_row6_col4, #T_0502a_row6_col5, #T_0502a_row6_col6, #T_0502a_row6_col7, #T_0502a_row6_col8, #T_0502a_row7_col0, #T_0502a_row7_col1, #T_0502a_row7_col2, #T_0502a_row7_col3, #T_0502a_row7_col4, #T_0502a_row7_col5, #T_0502a_row7_col6, #T_0502a_row7_col7, #T_0502a_row7_col8, #T_0502a_row8_col0, #T_0502a_row8_col1, #T_0502a_row8_col2, #T_0502a_row8_col3, #T_0502a_row8_col4, #T_0502a_row8_col5, #T_0502a_row8_col6, #T_0502a_row8_col7, #T_0502a_row8_col8, #T_0502a_row9_col0, #T_0502a_row9_col1, #T_0502a_row9_col2, #T_0502a_row9_col3, #T_0502a_row9_col4, #T_0502a_row9_col5, #T_0502a_row9_col6, #T_0502a_row9_col7, #T_0502a_row9_col8, #T_0502a_row10_col0, #T_0502a_row10_col1, #T_0502a_row10_col2, #T_0502a_row10_col3, #T_0502a_row10_col4, #T_0502a_row10_col5, #T_0502a_row10_col6, #T_0502a_row10_col7, #T_0502a_row10_col8, #T_0502a_row11_col0, #T_0502a_row11_col1, #T_0502a_row11_col2, #T_0502a_row11_col3, #T_0502a_row11_col4, #T_0502a_row11_col5, #T_0502a_row11_col6, #T_0502a_row11_col7, #T_0502a_row11_col8, #T_0502a_row12_col0, #T_0502a_row12_col1, #T_0502a_row12_col2, #T_0502a_row12_col3, #T_0502a_row12_col4, #T_0502a_row12_col5, #T_0502a_row12_col6, #T_0502a_row12_col7, #T_0502a_row12_col8, #T_0502a_row13_col0, #T_0502a_row13_col1, #T_0502a_row13_col2, #T_0502a_row13_col3, #T_0502a_row13_col4, #T_0502a_row13_col5, #T_0502a_row13_col6, #T_0502a_row13_col7, #T_0502a_row13_col8, #T_0502a_row14_col0, #T_0502a_row14_col1, #T_0502a_row14_col2, #T_0502a_row14_col3, #T_0502a_row14_col4, #T_0502a_row14_col5, #T_0502a_row14_col6, #T_0502a_row14_col7, #T_0502a_row14_col8, #T_0502a_row15_col0, #T_0502a_row15_col1, #T_0502a_row15_col2, #T_0502a_row15_col3, #T_0502a_row15_col4, #T_0502a_row15_col5, #T_0502a_row15_col6, #T_0502a_row15_col7, #T_0502a_row15_col8, #T_0502a_row16_col0, #T_0502a_row16_col1, #T_0502a_row16_col2, #T_0502a_row16_col3, #T_0502a_row16_col4, #T_0502a_row16_col5, #T_0502a_row16_col6, #T_0502a_row16_col7, 
#T_0502a_row16_col8, #T_0502a_row17_col0, #T_0502a_row17_col1, #T_0502a_row17_col2, #T_0502a_row17_col3, #T_0502a_row17_col4, #T_0502a_row17_col5, #T_0502a_row17_col6, #T_0502a_row17_col7, #T_0502a_row17_col8, #T_0502a_row18_col0, #T_0502a_row18_col1, #T_0502a_row18_col2, #T_0502a_row18_col3, #T_0502a_row18_col4, #T_0502a_row18_col5, #T_0502a_row18_col6, #T_0502a_row18_col7, #T_0502a_row18_col8, #T_0502a_row19_col0, #T_0502a_row19_col1, #T_0502a_row19_col2, #T_0502a_row19_col3, #T_0502a_row19_col4, #T_0502a_row19_col5, #T_0502a_row19_col6, #T_0502a_row19_col7, #T_0502a_row19_col8, #T_0502a_row20_col0, #T_0502a_row20_col1, #T_0502a_row20_col2, #T_0502a_row20_col3, #T_0502a_row20_col4, #T_0502a_row20_col5, #T_0502a_row20_col6, #T_0502a_row20_col7, #T_0502a_row20_col8, #T_0502a_row21_col0, #T_0502a_row21_col1, #T_0502a_row21_col2, #T_0502a_row21_col3, #T_0502a_row21_col4, #T_0502a_row21_col5, #T_0502a_row21_col6, #T_0502a_row21_col7, #T_0502a_row21_col8, #T_0502a_row22_col0, #T_0502a_row22_col1, #T_0502a_row22_col2, #T_0502a_row22_col3, #T_0502a_row22_col4, #T_0502a_row22_col5, #T_0502a_row22_col6, #T_0502a_row22_col7, #T_0502a_row22_col8, #T_0502a_row23_col0, #T_0502a_row23_col1, #T_0502a_row23_col2, #T_0502a_row23_col3, #T_0502a_row23_col4, #T_0502a_row23_col5, #T_0502a_row23_col6, #T_0502a_row23_col7, #T_0502a_row23_col8, #T_0502a_row24_col0, #T_0502a_row24_col1, #T_0502a_row24_col2, #T_0502a_row24_col3, #T_0502a_row24_col4, #T_0502a_row24_col5, #T_0502a_row24_col6, #T_0502a_row24_col7, #T_0502a_row24_col8, #T_0502a_row25_col0, #T_0502a_row25_col1, #T_0502a_row25_col2, #T_0502a_row25_col3, #T_0502a_row25_col4, #T_0502a_row25_col5, #T_0502a_row25_col6, #T_0502a_row25_col7, #T_0502a_row25_col8, #T_0502a_row26_col0, #T_0502a_row26_col1, #T_0502a_row26_col2, #T_0502a_row26_col3, #T_0502a_row26_col4, #T_0502a_row26_col5, #T_0502a_row26_col6, #T_0502a_row26_col7, #T_0502a_row26_col8, #T_0502a_row27_col0, #T_0502a_row27_col1, #T_0502a_row27_col2, #T_0502a_row27_col3, 
#T_0502a_row27_col4, #T_0502a_row27_col5, #T_0502a_row27_col6, #T_0502a_row27_col7, #T_0502a_row27_col8, #T_0502a_row28_col0, #T_0502a_row28_col1, #T_0502a_row28_col2, #T_0502a_row28_col3, #T_0502a_row28_col4, #T_0502a_row28_col5, #T_0502a_row28_col6, #T_0502a_row28_col7, #T_0502a_row28_col8, #T_0502a_row29_col0, #T_0502a_row29_col1, #T_0502a_row29_col2, #T_0502a_row29_col3, #T_0502a_row29_col4, #T_0502a_row29_col5, #T_0502a_row29_col6, #T_0502a_row29_col7, #T_0502a_row29_col8, #T_0502a_row30_col0, #T_0502a_row30_col1, #T_0502a_row30_col2, #T_0502a_row30_col3, #T_0502a_row30_col4, #T_0502a_row30_col5, #T_0502a_row30_col6, #T_0502a_row30_col7, #T_0502a_row30_col8, #T_0502a_row31_col0, #T_0502a_row31_col1, #T_0502a_row31_col2, #T_0502a_row31_col3, #T_0502a_row31_col4, #T_0502a_row31_col5, #T_0502a_row31_col6, #T_0502a_row31_col7, #T_0502a_row31_col8, #T_0502a_row32_col0, #T_0502a_row32_col1, #T_0502a_row32_col2, #T_0502a_row32_col3, #T_0502a_row32_col4, #T_0502a_row32_col5, #T_0502a_row32_col6, #T_0502a_row32_col7, #T_0502a_row32_col8, #T_0502a_row33_col0, #T_0502a_row33_col1, #T_0502a_row33_col2, #T_0502a_row33_col3, #T_0502a_row33_col4, #T_0502a_row33_col5, #T_0502a_row33_col6, #T_0502a_row33_col7, #T_0502a_row33_col8, #T_0502a_row34_col0, #T_0502a_row34_col1, #T_0502a_row34_col2, #T_0502a_row34_col3, #T_0502a_row34_col4, #T_0502a_row34_col5, #T_0502a_row34_col6, #T_0502a_row34_col7, #T_0502a_row34_col8, #T_0502a_row35_col0, #T_0502a_row35_col1, #T_0502a_row35_col2, #T_0502a_row35_col3, #T_0502a_row35_col4, #T_0502a_row35_col5, #T_0502a_row35_col6, #T_0502a_row35_col7, #T_0502a_row35_col8, #T_0502a_row36_col0, #T_0502a_row36_col1, #T_0502a_row36_col2, #T_0502a_row36_col3, #T_0502a_row36_col4, #T_0502a_row36_col5, #T_0502a_row36_col6, #T_0502a_row36_col7, #T_0502a_row36_col8, #T_0502a_row37_col0, #T_0502a_row37_col1, #T_0502a_row37_col2, #T_0502a_row37_col3, #T_0502a_row37_col4, #T_0502a_row37_col5, #T_0502a_row37_col6, #T_0502a_row37_col7, #T_0502a_row37_col8, 
#T_0502a_row38_col0, #T_0502a_row38_col1, #T_0502a_row38_col2, #T_0502a_row38_col3, #T_0502a_row38_col4, #T_0502a_row38_col5, #T_0502a_row38_col6, #T_0502a_row38_col7, #T_0502a_row38_col8, #T_0502a_row39_col0, #T_0502a_row39_col1, #T_0502a_row39_col2, #T_0502a_row39_col3, #T_0502a_row39_col4, #T_0502a_row39_col5, #T_0502a_row39_col6, #T_0502a_row39_col7, #T_0502a_row39_col8, #T_0502a_row40_col0, #T_0502a_row40_col1, #T_0502a_row40_col2, #T_0502a_row40_col3, #T_0502a_row40_col4, #T_0502a_row40_col5, #T_0502a_row40_col6, #T_0502a_row40_col7, #T_0502a_row40_col8, #T_0502a_row41_col0, #T_0502a_row41_col1, #T_0502a_row41_col2, #T_0502a_row41_col3, #T_0502a_row41_col4, #T_0502a_row41_col5, #T_0502a_row41_col6, #T_0502a_row41_col7, #T_0502a_row41_col8, #T_0502a_row42_col0, #T_0502a_row42_col1, #T_0502a_row42_col2, #T_0502a_row42_col3, #T_0502a_row42_col4, #T_0502a_row42_col5, #T_0502a_row42_col6, #T_0502a_row42_col7, #T_0502a_row42_col8, #T_0502a_row43_col0, #T_0502a_row43_col1, #T_0502a_row43_col2, #T_0502a_row43_col3, #T_0502a_row43_col4, #T_0502a_row43_col5, #T_0502a_row43_col6, #T_0502a_row43_col7, #T_0502a_row43_col8, #T_0502a_row44_col0, #T_0502a_row44_col1, #T_0502a_row44_col2, #T_0502a_row44_col3, #T_0502a_row44_col4, #T_0502a_row44_col5, #T_0502a_row44_col6, #T_0502a_row44_col7, #T_0502a_row44_col8, #T_0502a_row45_col0, #T_0502a_row45_col1, #T_0502a_row45_col2, #T_0502a_row45_col3, #T_0502a_row45_col4, #T_0502a_row45_col5, #T_0502a_row45_col6, #T_0502a_row45_col7, #T_0502a_row45_col8, #T_0502a_row46_col0, #T_0502a_row46_col1, #T_0502a_row46_col2, #T_0502a_row46_col3, #T_0502a_row46_col4, #T_0502a_row46_col5, #T_0502a_row46_col6, #T_0502a_row46_col7, #T_0502a_row46_col8, #T_0502a_row47_col0, #T_0502a_row47_col1, #T_0502a_row47_col2, #T_0502a_row47_col3, #T_0502a_row47_col4, #T_0502a_row47_col5, #T_0502a_row47_col6, #T_0502a_row47_col7, #T_0502a_row47_col8, #T_0502a_row48_col0, #T_0502a_row48_col1, #T_0502a_row48_col2, #T_0502a_row48_col3, #T_0502a_row48_col4, 
#T_0502a_row48_col5, #T_0502a_row48_col6, #T_0502a_row48_col7, #T_0502a_row48_col8, #T_0502a_row49_col0, #T_0502a_row49_col1, #T_0502a_row49_col2, #T_0502a_row49_col3, #T_0502a_row49_col4, #T_0502a_row49_col5, #T_0502a_row49_col6, #T_0502a_row49_col7, #T_0502a_row49_col8, #T_0502a_row50_col0, #T_0502a_row50_col1, #T_0502a_row50_col2, #T_0502a_row50_col3, #T_0502a_row50_col4, #T_0502a_row50_col5, #T_0502a_row50_col6, #T_0502a_row50_col7, #T_0502a_row50_col8, #T_0502a_row51_col0, #T_0502a_row51_col1, #T_0502a_row51_col2, #T_0502a_row51_col3, #T_0502a_row51_col4, #T_0502a_row51_col5, #T_0502a_row51_col6, #T_0502a_row51_col7, #T_0502a_row51_col8, #T_0502a_row52_col0, #T_0502a_row52_col1, #T_0502a_row52_col2, #T_0502a_row52_col3, #T_0502a_row52_col4, #T_0502a_row52_col5, #T_0502a_row52_col6, #T_0502a_row52_col7, #T_0502a_row52_col8, #T_0502a_row53_col0, #T_0502a_row53_col1, #T_0502a_row53_col2, #T_0502a_row53_col3, #T_0502a_row53_col4, #T_0502a_row53_col5, #T_0502a_row53_col6, #T_0502a_row53_col7, #T_0502a_row53_col8, #T_0502a_row54_col0, #T_0502a_row54_col1, #T_0502a_row54_col2, #T_0502a_row54_col3, #T_0502a_row54_col4, #T_0502a_row54_col5, #T_0502a_row54_col6, #T_0502a_row54_col7, #T_0502a_row54_col8, #T_0502a_row55_col0, #T_0502a_row55_col1, #T_0502a_row55_col2, #T_0502a_row55_col3, #T_0502a_row55_col4, #T_0502a_row55_col5, #T_0502a_row55_col6, #T_0502a_row55_col7, #T_0502a_row55_col8, #T_0502a_row56_col0, #T_0502a_row56_col1, #T_0502a_row56_col2, #T_0502a_row56_col3, #T_0502a_row56_col4, #T_0502a_row56_col5, #T_0502a_row56_col6, #T_0502a_row56_col7, #T_0502a_row56_col8, #T_0502a_row57_col0, #T_0502a_row57_col1, #T_0502a_row57_col2, #T_0502a_row57_col3, #T_0502a_row57_col4, #T_0502a_row57_col5, #T_0502a_row57_col6, #T_0502a_row57_col7, #T_0502a_row57_col8, #T_0502a_row58_col0, #T_0502a_row58_col1, #T_0502a_row58_col2, #T_0502a_row58_col3, #T_0502a_row58_col4, #T_0502a_row58_col5, #T_0502a_row58_col6, #T_0502a_row58_col7, #T_0502a_row58_col8, #T_0502a_row59_col0, 
#T_0502a_row59_col1, #T_0502a_row59_col2, #T_0502a_row59_col3, #T_0502a_row59_col4, #T_0502a_row59_col5, #T_0502a_row59_col6, #T_0502a_row59_col7, #T_0502a_row59_col8, #T_0502a_row60_col0, #T_0502a_row60_col1, #T_0502a_row60_col2, #T_0502a_row60_col3, #T_0502a_row60_col4, #T_0502a_row60_col5, #T_0502a_row60_col6, #T_0502a_row60_col7, #T_0502a_row60_col8, #T_0502a_row61_col0, #T_0502a_row61_col1, #T_0502a_row61_col2, #T_0502a_row61_col3, #T_0502a_row61_col4, #T_0502a_row61_col5, #T_0502a_row61_col6, #T_0502a_row61_col7, #T_0502a_row61_col8, #T_0502a_row62_col0, #T_0502a_row62_col1, #T_0502a_row62_col2, #T_0502a_row62_col3, #T_0502a_row62_col4, #T_0502a_row62_col5, #T_0502a_row62_col6, #T_0502a_row62_col7, #T_0502a_row62_col8, #T_0502a_row63_col0, #T_0502a_row63_col1, #T_0502a_row63_col2, #T_0502a_row63_col3, #T_0502a_row63_col4, #T_0502a_row63_col5, #T_0502a_row63_col6, #T_0502a_row63_col7, #T_0502a_row63_col8, #T_0502a_row64_col0, #T_0502a_row64_col1, #T_0502a_row64_col2, #T_0502a_row64_col3, #T_0502a_row64_col4, #T_0502a_row64_col5, #T_0502a_row64_col6, #T_0502a_row64_col7, #T_0502a_row64_col8, #T_0502a_row65_col0, #T_0502a_row65_col1, #T_0502a_row65_col2, #T_0502a_row65_col3, #T_0502a_row65_col4, #T_0502a_row65_col5, #T_0502a_row65_col6, #T_0502a_row65_col7, #T_0502a_row65_col8, #T_0502a_row66_col0, #T_0502a_row66_col1, #T_0502a_row66_col2, #T_0502a_row66_col3, #T_0502a_row66_col4, #T_0502a_row66_col5, #T_0502a_row66_col6, #T_0502a_row66_col7, #T_0502a_row66_col8, #T_0502a_row67_col0, #T_0502a_row67_col1, #T_0502a_row67_col2, #T_0502a_row67_col3, #T_0502a_row67_col4, #T_0502a_row67_col5, #T_0502a_row67_col6, #T_0502a_row67_col7, #T_0502a_row67_col8, #T_0502a_row68_col0, #T_0502a_row68_col1, #T_0502a_row68_col2, #T_0502a_row68_col3, #T_0502a_row68_col4, #T_0502a_row68_col5, #T_0502a_row68_col6, #T_0502a_row68_col7, #T_0502a_row68_col8, #T_0502a_row69_col0, #T_0502a_row69_col1, #T_0502a_row69_col2, #T_0502a_row69_col3, #T_0502a_row69_col4, #T_0502a_row69_col5, 
#T_0502a_row69_col6, #T_0502a_row69_col7, #T_0502a_row69_col8, #T_0502a_row70_col0, #T_0502a_row70_col1, #T_0502a_row70_col2, #T_0502a_row70_col3, #T_0502a_row70_col4, #T_0502a_row70_col5, #T_0502a_row70_col6, #T_0502a_row70_col7, #T_0502a_row70_col8, #T_0502a_row71_col0, #T_0502a_row71_col1, #T_0502a_row71_col2, #T_0502a_row71_col3, #T_0502a_row71_col4, #T_0502a_row71_col5, #T_0502a_row71_col6, #T_0502a_row71_col7, #T_0502a_row71_col8, #T_0502a_row72_col0, #T_0502a_row72_col1, #T_0502a_row72_col2, #T_0502a_row72_col3, #T_0502a_row72_col4, #T_0502a_row72_col5, #T_0502a_row72_col6, #T_0502a_row72_col7, #T_0502a_row72_col8, #T_0502a_row73_col0, #T_0502a_row73_col1, #T_0502a_row73_col2, #T_0502a_row73_col3, #T_0502a_row73_col4, #T_0502a_row73_col5, #T_0502a_row73_col6, #T_0502a_row73_col7, #T_0502a_row73_col8, #T_0502a_row74_col0, #T_0502a_row74_col1, #T_0502a_row74_col2, #T_0502a_row74_col3, #T_0502a_row74_col4, #T_0502a_row74_col5, #T_0502a_row74_col6, #T_0502a_row74_col7, #T_0502a_row74_col8, #T_0502a_row75_col0, #T_0502a_row75_col1, #T_0502a_row75_col2, #T_0502a_row75_col3, #T_0502a_row75_col4, #T_0502a_row75_col5, #T_0502a_row75_col6, #T_0502a_row75_col7, #T_0502a_row75_col8, #T_0502a_row76_col0, #T_0502a_row76_col1, #T_0502a_row76_col2, #T_0502a_row76_col3, #T_0502a_row76_col4, #T_0502a_row76_col5, #T_0502a_row76_col6, #T_0502a_row76_col7, #T_0502a_row76_col8, #T_0502a_row77_col0, #T_0502a_row77_col1, #T_0502a_row77_col2, #T_0502a_row77_col3, #T_0502a_row77_col4, #T_0502a_row77_col5, #T_0502a_row77_col6, #T_0502a_row77_col7, #T_0502a_row77_col8, #T_0502a_row78_col0, #T_0502a_row78_col1, #T_0502a_row78_col2, #T_0502a_row78_col3, #T_0502a_row78_col4, #T_0502a_row78_col5, #T_0502a_row78_col6, #T_0502a_row78_col7, #T_0502a_row78_col8, #T_0502a_row79_col0, #T_0502a_row79_col1, #T_0502a_row79_col2, #T_0502a_row79_col3, #T_0502a_row79_col4, #T_0502a_row79_col5, #T_0502a_row79_col6, #T_0502a_row79_col7, #T_0502a_row79_col8, #T_0502a_row80_col0, #T_0502a_row80_col1, 
#T_0502a_row80_col2, #T_0502a_row80_col3, #T_0502a_row80_col4, #T_0502a_row80_col5, #T_0502a_row80_col6, #T_0502a_row80_col7, #T_0502a_row80_col8, #T_0502a_row81_col0, #T_0502a_row81_col1, #T_0502a_row81_col2, #T_0502a_row81_col3, #T_0502a_row81_col4, #T_0502a_row81_col5, #T_0502a_row81_col6, #T_0502a_row81_col7, #T_0502a_row81_col8, #T_0502a_row82_col0, #T_0502a_row82_col1, #T_0502a_row82_col2, #T_0502a_row82_col3, #T_0502a_row82_col4, #T_0502a_row82_col5, #T_0502a_row82_col6, #T_0502a_row82_col7, #T_0502a_row82_col8, #T_0502a_row83_col0, #T_0502a_row83_col1, #T_0502a_row83_col2, #T_0502a_row83_col3, #T_0502a_row83_col4, #T_0502a_row83_col5, #T_0502a_row83_col6, #T_0502a_row83_col7, #T_0502a_row83_col8, #T_0502a_row84_col0, #T_0502a_row84_col1, #T_0502a_row84_col2, #T_0502a_row84_col3, #T_0502a_row84_col4, #T_0502a_row84_col5, #T_0502a_row84_col6, #T_0502a_row84_col7, #T_0502a_row84_col8, #T_0502a_row85_col0, #T_0502a_row85_col1, #T_0502a_row85_col2, #T_0502a_row85_col3, #T_0502a_row85_col4, #T_0502a_row85_col5, #T_0502a_row85_col6, #T_0502a_row85_col7, #T_0502a_row85_col8, #T_0502a_row86_col0, #T_0502a_row86_col1, #T_0502a_row86_col2, #T_0502a_row86_col3, #T_0502a_row86_col4, #T_0502a_row86_col5, #T_0502a_row86_col6, #T_0502a_row86_col7, #T_0502a_row86_col8, #T_0502a_row87_col0, #T_0502a_row87_col1, #T_0502a_row87_col2, #T_0502a_row87_col3, #T_0502a_row87_col4, #T_0502a_row87_col5, #T_0502a_row87_col6, #T_0502a_row87_col7, #T_0502a_row87_col8, #T_0502a_row88_col0, #T_0502a_row88_col1, #T_0502a_row88_col2, #T_0502a_row88_col3, #T_0502a_row88_col4, #T_0502a_row88_col5, #T_0502a_row88_col6, #T_0502a_row88_col7, #T_0502a_row88_col8, #T_0502a_row89_col0, #T_0502a_row89_col1, #T_0502a_row89_col2, #T_0502a_row89_col3, #T_0502a_row89_col4, #T_0502a_row89_col5, #T_0502a_row89_col6, #T_0502a_row89_col7, #T_0502a_row89_col8, #T_0502a_row90_col0, #T_0502a_row90_col1, #T_0502a_row90_col2, #T_0502a_row90_col3, #T_0502a_row90_col4, #T_0502a_row90_col5, #T_0502a_row90_col6, 
#T_0502a_row90_col7, #T_0502a_row90_col8, #T_0502a_row91_col0, #T_0502a_row91_col1, #T_0502a_row91_col2, #T_0502a_row91_col3, #T_0502a_row91_col4, #T_0502a_row91_col5, #T_0502a_row91_col6, #T_0502a_row91_col7, #T_0502a_row91_col8, #T_0502a_row92_col0, #T_0502a_row92_col1, #T_0502a_row92_col2, #T_0502a_row92_col3, #T_0502a_row92_col4, #T_0502a_row92_col5, #T_0502a_row92_col6, #T_0502a_row92_col7, #T_0502a_row92_col8, #T_0502a_row93_col0, #T_0502a_row93_col1, #T_0502a_row93_col2, #T_0502a_row93_col3, #T_0502a_row93_col4, #T_0502a_row93_col5, #T_0502a_row93_col6, #T_0502a_row93_col7, #T_0502a_row93_col8, #T_0502a_row94_col0, #T_0502a_row94_col1, #T_0502a_row94_col2, #T_0502a_row94_col3, #T_0502a_row94_col4, #T_0502a_row94_col5, #T_0502a_row94_col6, #T_0502a_row94_col7, #T_0502a_row94_col8, #T_0502a_row95_col0, #T_0502a_row95_col1, #T_0502a_row95_col2, #T_0502a_row95_col3, #T_0502a_row95_col4, #T_0502a_row95_col5, #T_0502a_row95_col6, #T_0502a_row95_col7, #T_0502a_row95_col8, #T_0502a_row96_col0, #T_0502a_row96_col1, #T_0502a_row96_col2, #T_0502a_row96_col3, #T_0502a_row96_col4, #T_0502a_row96_col5, #T_0502a_row96_col6, #T_0502a_row96_col7, #T_0502a_row96_col8, #T_0502a_row97_col0, #T_0502a_row97_col1, #T_0502a_row97_col2, #T_0502a_row97_col3, #T_0502a_row97_col4, #T_0502a_row97_col5, #T_0502a_row97_col6, #T_0502a_row97_col7, #T_0502a_row97_col8, #T_0502a_row98_col0, #T_0502a_row98_col1, #T_0502a_row98_col2, #T_0502a_row98_col3, #T_0502a_row98_col4, #T_0502a_row98_col5, #T_0502a_row98_col6, #T_0502a_row98_col7, #T_0502a_row98_col8, #T_0502a_row99_col0, #T_0502a_row99_col1, #T_0502a_row99_col2, #T_0502a_row99_col3, #T_0502a_row99_col4, #T_0502a_row99_col5, #T_0502a_row99_col6, #T_0502a_row99_col7, #T_0502a_row99_col8, #T_0502a_row100_col0, #T_0502a_row100_col1, #T_0502a_row100_col2, #T_0502a_row100_col3, #T_0502a_row100_col4, #T_0502a_row100_col5, #T_0502a_row100_col6, #T_0502a_row100_col7, #T_0502a_row100_col8, #T_0502a_row101_col0, #T_0502a_row101_col1, 
#T_0502a_row101_col2, #T_0502a_row101_col3, #T_0502a_row101_col4, #T_0502a_row101_col5, #T_0502a_row101_col6, #T_0502a_row101_col7, #T_0502a_row101_col8, #T_0502a_row102_col0, #T_0502a_row102_col1, #T_0502a_row102_col2, #T_0502a_row102_col3, #T_0502a_row102_col4, #T_0502a_row102_col5, #T_0502a_row102_col6, #T_0502a_row102_col7, #T_0502a_row102_col8, #T_0502a_row103_col0, #T_0502a_row103_col1, #T_0502a_row103_col2, #T_0502a_row103_col3, #T_0502a_row103_col4, #T_0502a_row103_col5, #T_0502a_row103_col6, #T_0502a_row103_col7, #T_0502a_row103_col8, #T_0502a_row104_col0, #T_0502a_row104_col1, #T_0502a_row104_col2, #T_0502a_row104_col3, #T_0502a_row104_col4, #T_0502a_row104_col5, #T_0502a_row104_col6, #T_0502a_row104_col7, #T_0502a_row104_col8, #T_0502a_row105_col0, #T_0502a_row105_col1, #T_0502a_row105_col2, #T_0502a_row105_col3, #T_0502a_row105_col4, #T_0502a_row105_col5, #T_0502a_row105_col6, #T_0502a_row105_col7, #T_0502a_row105_col8, #T_0502a_row106_col0, #T_0502a_row106_col1, #T_0502a_row106_col2, #T_0502a_row106_col3, #T_0502a_row106_col4, #T_0502a_row106_col5, #T_0502a_row106_col6, #T_0502a_row106_col7, #T_0502a_row106_col8, #T_0502a_row107_col0, #T_0502a_row107_col1, #T_0502a_row107_col2, #T_0502a_row107_col3, #T_0502a_row107_col4, #T_0502a_row107_col5, #T_0502a_row107_col6, #T_0502a_row107_col7, #T_0502a_row107_col8, #T_0502a_row108_col0, #T_0502a_row108_col1, #T_0502a_row108_col2, #T_0502a_row108_col3, #T_0502a_row108_col4, #T_0502a_row108_col5, #T_0502a_row108_col6, #T_0502a_row108_col7, #T_0502a_row108_col8, #T_0502a_row109_col0, #T_0502a_row109_col1, #T_0502a_row109_col2, #T_0502a_row109_col3, #T_0502a_row109_col4, #T_0502a_row109_col5, #T_0502a_row109_col6, #T_0502a_row109_col7, #T_0502a_row109_col8, #T_0502a_row110_col0, #T_0502a_row110_col1, #T_0502a_row110_col2, #T_0502a_row110_col3, #T_0502a_row110_col4, #T_0502a_row110_col5, #T_0502a_row110_col6, #T_0502a_row110_col7, #T_0502a_row110_col8, #T_0502a_row111_col0, #T_0502a_row111_col1, 
#T_0502a_row111_col2, #T_0502a_row111_col3, #T_0502a_row111_col4, #T_0502a_row111_col5, #T_0502a_row111_col6, #T_0502a_row111_col7, #T_0502a_row111_col8, #T_0502a_row112_col0, #T_0502a_row112_col1, #T_0502a_row112_col2, #T_0502a_row112_col3, #T_0502a_row112_col4, #T_0502a_row112_col5, #T_0502a_row112_col6, #T_0502a_row112_col7, #T_0502a_row112_col8, #T_0502a_row113_col0, #T_0502a_row113_col1, #T_0502a_row113_col2, #T_0502a_row113_col3, #T_0502a_row113_col4, #T_0502a_row113_col5, #T_0502a_row113_col6, #T_0502a_row113_col7, #T_0502a_row113_col8, #T_0502a_row114_col0, #T_0502a_row114_col1, #T_0502a_row114_col2, #T_0502a_row114_col3, #T_0502a_row114_col4, #T_0502a_row114_col5, #T_0502a_row114_col6, #T_0502a_row114_col7, #T_0502a_row114_col8, #T_0502a_row115_col0, #T_0502a_row115_col1, #T_0502a_row115_col2, #T_0502a_row115_col3, #T_0502a_row115_col4, #T_0502a_row115_col5, #T_0502a_row115_col6, #T_0502a_row115_col7, #T_0502a_row115_col8, #T_0502a_row116_col0, #T_0502a_row116_col1, #T_0502a_row116_col2, #T_0502a_row116_col3, #T_0502a_row116_col4, #T_0502a_row116_col5, #T_0502a_row116_col6, #T_0502a_row116_col7, #T_0502a_row116_col8, #T_0502a_row117_col0, #T_0502a_row117_col1, #T_0502a_row117_col2, #T_0502a_row117_col3, #T_0502a_row117_col4, #T_0502a_row117_col5, #T_0502a_row117_col6, #T_0502a_row117_col7, #T_0502a_row117_col8, #T_0502a_row118_col0, #T_0502a_row118_col1, #T_0502a_row118_col2, #T_0502a_row118_col3, #T_0502a_row118_col4, #T_0502a_row118_col5, #T_0502a_row118_col6, #T_0502a_row118_col7, #T_0502a_row118_col8, #T_0502a_row119_col0, #T_0502a_row119_col1, #T_0502a_row119_col2, #T_0502a_row119_col3, #T_0502a_row119_col4, #T_0502a_row119_col5, #T_0502a_row119_col6, #T_0502a_row119_col7, #T_0502a_row119_col8, #T_0502a_row120_col0, #T_0502a_row120_col1, #T_0502a_row120_col2, #T_0502a_row120_col3, #T_0502a_row120_col4, #T_0502a_row120_col5, #T_0502a_row120_col6, #T_0502a_row120_col7, #T_0502a_row120_col8, #T_0502a_row121_col0, #T_0502a_row121_col1, 
#T_0502a_row121_col2, #T_0502a_row121_col3, #T_0502a_row121_col4, #T_0502a_row121_col5, #T_0502a_row121_col6, #T_0502a_row121_col7, #T_0502a_row121_col8, #T_0502a_row122_col0, #T_0502a_row122_col1, #T_0502a_row122_col2, #T_0502a_row122_col3, #T_0502a_row122_col4, #T_0502a_row122_col5, #T_0502a_row122_col6, #T_0502a_row122_col7, #T_0502a_row122_col8, #T_0502a_row123_col0, #T_0502a_row123_col1, #T_0502a_row123_col2, #T_0502a_row123_col3, #T_0502a_row123_col4, #T_0502a_row123_col5, #T_0502a_row123_col6, #T_0502a_row123_col7, #T_0502a_row123_col8, #T_0502a_row124_col0, #T_0502a_row124_col1, #T_0502a_row124_col2, #T_0502a_row124_col3, #T_0502a_row124_col4, #T_0502a_row124_col5, #T_0502a_row124_col6, #T_0502a_row124_col7, #T_0502a_row124_col8, #T_0502a_row125_col0, #T_0502a_row125_col1, #T_0502a_row125_col2, #T_0502a_row125_col3, #T_0502a_row125_col4, #T_0502a_row125_col5, #T_0502a_row125_col6, #T_0502a_row125_col7, #T_0502a_row125_col8, #T_0502a_row126_col0, #T_0502a_row126_col1, #T_0502a_row126_col2, #T_0502a_row126_col3, #T_0502a_row126_col4, #T_0502a_row126_col5, #T_0502a_row126_col6, #T_0502a_row126_col7, #T_0502a_row126_col8, #T_0502a_row127_col0, #T_0502a_row127_col1, #T_0502a_row127_col2, #T_0502a_row127_col3, #T_0502a_row127_col4, #T_0502a_row127_col5, #T_0502a_row127_col6, #T_0502a_row127_col7, #T_0502a_row127_col8, #T_0502a_row128_col0, #T_0502a_row128_col1, #T_0502a_row128_col2, #T_0502a_row128_col3, #T_0502a_row128_col4, #T_0502a_row128_col5, #T_0502a_row128_col6, #T_0502a_row128_col7, #T_0502a_row128_col8, #T_0502a_row129_col0, #T_0502a_row129_col1, #T_0502a_row129_col2, #T_0502a_row129_col3, #T_0502a_row129_col4, #T_0502a_row129_col5, #T_0502a_row129_col6, #T_0502a_row129_col7, #T_0502a_row129_col8, #T_0502a_row130_col0, #T_0502a_row130_col1, #T_0502a_row130_col2, #T_0502a_row130_col3, #T_0502a_row130_col4, #T_0502a_row130_col5, #T_0502a_row130_col6, #T_0502a_row130_col7, #T_0502a_row130_col8, #T_0502a_row131_col0, #T_0502a_row131_col1, 
#T_0502a_row131_col2, #T_0502a_row131_col3, #T_0502a_row131_col4, #T_0502a_row131_col5, #T_0502a_row131_col6, #T_0502a_row131_col7, #T_0502a_row131_col8, #T_0502a_row132_col0, #T_0502a_row132_col1, #T_0502a_row132_col2, #T_0502a_row132_col3, #T_0502a_row132_col4, #T_0502a_row132_col5, #T_0502a_row132_col6, #T_0502a_row132_col7, #T_0502a_row132_col8, #T_0502a_row133_col0, #T_0502a_row133_col1, #T_0502a_row133_col2, #T_0502a_row133_col3, #T_0502a_row133_col4, #T_0502a_row133_col5, #T_0502a_row133_col6, #T_0502a_row133_col7, #T_0502a_row133_col8, #T_0502a_row134_col0, #T_0502a_row134_col1, #T_0502a_row134_col2, #T_0502a_row134_col3, #T_0502a_row134_col4, #T_0502a_row134_col5, #T_0502a_row134_col6, #T_0502a_row134_col7, #T_0502a_row134_col8, #T_0502a_row135_col0, #T_0502a_row135_col1, #T_0502a_row135_col2, #T_0502a_row135_col3, #T_0502a_row135_col4, #T_0502a_row135_col5, #T_0502a_row135_col6, #T_0502a_row135_col7, #T_0502a_row135_col8, #T_0502a_row136_col0, #T_0502a_row136_col1, #T_0502a_row136_col2, #T_0502a_row136_col3, #T_0502a_row136_col4, #T_0502a_row136_col5, #T_0502a_row136_col6, #T_0502a_row136_col7, #T_0502a_row136_col8, #T_0502a_row137_col0, #T_0502a_row137_col1, #T_0502a_row137_col2, #T_0502a_row137_col3, #T_0502a_row137_col4, #T_0502a_row137_col5, #T_0502a_row137_col6, #T_0502a_row137_col7, #T_0502a_row137_col8, #T_0502a_row138_col0, #T_0502a_row138_col1, #T_0502a_row138_col2, #T_0502a_row138_col3, #T_0502a_row138_col4, #T_0502a_row138_col5, #T_0502a_row138_col6, #T_0502a_row138_col7, #T_0502a_row138_col8, #T_0502a_row139_col0, #T_0502a_row139_col1, #T_0502a_row139_col2, #T_0502a_row139_col3, #T_0502a_row139_col4, #T_0502a_row139_col5, #T_0502a_row139_col6, #T_0502a_row139_col7, #T_0502a_row139_col8, #T_0502a_row140_col0, #T_0502a_row140_col1, #T_0502a_row140_col2, #T_0502a_row140_col3, #T_0502a_row140_col4, #T_0502a_row140_col5, #T_0502a_row140_col6, #T_0502a_row140_col7, #T_0502a_row140_col8, #T_0502a_row141_col0, #T_0502a_row141_col1, 
#T_0502a_row141_col2, #T_0502a_row141_col3, #T_0502a_row141_col4, #T_0502a_row141_col5, #T_0502a_row141_col6, #T_0502a_row141_col7, #T_0502a_row141_col8, #T_0502a_row142_col0, #T_0502a_row142_col1, #T_0502a_row142_col2, #T_0502a_row142_col3, #T_0502a_row142_col4, #T_0502a_row142_col5, #T_0502a_row142_col6, #T_0502a_row142_col7, #T_0502a_row142_col8, #T_0502a_row143_col0, #T_0502a_row143_col1, #T_0502a_row143_col2, #T_0502a_row143_col3, #T_0502a_row143_col4, #T_0502a_row143_col5, #T_0502a_row143_col6, #T_0502a_row143_col7, #T_0502a_row143_col8, #T_0502a_row144_col0, #T_0502a_row144_col1, #T_0502a_row144_col2, #T_0502a_row144_col3, #T_0502a_row144_col4, #T_0502a_row144_col5, #T_0502a_row144_col6, #T_0502a_row144_col7, #T_0502a_row144_col8, #T_0502a_row145_col0, #T_0502a_row145_col1, #T_0502a_row145_col2, #T_0502a_row145_col3, #T_0502a_row145_col4, #T_0502a_row145_col5, #T_0502a_row145_col6, #T_0502a_row145_col7, #T_0502a_row145_col8, #T_0502a_row146_col0, #T_0502a_row146_col1, #T_0502a_row146_col2, #T_0502a_row146_col3, #T_0502a_row146_col4, #T_0502a_row146_col5, #T_0502a_row146_col6, #T_0502a_row146_col7, #T_0502a_row146_col8, #T_0502a_row147_col0, #T_0502a_row147_col1, #T_0502a_row147_col2, #T_0502a_row147_col3, #T_0502a_row147_col4, #T_0502a_row147_col5, #T_0502a_row147_col6, #T_0502a_row147_col7, #T_0502a_row147_col8, #T_0502a_row148_col0, #T_0502a_row148_col1, #T_0502a_row148_col2, #T_0502a_row148_col3, #T_0502a_row148_col4, #T_0502a_row148_col5, #T_0502a_row148_col6, #T_0502a_row148_col7, #T_0502a_row148_col8, #T_0502a_row149_col0, #T_0502a_row149_col1, #T_0502a_row149_col2, #T_0502a_row149_col3, #T_0502a_row149_col4, #T_0502a_row149_col5, #T_0502a_row149_col6, #T_0502a_row149_col7, #T_0502a_row149_col8, #T_0502a_row150_col0, #T_0502a_row150_col1, #T_0502a_row150_col2, #T_0502a_row150_col3, #T_0502a_row150_col4, #T_0502a_row150_col5, #T_0502a_row150_col6, #T_0502a_row150_col7, #T_0502a_row150_col8, #T_0502a_row151_col0, #T_0502a_row151_col1, 
#T_0502a_row151_col2, #T_0502a_row151_col3, #T_0502a_row151_col4, #T_0502a_row151_col5, #T_0502a_row151_col6, #T_0502a_row151_col7, #T_0502a_row151_col8, #T_0502a_row152_col0, #T_0502a_row152_col1, #T_0502a_row152_col2, #T_0502a_row152_col3, #T_0502a_row152_col4, #T_0502a_row152_col5, #T_0502a_row152_col6, #T_0502a_row152_col7, #T_0502a_row152_col8, #T_0502a_row153_col0, #T_0502a_row153_col1, #T_0502a_row153_col2, #T_0502a_row153_col3, #T_0502a_row153_col4, #T_0502a_row153_col5, #T_0502a_row153_col6, #T_0502a_row153_col7, #T_0502a_row153_col8, #T_0502a_row154_col0, #T_0502a_row154_col1, #T_0502a_row154_col2, #T_0502a_row154_col3, #T_0502a_row154_col4, #T_0502a_row154_col5, #T_0502a_row154_col6, #T_0502a_row154_col7, #T_0502a_row154_col8, #T_0502a_row155_col0, #T_0502a_row155_col1, #T_0502a_row155_col2, #T_0502a_row155_col3, #T_0502a_row155_col4, #T_0502a_row155_col5, #T_0502a_row155_col6, #T_0502a_row155_col7, #T_0502a_row155_col8, #T_0502a_row156_col0, #T_0502a_row156_col1, #T_0502a_row156_col2, #T_0502a_row156_col3, #T_0502a_row156_col4, #T_0502a_row156_col5, #T_0502a_row156_col6, #T_0502a_row156_col7, #T_0502a_row156_col8, #T_0502a_row157_col0, #T_0502a_row157_col1, #T_0502a_row157_col2, #T_0502a_row157_col3, #T_0502a_row157_col4, #T_0502a_row157_col5, #T_0502a_row157_col6, #T_0502a_row157_col7, #T_0502a_row157_col8, #T_0502a_row158_col0, #T_0502a_row158_col1, #T_0502a_row158_col2, #T_0502a_row158_col3, #T_0502a_row158_col4, #T_0502a_row158_col5, #T_0502a_row158_col6, #T_0502a_row158_col7, #T_0502a_row158_col8, #T_0502a_row159_col0, #T_0502a_row159_col1, #T_0502a_row159_col2, #T_0502a_row159_col3, #T_0502a_row159_col4, #T_0502a_row159_col5, #T_0502a_row159_col6, #T_0502a_row159_col7, #T_0502a_row159_col8, #T_0502a_row160_col0, #T_0502a_row160_col1, #T_0502a_row160_col2, #T_0502a_row160_col3, #T_0502a_row160_col4, #T_0502a_row160_col5, #T_0502a_row160_col6, #T_0502a_row160_col7, #T_0502a_row160_col8, #T_0502a_row161_col0, #T_0502a_row161_col1, 
#T_0502a_row161_col2, #T_0502a_row161_col3, #T_0502a_row161_col4, #T_0502a_row161_col5, #T_0502a_row161_col6, #T_0502a_row161_col7, #T_0502a_row161_col8, #T_0502a_row162_col0, #T_0502a_row162_col1, #T_0502a_row162_col2, #T_0502a_row162_col3, #T_0502a_row162_col4, #T_0502a_row162_col5, #T_0502a_row162_col6, #T_0502a_row162_col7, #T_0502a_row162_col8, #T_0502a_row163_col0, #T_0502a_row163_col1, #T_0502a_row163_col2, #T_0502a_row163_col3, #T_0502a_row163_col4, #T_0502a_row163_col5, #T_0502a_row163_col6, #T_0502a_row163_col7, #T_0502a_row163_col8, #T_0502a_row164_col0, #T_0502a_row164_col1, #T_0502a_row164_col2, #T_0502a_row164_col3, #T_0502a_row164_col4, #T_0502a_row164_col5, #T_0502a_row164_col6, #T_0502a_row164_col7, #T_0502a_row164_col8, #T_0502a_row165_col0, #T_0502a_row165_col1, #T_0502a_row165_col2, #T_0502a_row165_col3, #T_0502a_row165_col4, #T_0502a_row165_col5, #T_0502a_row165_col6, #T_0502a_row165_col7, #T_0502a_row165_col8, #T_0502a_row166_col0, #T_0502a_row166_col1, #T_0502a_row166_col2, #T_0502a_row166_col3, #T_0502a_row166_col4, #T_0502a_row166_col5, #T_0502a_row166_col6, #T_0502a_row166_col7, #T_0502a_row166_col8, #T_0502a_row167_col0, #T_0502a_row167_col1, #T_0502a_row167_col2, #T_0502a_row167_col3, #T_0502a_row167_col4, #T_0502a_row167_col5, #T_0502a_row167_col6, #T_0502a_row167_col7, #T_0502a_row167_col8, #T_0502a_row168_col0, #T_0502a_row168_col1, #T_0502a_row168_col2, #T_0502a_row168_col3, #T_0502a_row168_col4, #T_0502a_row168_col5, #T_0502a_row168_col6, #T_0502a_row168_col7, #T_0502a_row168_col8, #T_0502a_row169_col0, #T_0502a_row169_col1, #T_0502a_row169_col2, #T_0502a_row169_col3, #T_0502a_row169_col4, #T_0502a_row169_col5, #T_0502a_row169_col6, #T_0502a_row169_col7, #T_0502a_row169_col8, #T_0502a_row170_col0, #T_0502a_row170_col1, #T_0502a_row170_col2, #T_0502a_row170_col3, #T_0502a_row170_col4, #T_0502a_row170_col5, #T_0502a_row170_col6, #T_0502a_row170_col7, #T_0502a_row170_col8, #T_0502a_row171_col0, #T_0502a_row171_col1, 
#T_0502a_row171_col2, #T_0502a_row171_col3, #T_0502a_row171_col4, #T_0502a_row171_col5, #T_0502a_row171_col6, #T_0502a_row171_col7, #T_0502a_row171_col8, #T_0502a_row172_col0, #T_0502a_row172_col1, #T_0502a_row172_col2, #T_0502a_row172_col3, #T_0502a_row172_col4, #T_0502a_row172_col5, #T_0502a_row172_col6, #T_0502a_row172_col7, #T_0502a_row172_col8, #T_0502a_row173_col0, #T_0502a_row173_col1, #T_0502a_row173_col2, #T_0502a_row173_col3, #T_0502a_row173_col4, #T_0502a_row173_col5, #T_0502a_row173_col6, #T_0502a_row173_col7, #T_0502a_row173_col8, #T_0502a_row174_col0, #T_0502a_row174_col1, #T_0502a_row174_col2, #T_0502a_row174_col3, #T_0502a_row174_col4, #T_0502a_row174_col5, #T_0502a_row174_col6, #T_0502a_row174_col7, #T_0502a_row174_col8, #T_0502a_row175_col0, #T_0502a_row175_col1, #T_0502a_row175_col2, #T_0502a_row175_col3, #T_0502a_row175_col4, #T_0502a_row175_col5, #T_0502a_row175_col6, #T_0502a_row175_col7, #T_0502a_row175_col8, #T_0502a_row176_col0, #T_0502a_row176_col1, #T_0502a_row176_col2, #T_0502a_row176_col3, #T_0502a_row176_col4, #T_0502a_row176_col5, #T_0502a_row176_col6, #T_0502a_row176_col7, #T_0502a_row176_col8, #T_0502a_row177_col0, #T_0502a_row177_col1, #T_0502a_row177_col2, #T_0502a_row177_col3, #T_0502a_row177_col4, #T_0502a_row177_col5, #T_0502a_row177_col6, #T_0502a_row177_col7, #T_0502a_row177_col8, #T_0502a_row178_col0, #T_0502a_row178_col1, #T_0502a_row178_col2, #T_0502a_row178_col3, #T_0502a_row178_col4, #T_0502a_row178_col5, #T_0502a_row178_col6, #T_0502a_row178_col7, #T_0502a_row178_col8, #T_0502a_row179_col0, #T_0502a_row179_col1, #T_0502a_row179_col2, #T_0502a_row179_col3, #T_0502a_row179_col4, #T_0502a_row179_col5, #T_0502a_row179_col6, #T_0502a_row179_col7, #T_0502a_row179_col8, #T_0502a_row180_col0, #T_0502a_row180_col1, #T_0502a_row180_col2, #T_0502a_row180_col3, #T_0502a_row180_col4, #T_0502a_row180_col5, #T_0502a_row180_col6, #T_0502a_row180_col7, #T_0502a_row180_col8, #T_0502a_row181_col0, #T_0502a_row181_col1, 
#T_0502a_row181_col2, #T_0502a_row181_col3, #T_0502a_row181_col4, #T_0502a_row181_col5, #T_0502a_row181_col6, #T_0502a_row181_col7, #T_0502a_row181_col8, #T_0502a_row182_col0, #T_0502a_row182_col1, #T_0502a_row182_col2, #T_0502a_row182_col3, #T_0502a_row182_col4, #T_0502a_row182_col5, #T_0502a_row182_col6, #T_0502a_row182_col7, #T_0502a_row182_col8, #T_0502a_row183_col0, #T_0502a_row183_col1, #T_0502a_row183_col2, #T_0502a_row183_col3, #T_0502a_row183_col4, #T_0502a_row183_col5, #T_0502a_row183_col6, #T_0502a_row183_col7, #T_0502a_row183_col8, #T_0502a_row184_col0, #T_0502a_row184_col1, #T_0502a_row184_col2, #T_0502a_row184_col3, #T_0502a_row184_col4, #T_0502a_row184_col5, #T_0502a_row184_col6, #T_0502a_row184_col7, #T_0502a_row184_col8, #T_0502a_row185_col0, #T_0502a_row185_col1, #T_0502a_row185_col2, #T_0502a_row185_col3, #T_0502a_row185_col4, #T_0502a_row185_col5, #T_0502a_row185_col6, #T_0502a_row185_col7, #T_0502a_row185_col8, #T_0502a_row186_col0, #T_0502a_row186_col1, #T_0502a_row186_col2, #T_0502a_row186_col3, #T_0502a_row186_col4, #T_0502a_row186_col5, #T_0502a_row186_col6, #T_0502a_row186_col7, #T_0502a_row186_col8, #T_0502a_row187_col0, #T_0502a_row187_col1, #T_0502a_row187_col2, #T_0502a_row187_col3, #T_0502a_row187_col4, #T_0502a_row187_col5, #T_0502a_row187_col6, #T_0502a_row187_col7, #T_0502a_row187_col8, #T_0502a_row188_col0, #T_0502a_row188_col1, #T_0502a_row188_col2, #T_0502a_row188_col3, #T_0502a_row188_col4, #T_0502a_row188_col5, #T_0502a_row188_col6, #T_0502a_row188_col7, #T_0502a_row188_col8, #T_0502a_row189_col0, #T_0502a_row189_col1, #T_0502a_row189_col2, #T_0502a_row189_col3, #T_0502a_row189_col4, #T_0502a_row189_col5, #T_0502a_row189_col6, #T_0502a_row189_col7, #T_0502a_row189_col8, #T_0502a_row190_col0, #T_0502a_row190_col1, #T_0502a_row190_col2, #T_0502a_row190_col3, #T_0502a_row190_col4, #T_0502a_row190_col5, #T_0502a_row190_col6, #T_0502a_row190_col7, #T_0502a_row190_col8, #T_0502a_row191_col0, #T_0502a_row191_col1, 
#T_0502a_row191_col2, #T_0502a_row191_col3, #T_0502a_row191_col4, #T_0502a_row191_col5, #T_0502a_row191_col6, #T_0502a_row191_col7, #T_0502a_row191_col8, #T_0502a_row192_col0, #T_0502a_row192_col1, #T_0502a_row192_col2, #T_0502a_row192_col3, #T_0502a_row192_col4, #T_0502a_row192_col5, #T_0502a_row192_col6, #T_0502a_row192_col7, #T_0502a_row192_col8, #T_0502a_row193_col0, #T_0502a_row193_col1, #T_0502a_row193_col2, #T_0502a_row193_col3, #T_0502a_row193_col4, #T_0502a_row193_col5, #T_0502a_row193_col6, #T_0502a_row193_col7, #T_0502a_row193_col8, #T_0502a_row194_col0, #T_0502a_row194_col1, #T_0502a_row194_col2, #T_0502a_row194_col3, #T_0502a_row194_col4, #T_0502a_row194_col5, #T_0502a_row194_col6, #T_0502a_row194_col7, #T_0502a_row194_col8 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_0502a\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_0502a_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_0502a_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", - " <th id=\"T_0502a_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_0502a_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", - " <th id=\"T_0502a_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", - " <th id=\"T_0502a_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", - " <th id=\"T_0502a_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", - " <th id=\"T_0502a_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", - " <th id=\"T_0502a_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_0502a_row0_col0\" class=\"data row0 col0\" >validmind.data_validation.ACFandPACFPlot</td>\n", - " <td id=\"T_0502a_row0_col1\" class=\"data row0 col1\" >AC Fand PACF Plot</td>\n", - " <td id=\"T_0502a_row0_col2\" class=\"data row0 col2\" >Analyzes time series data using 
Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots to...</td>\n", - " <td id=\"T_0502a_row0_col3\" class=\"data row0 col3\" >True</td>\n", - " <td id=\"T_0502a_row0_col4\" class=\"data row0 col4\" >False</td>\n", - " <td id=\"T_0502a_row0_col5\" class=\"data row0 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row0_col6\" class=\"data row0 col6\" >{}</td>\n", - " <td id=\"T_0502a_row0_col7\" class=\"data row0 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'visualization']</td>\n", - " <td id=\"T_0502a_row0_col8\" class=\"data row0 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row1_col0\" class=\"data row1 col0\" >validmind.data_validation.ADF</td>\n", - " <td id=\"T_0502a_row1_col1\" class=\"data row1 col1\" >ADF</td>\n", - " <td id=\"T_0502a_row1_col2\" class=\"data row1 col2\" >Assesses the stationarity of a time series dataset using the Augmented Dickey-Fuller (ADF) test....</td>\n", - " <td id=\"T_0502a_row1_col3\" class=\"data row1 col3\" >False</td>\n", - " <td id=\"T_0502a_row1_col4\" class=\"data row1 col4\" >True</td>\n", - " <td id=\"T_0502a_row1_col5\" class=\"data row1 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row1_col6\" class=\"data row1 col6\" >{}</td>\n", - " <td id=\"T_0502a_row1_col7\" class=\"data row1 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test', 'stationarity']</td>\n", - " <td id=\"T_0502a_row1_col8\" class=\"data row1 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row2_col0\" class=\"data row2 col0\" >validmind.data_validation.AutoAR</td>\n", - " <td id=\"T_0502a_row2_col1\" class=\"data row2 col1\" >Auto AR</td>\n", - " <td id=\"T_0502a_row2_col2\" class=\"data row2 col2\" >Automatically identifies the optimal Autoregressive (AR) order for a time series using BIC and AIC criteria....</td>\n", - " <td id=\"T_0502a_row2_col3\" class=\"data row2 col3\" >False</td>\n", - " <td 
id=\"T_0502a_row2_col4\" class=\"data row2 col4\" >True</td>\n", - " <td id=\"T_0502a_row2_col5\" class=\"data row2 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row2_col6\" class=\"data row2 col6\" >{'max_ar_order': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_0502a_row2_col7\" class=\"data row2 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", - " <td id=\"T_0502a_row2_col8\" class=\"data row2 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row3_col0\" class=\"data row3 col0\" >validmind.data_validation.AutoMA</td>\n", - " <td id=\"T_0502a_row3_col1\" class=\"data row3 col1\" >Auto MA</td>\n", - " <td id=\"T_0502a_row3_col2\" class=\"data row3 col2\" >Automatically selects the optimal Moving Average (MA) order for each variable in a time series dataset based on...</td>\n", - " <td id=\"T_0502a_row3_col3\" class=\"data row3 col3\" >False</td>\n", - " <td id=\"T_0502a_row3_col4\" class=\"data row3 col4\" >True</td>\n", - " <td id=\"T_0502a_row3_col5\" class=\"data row3 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row3_col6\" class=\"data row3 col6\" >{'max_ma_order': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_0502a_row3_col7\" class=\"data row3 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", - " <td id=\"T_0502a_row3_col8\" class=\"data row3 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row4_col0\" class=\"data row4 col0\" >validmind.data_validation.AutoStationarity</td>\n", - " <td id=\"T_0502a_row4_col1\" class=\"data row4 col1\" >Auto Stationarity</td>\n", - " <td id=\"T_0502a_row4_col2\" class=\"data row4 col2\" >Automates Augmented Dickey-Fuller test to assess stationarity across multiple time series in a DataFrame....</td>\n", - " <td id=\"T_0502a_row4_col3\" class=\"data row4 col3\" >False</td>\n", - " <td id=\"T_0502a_row4_col4\" class=\"data row4 col4\" >True</td>\n", - " <td 
id=\"T_0502a_row4_col5\" class=\"data row4 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row4_col6\" class=\"data row4 col6\" >{'max_order': {'type': 'int', 'default': 5}, 'threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_0502a_row4_col7\" class=\"data row4 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", - " <td id=\"T_0502a_row4_col8\" class=\"data row4 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row5_col0\" class=\"data row5 col0\" >validmind.data_validation.BivariateScatterPlots</td>\n", - " <td id=\"T_0502a_row5_col1\" class=\"data row5 col1\" >Bivariate Scatter Plots</td>\n", - " <td id=\"T_0502a_row5_col2\" class=\"data row5 col2\" >Generates bivariate scatterplots to visually inspect relationships between pairs of numerical predictor variables...</td>\n", - " <td id=\"T_0502a_row5_col3\" class=\"data row5 col3\" >True</td>\n", - " <td id=\"T_0502a_row5_col4\" class=\"data row5 col4\" >False</td>\n", - " <td id=\"T_0502a_row5_col5\" class=\"data row5 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row5_col6\" class=\"data row5 col6\" >{}</td>\n", - " <td id=\"T_0502a_row5_col7\" class=\"data row5 col7\" >['tabular_data', 'numerical_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row5_col8\" class=\"data row5 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row6_col0\" class=\"data row6 col0\" >validmind.data_validation.BoxPierce</td>\n", - " <td id=\"T_0502a_row6_col1\" class=\"data row6 col1\" >Box Pierce</td>\n", - " <td id=\"T_0502a_row6_col2\" class=\"data row6 col2\" >Detects autocorrelation in time-series data through the Box-Pierce test to validate model performance....</td>\n", - " <td id=\"T_0502a_row6_col3\" class=\"data row6 col3\" >False</td>\n", - " <td id=\"T_0502a_row6_col4\" class=\"data row6 col4\" >True</td>\n", - " <td id=\"T_0502a_row6_col5\" class=\"data row6 col5\" >['dataset']</td>\n", - " <td 
id=\"T_0502a_row6_col6\" class=\"data row6 col6\" >{}</td>\n", - " <td id=\"T_0502a_row6_col7\" class=\"data row6 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row6_col8\" class=\"data row6 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row7_col0\" class=\"data row7 col0\" >validmind.data_validation.ChiSquaredFeaturesTable</td>\n", - " <td id=\"T_0502a_row7_col1\" class=\"data row7 col1\" >Chi Squared Features Table</td>\n", - " <td id=\"T_0502a_row7_col2\" class=\"data row7 col2\" >Assesses the statistical association between categorical features and a target variable using the Chi-Squared test....</td>\n", - " <td id=\"T_0502a_row7_col3\" class=\"data row7 col3\" >False</td>\n", - " <td id=\"T_0502a_row7_col4\" class=\"data row7 col4\" >True</td>\n", - " <td id=\"T_0502a_row7_col5\" class=\"data row7 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row7_col6\" class=\"data row7 col6\" >{'p_threshold': {'type': '_empty', 'default': 0.05}}</td>\n", - " <td id=\"T_0502a_row7_col7\" class=\"data row7 col7\" >['tabular_data', 'categorical_data', 'statistical_test']</td>\n", - " <td id=\"T_0502a_row7_col8\" class=\"data row7 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row8_col0\" class=\"data row8 col0\" >validmind.data_validation.ClassImbalance</td>\n", - " <td id=\"T_0502a_row8_col1\" class=\"data row8 col1\" >Class Imbalance</td>\n", - " <td id=\"T_0502a_row8_col2\" class=\"data row8 col2\" >Evaluates and quantifies class distribution imbalance in a dataset used by a machine learning model....</td>\n", - " <td id=\"T_0502a_row8_col3\" class=\"data row8 col3\" >True</td>\n", - " <td id=\"T_0502a_row8_col4\" class=\"data row8 col4\" >True</td>\n", - " <td id=\"T_0502a_row8_col5\" class=\"data row8 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row8_col6\" class=\"data row8 col6\" >{'min_percent_threshold': {'type': 'int', 
'default': 10}}</td>\n", - " <td id=\"T_0502a_row8_col7\" class=\"data row8 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification', 'data_quality']</td>\n", - " <td id=\"T_0502a_row8_col8\" class=\"data row8 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row9_col0\" class=\"data row9 col0\" >validmind.data_validation.DatasetDescription</td>\n", - " <td id=\"T_0502a_row9_col1\" class=\"data row9 col1\" >Dataset Description</td>\n", - " <td id=\"T_0502a_row9_col2\" class=\"data row9 col2\" >Provides comprehensive analysis and statistical summaries of each column in a machine learning model's dataset....</td>\n", - " <td id=\"T_0502a_row9_col3\" class=\"data row9 col3\" >False</td>\n", - " <td id=\"T_0502a_row9_col4\" class=\"data row9 col4\" >True</td>\n", - " <td id=\"T_0502a_row9_col5\" class=\"data row9 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row9_col6\" class=\"data row9 col6\" >{}</td>\n", - " <td id=\"T_0502a_row9_col7\" class=\"data row9 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", - " <td id=\"T_0502a_row9_col8\" class=\"data row9 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row10_col0\" class=\"data row10 col0\" >validmind.data_validation.DatasetSplit</td>\n", - " <td id=\"T_0502a_row10_col1\" class=\"data row10 col1\" >Dataset Split</td>\n", - " <td id=\"T_0502a_row10_col2\" class=\"data row10 col2\" >Evaluates and visualizes the distribution proportions among training, testing, and validation datasets of an ML...</td>\n", - " <td id=\"T_0502a_row10_col3\" class=\"data row10 col3\" >False</td>\n", - " <td id=\"T_0502a_row10_col4\" class=\"data row10 col4\" >True</td>\n", - " <td id=\"T_0502a_row10_col5\" class=\"data row10 col5\" >['datasets']</td>\n", - " <td id=\"T_0502a_row10_col6\" class=\"data row10 col6\" >{}</td>\n", - " <td id=\"T_0502a_row10_col7\" class=\"data 
row10 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", - " <td id=\"T_0502a_row10_col8\" class=\"data row10 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row11_col0\" class=\"data row11 col0\" >validmind.data_validation.DescriptiveStatistics</td>\n", - " <td id=\"T_0502a_row11_col1\" class=\"data row11 col1\" >Descriptive Statistics</td>\n", - " <td id=\"T_0502a_row11_col2\" class=\"data row11 col2\" >Performs a detailed descriptive statistical analysis of both numerical and categorical data within a model's...</td>\n", - " <td id=\"T_0502a_row11_col3\" class=\"data row11 col3\" >False</td>\n", - " <td id=\"T_0502a_row11_col4\" class=\"data row11 col4\" >True</td>\n", - " <td id=\"T_0502a_row11_col5\" class=\"data row11 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row11_col6\" class=\"data row11 col6\" >{}</td>\n", - " <td id=\"T_0502a_row11_col7\" class=\"data row11 col7\" >['tabular_data', 'time_series_data', 'data_quality']</td>\n", - " <td id=\"T_0502a_row11_col8\" class=\"data row11 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row12_col0\" class=\"data row12 col0\" >validmind.data_validation.DickeyFullerGLS</td>\n", - " <td id=\"T_0502a_row12_col1\" class=\"data row12 col1\" >Dickey Fuller GLS</td>\n", - " <td id=\"T_0502a_row12_col2\" class=\"data row12 col2\" >Assesses stationarity in time series data using the Dickey-Fuller GLS test to determine the order of integration....</td>\n", - " <td id=\"T_0502a_row12_col3\" class=\"data row12 col3\" >False</td>\n", - " <td id=\"T_0502a_row12_col4\" class=\"data row12 col4\" >True</td>\n", - " <td id=\"T_0502a_row12_col5\" class=\"data row12 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row12_col6\" class=\"data row12 col6\" >{}</td>\n", - " <td id=\"T_0502a_row12_col7\" class=\"data row12 col7\" >['time_series_data', 'forecasting', 
'unit_root_test']</td>\n", - " <td id=\"T_0502a_row12_col8\" class=\"data row12 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row13_col0\" class=\"data row13 col0\" >validmind.data_validation.Duplicates</td>\n", - " <td id=\"T_0502a_row13_col1\" class=\"data row13 col1\" >Duplicates</td>\n", - " <td id=\"T_0502a_row13_col2\" class=\"data row13 col2\" >Tests dataset for duplicate entries, ensuring model reliability via data quality verification....</td>\n", - " <td id=\"T_0502a_row13_col3\" class=\"data row13 col3\" >False</td>\n", - " <td id=\"T_0502a_row13_col4\" class=\"data row13 col4\" >True</td>\n", - " <td id=\"T_0502a_row13_col5\" class=\"data row13 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row13_col6\" class=\"data row13 col6\" >{'min_threshold': {'type': '_empty', 'default': 1}}</td>\n", - " <td id=\"T_0502a_row13_col7\" class=\"data row13 col7\" >['tabular_data', 'data_quality', 'text_data']</td>\n", - " <td id=\"T_0502a_row13_col8\" class=\"data row13 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row14_col0\" class=\"data row14 col0\" >validmind.data_validation.EngleGrangerCoint</td>\n", - " <td id=\"T_0502a_row14_col1\" class=\"data row14 col1\" >Engle Granger Coint</td>\n", - " <td id=\"T_0502a_row14_col2\" class=\"data row14 col2\" >Assesses the degree of co-movement between pairs of time series data using the Engle-Granger cointegration test....</td>\n", - " <td id=\"T_0502a_row14_col3\" class=\"data row14 col3\" >False</td>\n", - " <td id=\"T_0502a_row14_col4\" class=\"data row14 col4\" >True</td>\n", - " <td id=\"T_0502a_row14_col5\" class=\"data row14 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row14_col6\" class=\"data row14 col6\" >{'threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_0502a_row14_col7\" class=\"data row14 col7\" >['time_series_data', 'statistical_test', 'forecasting']</td>\n", - " <td id=\"T_0502a_row14_col8\" 
class=\"data row14 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row15_col0\" class=\"data row15 col0\" >validmind.data_validation.FeatureTargetCorrelationPlot</td>\n", - " <td id=\"T_0502a_row15_col1\" class=\"data row15 col1\" >Feature Target Correlation Plot</td>\n", - " <td id=\"T_0502a_row15_col2\" class=\"data row15 col2\" >Visualizes the correlation between input features and the model's target output in a color-coded horizontal bar...</td>\n", - " <td id=\"T_0502a_row15_col3\" class=\"data row15 col3\" >True</td>\n", - " <td id=\"T_0502a_row15_col4\" class=\"data row15 col4\" >False</td>\n", - " <td id=\"T_0502a_row15_col5\" class=\"data row15 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row15_col6\" class=\"data row15 col6\" >{'fig_height': {'type': '_empty', 'default': 600}}</td>\n", - " <td id=\"T_0502a_row15_col7\" class=\"data row15 col7\" >['tabular_data', 'visualization', 'correlation']</td>\n", - " <td id=\"T_0502a_row15_col8\" class=\"data row15 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row16_col0\" class=\"data row16 col0\" >validmind.data_validation.HighCardinality</td>\n", - " <td id=\"T_0502a_row16_col1\" class=\"data row16 col1\" >High Cardinality</td>\n", - " <td id=\"T_0502a_row16_col2\" class=\"data row16 col2\" >Assesses the number of unique values in categorical columns to detect high cardinality and potential overfitting....</td>\n", - " <td id=\"T_0502a_row16_col3\" class=\"data row16 col3\" >False</td>\n", - " <td id=\"T_0502a_row16_col4\" class=\"data row16 col4\" >True</td>\n", - " <td id=\"T_0502a_row16_col5\" class=\"data row16 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row16_col6\" class=\"data row16 col6\" >{'num_threshold': {'type': 'int', 'default': 100}, 'percent_threshold': {'type': 'float', 'default': 0.1}, 'threshold_type': {'type': 'str', 'default': 'percent'}}</td>\n", - " <td id=\"T_0502a_row16_col7\" class=\"data row16 
col7\" >['tabular_data', 'data_quality', 'categorical_data']</td>\n", - " <td id=\"T_0502a_row16_col8\" class=\"data row16 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row17_col0\" class=\"data row17 col0\" >validmind.data_validation.HighPearsonCorrelation</td>\n", - " <td id=\"T_0502a_row17_col1\" class=\"data row17 col1\" >High Pearson Correlation</td>\n", - " <td id=\"T_0502a_row17_col2\" class=\"data row17 col2\" >Identifies highly correlated feature pairs in a dataset suggesting feature redundancy or multicollinearity....</td>\n", - " <td id=\"T_0502a_row17_col3\" class=\"data row17 col3\" >False</td>\n", - " <td id=\"T_0502a_row17_col4\" class=\"data row17 col4\" >True</td>\n", - " <td id=\"T_0502a_row17_col5\" class=\"data row17 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row17_col6\" class=\"data row17 col6\" >{'max_threshold': {'type': 'float', 'default': 0.3}, 'top_n_correlations': {'type': 'int', 'default': 10}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_0502a_row17_col7\" class=\"data row17 col7\" >['tabular_data', 'data_quality', 'correlation']</td>\n", - " <td id=\"T_0502a_row17_col8\" class=\"data row17 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row18_col0\" class=\"data row18 col0\" >validmind.data_validation.IQROutliersBarPlot</td>\n", - " <td id=\"T_0502a_row18_col1\" class=\"data row18 col1\" >IQR Outliers Bar Plot</td>\n", - " <td id=\"T_0502a_row18_col2\" class=\"data row18 col2\" >Visualizes outlier distribution across percentiles in numerical data using the Interquartile Range (IQR) method....</td>\n", - " <td id=\"T_0502a_row18_col3\" class=\"data row18 col3\" >True</td>\n", - " <td id=\"T_0502a_row18_col4\" class=\"data row18 col4\" >False</td>\n", - " <td id=\"T_0502a_row18_col5\" class=\"data row18 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row18_col6\" class=\"data row18 col6\" 
>{'threshold': {'type': 'float', 'default': 1.5}, 'fig_width': {'type': 'int', 'default': 800}}</td>\n", - " <td id=\"T_0502a_row18_col7\" class=\"data row18 col7\" >['tabular_data', 'visualization', 'numerical_data']</td>\n", - " <td id=\"T_0502a_row18_col8\" class=\"data row18 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row19_col0\" class=\"data row19 col0\" >validmind.data_validation.IQROutliersTable</td>\n", - " <td id=\"T_0502a_row19_col1\" class=\"data row19 col1\" >IQR Outliers Table</td>\n", - " <td id=\"T_0502a_row19_col2\" class=\"data row19 col2\" >Determines and summarizes outliers in numerical features using the Interquartile Range method....</td>\n", - " <td id=\"T_0502a_row19_col3\" class=\"data row19 col3\" >False</td>\n", - " <td id=\"T_0502a_row19_col4\" class=\"data row19 col4\" >True</td>\n", - " <td id=\"T_0502a_row19_col5\" class=\"data row19 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row19_col6\" class=\"data row19 col6\" >{'threshold': {'type': 'float', 'default': 1.5}}</td>\n", - " <td id=\"T_0502a_row19_col7\" class=\"data row19 col7\" >['tabular_data', 'numerical_data']</td>\n", - " <td id=\"T_0502a_row19_col8\" class=\"data row19 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row20_col0\" class=\"data row20 col0\" >validmind.data_validation.IsolationForestOutliers</td>\n", - " <td id=\"T_0502a_row20_col1\" class=\"data row20 col1\" >Isolation Forest Outliers</td>\n", - " <td id=\"T_0502a_row20_col2\" class=\"data row20 col2\" >Detects outliers in a dataset using the Isolation Forest algorithm and visualizes results through scatter plots....</td>\n", - " <td id=\"T_0502a_row20_col3\" class=\"data row20 col3\" >True</td>\n", - " <td id=\"T_0502a_row20_col4\" class=\"data row20 col4\" >False</td>\n", - " <td id=\"T_0502a_row20_col5\" class=\"data row20 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row20_col6\" class=\"data 
row20 col6\" >{'random_state': {'type': 'int', 'default': 0}, 'contamination': {'type': 'float', 'default': 0.1}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_0502a_row20_col7\" class=\"data row20 col7\" >['tabular_data', 'anomaly_detection']</td>\n", - " <td id=\"T_0502a_row20_col8\" class=\"data row20 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row21_col0\" class=\"data row21 col0\" >validmind.data_validation.JarqueBera</td>\n", - " <td id=\"T_0502a_row21_col1\" class=\"data row21 col1\" >Jarque Bera</td>\n", - " <td id=\"T_0502a_row21_col2\" class=\"data row21 col2\" >Assesses normality of dataset features in an ML model using the Jarque-Bera test....</td>\n", - " <td id=\"T_0502a_row21_col3\" class=\"data row21 col3\" >False</td>\n", - " <td id=\"T_0502a_row21_col4\" class=\"data row21 col4\" >True</td>\n", - " <td id=\"T_0502a_row21_col5\" class=\"data row21 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row21_col6\" class=\"data row21 col6\" >{}</td>\n", - " <td id=\"T_0502a_row21_col7\" class=\"data row21 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row21_col8\" class=\"data row21 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row22_col0\" class=\"data row22 col0\" >validmind.data_validation.KPSS</td>\n", - " <td id=\"T_0502a_row22_col1\" class=\"data row22 col1\" >KPSS</td>\n", - " <td id=\"T_0502a_row22_col2\" class=\"data row22 col2\" >Assesses the stationarity of time-series data in a machine learning model using the KPSS unit root test....</td>\n", - " <td id=\"T_0502a_row22_col3\" class=\"data row22 col3\" >False</td>\n", - " <td id=\"T_0502a_row22_col4\" class=\"data row22 col4\" >True</td>\n", - " <td id=\"T_0502a_row22_col5\" class=\"data row22 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row22_col6\" class=\"data row22 col6\" >{}</td>\n", - " <td 
id=\"T_0502a_row22_col7\" class=\"data row22 col7\" >['time_series_data', 'stationarity', 'unit_root_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row22_col8\" class=\"data row22 col8\" >['data_validation']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row23_col0\" class=\"data row23 col0\" >validmind.data_validation.LJungBox</td>\n", - " <td id=\"T_0502a_row23_col1\" class=\"data row23 col1\" >L Jung Box</td>\n", - " <td id=\"T_0502a_row23_col2\" class=\"data row23 col2\" >Assesses autocorrelations in dataset features by performing a Ljung-Box test on each feature....</td>\n", - " <td id=\"T_0502a_row23_col3\" class=\"data row23 col3\" >False</td>\n", - " <td id=\"T_0502a_row23_col4\" class=\"data row23 col4\" >True</td>\n", - " <td id=\"T_0502a_row23_col5\" class=\"data row23 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row23_col6\" class=\"data row23 col6\" >{}</td>\n", - " <td id=\"T_0502a_row23_col7\" class=\"data row23 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row23_col8\" class=\"data row23 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row24_col0\" class=\"data row24 col0\" >validmind.data_validation.LaggedCorrelationHeatmap</td>\n", - " <td id=\"T_0502a_row24_col1\" class=\"data row24 col1\" >Lagged Correlation Heatmap</td>\n", - " <td id=\"T_0502a_row24_col2\" class=\"data row24 col2\" >Assesses and visualizes correlation between target variable and lagged independent variables in a time-series...</td>\n", - " <td id=\"T_0502a_row24_col3\" class=\"data row24 col3\" >True</td>\n", - " <td id=\"T_0502a_row24_col4\" class=\"data row24 col4\" >False</td>\n", - " <td id=\"T_0502a_row24_col5\" class=\"data row24 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row24_col6\" class=\"data row24 col6\" >{'num_lags': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_0502a_row24_col7\" class=\"data row24 col7\" >['time_series_data', 
'visualization']</td>\n", - " <td id=\"T_0502a_row24_col8\" class=\"data row24 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row25_col0\" class=\"data row25 col0\" >validmind.data_validation.MissingValues</td>\n", - " <td id=\"T_0502a_row25_col1\" class=\"data row25 col1\" >Missing Values</td>\n", - " <td id=\"T_0502a_row25_col2\" class=\"data row25 col2\" >Evaluates dataset quality by ensuring missing value ratio across all features does not exceed a set threshold....</td>\n", - " <td id=\"T_0502a_row25_col3\" class=\"data row25 col3\" >False</td>\n", - " <td id=\"T_0502a_row25_col4\" class=\"data row25 col4\" >True</td>\n", - " <td id=\"T_0502a_row25_col5\" class=\"data row25 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row25_col6\" class=\"data row25 col6\" >{'min_threshold': {'type': 'int', 'default': 1}}</td>\n", - " <td id=\"T_0502a_row25_col7\" class=\"data row25 col7\" >['tabular_data', 'data_quality']</td>\n", - " <td id=\"T_0502a_row25_col8\" class=\"data row25 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row26_col0\" class=\"data row26 col0\" >validmind.data_validation.MissingValuesBarPlot</td>\n", - " <td id=\"T_0502a_row26_col1\" class=\"data row26 col1\" >Missing Values Bar Plot</td>\n", - " <td id=\"T_0502a_row26_col2\" class=\"data row26 col2\" >Assesses the percentage and distribution of missing values in the dataset via a bar plot, with emphasis on...</td>\n", - " <td id=\"T_0502a_row26_col3\" class=\"data row26 col3\" >True</td>\n", - " <td id=\"T_0502a_row26_col4\" class=\"data row26 col4\" >False</td>\n", - " <td id=\"T_0502a_row26_col5\" class=\"data row26 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row26_col6\" class=\"data row26 col6\" >{'threshold': {'type': 'int', 'default': 80}, 'fig_height': {'type': 'int', 'default': 600}}</td>\n", - " <td id=\"T_0502a_row26_col7\" class=\"data row26 col7\" >['tabular_data', 'data_quality', 
'visualization']</td>\n", - " <td id=\"T_0502a_row26_col8\" class=\"data row26 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row27_col0\" class=\"data row27 col0\" >validmind.data_validation.MutualInformation</td>\n", - " <td id=\"T_0502a_row27_col1\" class=\"data row27 col1\" >Mutual Information</td>\n", - " <td id=\"T_0502a_row27_col2\" class=\"data row27 col2\" >Calculates mutual information scores between features and target variable to evaluate feature relevance....</td>\n", - " <td id=\"T_0502a_row27_col3\" class=\"data row27 col3\" >True</td>\n", - " <td id=\"T_0502a_row27_col4\" class=\"data row27 col4\" >False</td>\n", - " <td id=\"T_0502a_row27_col5\" class=\"data row27 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row27_col6\" class=\"data row27 col6\" >{'min_threshold': {'type': 'float', 'default': 0.01}, 'task': {'type': 'str', 'default': 'classification'}}</td>\n", - " <td id=\"T_0502a_row27_col7\" class=\"data row27 col7\" >['feature_selection', 'data_analysis']</td>\n", - " <td id=\"T_0502a_row27_col8\" class=\"data row27 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row28_col0\" class=\"data row28 col0\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", - " <td id=\"T_0502a_row28_col1\" class=\"data row28 col1\" >Pearson Correlation Matrix</td>\n", - " <td id=\"T_0502a_row28_col2\" class=\"data row28 col2\" >Evaluates linear dependency between numerical variables in a dataset via a Pearson Correlation coefficient heat map....</td>\n", - " <td id=\"T_0502a_row28_col3\" class=\"data row28 col3\" >True</td>\n", - " <td id=\"T_0502a_row28_col4\" class=\"data row28 col4\" >False</td>\n", - " <td id=\"T_0502a_row28_col5\" class=\"data row28 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row28_col6\" class=\"data row28 col6\" >{}</td>\n", - " <td id=\"T_0502a_row28_col7\" class=\"data row28 col7\" >['tabular_data', 'numerical_data', 
'correlation']</td>\n", - " <td id=\"T_0502a_row28_col8\" class=\"data row28 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row29_col0\" class=\"data row29 col0\" >validmind.data_validation.PhillipsPerronArch</td>\n", - " <td id=\"T_0502a_row29_col1\" class=\"data row29 col1\" >Phillips Perron Arch</td>\n", - " <td id=\"T_0502a_row29_col2\" class=\"data row29 col2\" >Assesses the stationarity of time series data in each feature of the ML model using the Phillips-Perron test....</td>\n", - " <td id=\"T_0502a_row29_col3\" class=\"data row29 col3\" >False</td>\n", - " <td id=\"T_0502a_row29_col4\" class=\"data row29 col4\" >True</td>\n", - " <td id=\"T_0502a_row29_col5\" class=\"data row29 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row29_col6\" class=\"data row29 col6\" >{}</td>\n", - " <td id=\"T_0502a_row29_col7\" class=\"data row29 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'unit_root_test']</td>\n", - " <td id=\"T_0502a_row29_col8\" class=\"data row29 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row30_col0\" class=\"data row30 col0\" >validmind.data_validation.ProtectedClassesDescription</td>\n", - " <td id=\"T_0502a_row30_col1\" class=\"data row30 col1\" >Protected Classes Description</td>\n", - " <td id=\"T_0502a_row30_col2\" class=\"data row30 col2\" >Visualizes the distribution of protected classes in the dataset relative to the target variable...</td>\n", - " <td id=\"T_0502a_row30_col3\" class=\"data row30 col3\" >True</td>\n", - " <td id=\"T_0502a_row30_col4\" class=\"data row30 col4\" >True</td>\n", - " <td id=\"T_0502a_row30_col5\" class=\"data row30 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row30_col6\" class=\"data row30 col6\" >{'protected_classes': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row30_col7\" class=\"data row30 col7\" >['bias_and_fairness', 'descriptive_statistics']</td>\n", - " <td 
id=\"T_0502a_row30_col8\" class=\"data row30 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row31_col0\" class=\"data row31 col0\" >validmind.data_validation.RollingStatsPlot</td>\n", - " <td id=\"T_0502a_row31_col1\" class=\"data row31 col1\" >Rolling Stats Plot</td>\n", - " <td id=\"T_0502a_row31_col2\" class=\"data row31 col2\" >Evaluates the stationarity of time series data by plotting its rolling mean and standard deviation over a specified...</td>\n", - " <td id=\"T_0502a_row31_col3\" class=\"data row31 col3\" >True</td>\n", - " <td id=\"T_0502a_row31_col4\" class=\"data row31 col4\" >False</td>\n", - " <td id=\"T_0502a_row31_col5\" class=\"data row31 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row31_col6\" class=\"data row31 col6\" >{'window_size': {'type': 'int', 'default': 12}}</td>\n", - " <td id=\"T_0502a_row31_col7\" class=\"data row31 col7\" >['time_series_data', 'visualization', 'stationarity']</td>\n", - " <td id=\"T_0502a_row31_col8\" class=\"data row31 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row32_col0\" class=\"data row32 col0\" >validmind.data_validation.RunsTest</td>\n", - " <td id=\"T_0502a_row32_col1\" class=\"data row32 col1\" >Runs Test</td>\n", - " <td id=\"T_0502a_row32_col2\" class=\"data row32 col2\" >Executes Runs Test on ML model to detect non-random patterns in output data sequence....</td>\n", - " <td id=\"T_0502a_row32_col3\" class=\"data row32 col3\" >False</td>\n", - " <td id=\"T_0502a_row32_col4\" class=\"data row32 col4\" >True</td>\n", - " <td id=\"T_0502a_row32_col5\" class=\"data row32 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row32_col6\" class=\"data row32 col6\" >{}</td>\n", - " <td id=\"T_0502a_row32_col7\" class=\"data row32 col7\" >['tabular_data', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row32_col8\" class=\"data row32 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " 
<tr>\n", - " <td id=\"T_0502a_row33_col0\" class=\"data row33 col0\" >validmind.data_validation.ScatterPlot</td>\n", - " <td id=\"T_0502a_row33_col1\" class=\"data row33 col1\" >Scatter Plot</td>\n", - " <td id=\"T_0502a_row33_col2\" class=\"data row33 col2\" >Assesses visual relationships, patterns, and outliers among features in a dataset through scatter plot matrices....</td>\n", - " <td id=\"T_0502a_row33_col3\" class=\"data row33 col3\" >True</td>\n", - " <td id=\"T_0502a_row33_col4\" class=\"data row33 col4\" >False</td>\n", - " <td id=\"T_0502a_row33_col5\" class=\"data row33 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row33_col6\" class=\"data row33 col6\" >{}</td>\n", - " <td id=\"T_0502a_row33_col7\" class=\"data row33 col7\" >['tabular_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row33_col8\" class=\"data row33 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row34_col0\" class=\"data row34 col0\" >validmind.data_validation.ScoreBandDefaultRates</td>\n", - " <td id=\"T_0502a_row34_col1\" class=\"data row34 col1\" >Score Band Default Rates</td>\n", - " <td id=\"T_0502a_row34_col2\" class=\"data row34 col2\" >Analyzes default rates and population distribution across credit score bands....</td>\n", - " <td id=\"T_0502a_row34_col3\" class=\"data row34 col3\" >False</td>\n", - " <td id=\"T_0502a_row34_col4\" class=\"data row34 col4\" >True</td>\n", - " <td id=\"T_0502a_row34_col5\" class=\"data row34 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row34_col6\" class=\"data row34 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_0502a_row34_col7\" class=\"data row34 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", - " <td id=\"T_0502a_row34_col8\" class=\"data row34 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row35_col0\" class=\"data row35 col0\" 
>validmind.data_validation.SeasonalDecompose</td>\n", - " <td id=\"T_0502a_row35_col1\" class=\"data row35 col1\" >Seasonal Decompose</td>\n", - " <td id=\"T_0502a_row35_col2\" class=\"data row35 col2\" >Assesses patterns and seasonality in a time series dataset by decomposing its features into foundational components....</td>\n", - " <td id=\"T_0502a_row35_col3\" class=\"data row35 col3\" >True</td>\n", - " <td id=\"T_0502a_row35_col4\" class=\"data row35 col4\" >False</td>\n", - " <td id=\"T_0502a_row35_col5\" class=\"data row35 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row35_col6\" class=\"data row35 col6\" >{'seasonal_model': {'type': 'str', 'default': 'additive'}}</td>\n", - " <td id=\"T_0502a_row35_col7\" class=\"data row35 col7\" >['time_series_data', 'seasonality', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row35_col8\" class=\"data row35 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row36_col0\" class=\"data row36 col0\" >validmind.data_validation.ShapiroWilk</td>\n", - " <td id=\"T_0502a_row36_col1\" class=\"data row36 col1\" >Shapiro Wilk</td>\n", - " <td id=\"T_0502a_row36_col2\" class=\"data row36 col2\" >Evaluates feature-wise normality of training data using the Shapiro-Wilk test....</td>\n", - " <td id=\"T_0502a_row36_col3\" class=\"data row36 col3\" >False</td>\n", - " <td id=\"T_0502a_row36_col4\" class=\"data row36 col4\" >True</td>\n", - " <td id=\"T_0502a_row36_col5\" class=\"data row36 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row36_col6\" class=\"data row36 col6\" >{}</td>\n", - " <td id=\"T_0502a_row36_col7\" class=\"data row36 col7\" >['tabular_data', 'data_distribution', 'statistical_test']</td>\n", - " <td id=\"T_0502a_row36_col8\" class=\"data row36 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row37_col0\" class=\"data row37 col0\" >validmind.data_validation.Skewness</td>\n", - " <td id=\"T_0502a_row37_col1\" class=\"data row37 col1\" 
>Skewness</td>\n", - " <td id=\"T_0502a_row37_col2\" class=\"data row37 col2\" >Evaluates the skewness of numerical data in a dataset to check against a defined threshold, aiming to ensure data...</td>\n", - " <td id=\"T_0502a_row37_col3\" class=\"data row37 col3\" >False</td>\n", - " <td id=\"T_0502a_row37_col4\" class=\"data row37 col4\" >True</td>\n", - " <td id=\"T_0502a_row37_col5\" class=\"data row37 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row37_col6\" class=\"data row37 col6\" >{'max_threshold': {'type': '_empty', 'default': 1}}</td>\n", - " <td id=\"T_0502a_row37_col7\" class=\"data row37 col7\" >['data_quality', 'tabular_data']</td>\n", - " <td id=\"T_0502a_row37_col8\" class=\"data row37 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row38_col0\" class=\"data row38 col0\" >validmind.data_validation.SpreadPlot</td>\n", - " <td id=\"T_0502a_row38_col1\" class=\"data row38 col1\" >Spread Plot</td>\n", - " <td id=\"T_0502a_row38_col2\" class=\"data row38 col2\" >Assesses potential correlations between pairs of time series variables through visualization to enhance...</td>\n", - " <td id=\"T_0502a_row38_col3\" class=\"data row38 col3\" >True</td>\n", - " <td id=\"T_0502a_row38_col4\" class=\"data row38 col4\" >False</td>\n", - " <td id=\"T_0502a_row38_col5\" class=\"data row38 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row38_col6\" class=\"data row38 col6\" >{}</td>\n", - " <td id=\"T_0502a_row38_col7\" class=\"data row38 col7\" >['time_series_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row38_col8\" class=\"data row38 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row39_col0\" class=\"data row39 col0\" >validmind.data_validation.TabularCategoricalBarPlots</td>\n", - " <td id=\"T_0502a_row39_col1\" class=\"data row39 col1\" >Tabular Categorical Bar Plots</td>\n", - " <td id=\"T_0502a_row39_col2\" class=\"data row39 col2\" >Generates and visualizes bar 
plots for each category in categorical features to evaluate the dataset's composition....</td>\n", - " <td id=\"T_0502a_row39_col3\" class=\"data row39 col3\" >True</td>\n", - " <td id=\"T_0502a_row39_col4\" class=\"data row39 col4\" >False</td>\n", - " <td id=\"T_0502a_row39_col5\" class=\"data row39 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row39_col6\" class=\"data row39 col6\" >{}</td>\n", - " <td id=\"T_0502a_row39_col7\" class=\"data row39 col7\" >['tabular_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row39_col8\" class=\"data row39 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row40_col0\" class=\"data row40 col0\" >validmind.data_validation.TabularDateTimeHistograms</td>\n", - " <td id=\"T_0502a_row40_col1\" class=\"data row40 col1\" >Tabular Date Time Histograms</td>\n", - " <td id=\"T_0502a_row40_col2\" class=\"data row40 col2\" >Generates histograms to provide graphical insight into the distribution of time intervals in a model's datetime...</td>\n", - " <td id=\"T_0502a_row40_col3\" class=\"data row40 col3\" >True</td>\n", - " <td id=\"T_0502a_row40_col4\" class=\"data row40 col4\" >False</td>\n", - " <td id=\"T_0502a_row40_col5\" class=\"data row40 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row40_col6\" class=\"data row40 col6\" >{}</td>\n", - " <td id=\"T_0502a_row40_col7\" class=\"data row40 col7\" >['time_series_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row40_col8\" class=\"data row40 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row41_col0\" class=\"data row41 col0\" >validmind.data_validation.TabularDescriptionTables</td>\n", - " <td id=\"T_0502a_row41_col1\" class=\"data row41 col1\" >Tabular Description Tables</td>\n", - " <td id=\"T_0502a_row41_col2\" class=\"data row41 col2\" >Summarizes key descriptive statistics for numerical, categorical, and datetime variables in a dataset....</td>\n", - " <td 
id=\"T_0502a_row41_col3\" class=\"data row41 col3\" >False</td>\n", - " <td id=\"T_0502a_row41_col4\" class=\"data row41 col4\" >True</td>\n", - " <td id=\"T_0502a_row41_col5\" class=\"data row41 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row41_col6\" class=\"data row41 col6\" >{}</td>\n", - " <td id=\"T_0502a_row41_col7\" class=\"data row41 col7\" >['tabular_data']</td>\n", - " <td id=\"T_0502a_row41_col8\" class=\"data row41 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row42_col0\" class=\"data row42 col0\" >validmind.data_validation.TabularNumericalHistograms</td>\n", - " <td id=\"T_0502a_row42_col1\" class=\"data row42 col1\" >Tabular Numerical Histograms</td>\n", - " <td id=\"T_0502a_row42_col2\" class=\"data row42 col2\" >Generates histograms for each numerical feature in a dataset to provide visual insights into data distribution and...</td>\n", - " <td id=\"T_0502a_row42_col3\" class=\"data row42 col3\" >True</td>\n", - " <td id=\"T_0502a_row42_col4\" class=\"data row42 col4\" >False</td>\n", - " <td id=\"T_0502a_row42_col5\" class=\"data row42 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row42_col6\" class=\"data row42 col6\" >{}</td>\n", - " <td id=\"T_0502a_row42_col7\" class=\"data row42 col7\" >['tabular_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row42_col8\" class=\"data row42 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row43_col0\" class=\"data row43 col0\" >validmind.data_validation.TargetRateBarPlots</td>\n", - " <td id=\"T_0502a_row43_col1\" class=\"data row43 col1\" >Target Rate Bar Plots</td>\n", - " <td id=\"T_0502a_row43_col2\" class=\"data row43 col2\" >Generates bar plots visualizing the default rates of categorical features for a classification machine learning...</td>\n", - " <td id=\"T_0502a_row43_col3\" class=\"data row43 col3\" >True</td>\n", - " <td id=\"T_0502a_row43_col4\" class=\"data row43 col4\" 
>False</td>\n", - " <td id=\"T_0502a_row43_col5\" class=\"data row43 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row43_col6\" class=\"data row43 col6\" >{}</td>\n", - " <td id=\"T_0502a_row43_col7\" class=\"data row43 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", - " <td id=\"T_0502a_row43_col8\" class=\"data row43 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row44_col0\" class=\"data row44 col0\" >validmind.data_validation.TimeSeriesDescription</td>\n", - " <td id=\"T_0502a_row44_col1\" class=\"data row44 col1\" >Time Series Description</td>\n", - " <td id=\"T_0502a_row44_col2\" class=\"data row44 col2\" >Generates a detailed analysis for the provided time series dataset, summarizing key statistics to identify trends,...</td>\n", - " <td id=\"T_0502a_row44_col3\" class=\"data row44 col3\" >False</td>\n", - " <td id=\"T_0502a_row44_col4\" class=\"data row44 col4\" >True</td>\n", - " <td id=\"T_0502a_row44_col5\" class=\"data row44 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row44_col6\" class=\"data row44 col6\" >{}</td>\n", - " <td id=\"T_0502a_row44_col7\" class=\"data row44 col7\" >['time_series_data', 'analysis']</td>\n", - " <td id=\"T_0502a_row44_col8\" class=\"data row44 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row45_col0\" class=\"data row45 col0\" >validmind.data_validation.TimeSeriesDescriptiveStatistics</td>\n", - " <td id=\"T_0502a_row45_col1\" class=\"data row45 col1\" >Time Series Descriptive Statistics</td>\n", - " <td id=\"T_0502a_row45_col2\" class=\"data row45 col2\" >Evaluates the descriptive statistics of a time series dataset to identify trends, patterns, and data quality issues....</td>\n", - " <td id=\"T_0502a_row45_col3\" class=\"data row45 col3\" >False</td>\n", - " <td id=\"T_0502a_row45_col4\" class=\"data row45 col4\" >True</td>\n", - " <td id=\"T_0502a_row45_col5\" class=\"data row45 col5\" >['dataset']</td>\n", - " <td 
id=\"T_0502a_row45_col6\" class=\"data row45 col6\" >{}</td>\n", - " <td id=\"T_0502a_row45_col7\" class=\"data row45 col7\" >['time_series_data', 'analysis']</td>\n", - " <td id=\"T_0502a_row45_col8\" class=\"data row45 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row46_col0\" class=\"data row46 col0\" >validmind.data_validation.TimeSeriesFrequency</td>\n", - " <td id=\"T_0502a_row46_col1\" class=\"data row46 col1\" >Time Series Frequency</td>\n", - " <td id=\"T_0502a_row46_col2\" class=\"data row46 col2\" >Evaluates consistency of time series data frequency and generates a frequency plot....</td>\n", - " <td id=\"T_0502a_row46_col3\" class=\"data row46 col3\" >True</td>\n", - " <td id=\"T_0502a_row46_col4\" class=\"data row46 col4\" >True</td>\n", - " <td id=\"T_0502a_row46_col5\" class=\"data row46 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row46_col6\" class=\"data row46 col6\" >{}</td>\n", - " <td id=\"T_0502a_row46_col7\" class=\"data row46 col7\" >['time_series_data']</td>\n", - " <td id=\"T_0502a_row46_col8\" class=\"data row46 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row47_col0\" class=\"data row47 col0\" >validmind.data_validation.TimeSeriesHistogram</td>\n", - " <td id=\"T_0502a_row47_col1\" class=\"data row47 col1\" >Time Series Histogram</td>\n", - " <td id=\"T_0502a_row47_col2\" class=\"data row47 col2\" >Visualizes distribution of time-series data using histograms and Kernel Density Estimation (KDE) lines....</td>\n", - " <td id=\"T_0502a_row47_col3\" class=\"data row47 col3\" >True</td>\n", - " <td id=\"T_0502a_row47_col4\" class=\"data row47 col4\" >False</td>\n", - " <td id=\"T_0502a_row47_col5\" class=\"data row47 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row47_col6\" class=\"data row47 col6\" >{'nbins': {'type': '_empty', 'default': 30}}</td>\n", - " <td id=\"T_0502a_row47_col7\" class=\"data row47 col7\" >['data_validation', 'visualization', 
'time_series_data']</td>\n", - " <td id=\"T_0502a_row47_col8\" class=\"data row47 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row48_col0\" class=\"data row48 col0\" >validmind.data_validation.TimeSeriesLinePlot</td>\n", - " <td id=\"T_0502a_row48_col1\" class=\"data row48 col1\" >Time Series Line Plot</td>\n", - " <td id=\"T_0502a_row48_col2\" class=\"data row48 col2\" >Generates and analyses time-series data through line plots revealing trends, patterns, anomalies over time....</td>\n", - " <td id=\"T_0502a_row48_col3\" class=\"data row48 col3\" >True</td>\n", - " <td id=\"T_0502a_row48_col4\" class=\"data row48 col4\" >False</td>\n", - " <td id=\"T_0502a_row48_col5\" class=\"data row48 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row48_col6\" class=\"data row48 col6\" >{}</td>\n", - " <td id=\"T_0502a_row48_col7\" class=\"data row48 col7\" >['time_series_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row48_col8\" class=\"data row48 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row49_col0\" class=\"data row49 col0\" >validmind.data_validation.TimeSeriesMissingValues</td>\n", - " <td id=\"T_0502a_row49_col1\" class=\"data row49 col1\" >Time Series Missing Values</td>\n", - " <td id=\"T_0502a_row49_col2\" class=\"data row49 col2\" >Validates time-series data quality by confirming the count of missing values is below a certain threshold....</td>\n", - " <td id=\"T_0502a_row49_col3\" class=\"data row49 col3\" >True</td>\n", - " <td id=\"T_0502a_row49_col4\" class=\"data row49 col4\" >True</td>\n", - " <td id=\"T_0502a_row49_col5\" class=\"data row49 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row49_col6\" class=\"data row49 col6\" >{'min_threshold': {'type': 'int', 'default': 1}}</td>\n", - " <td id=\"T_0502a_row49_col7\" class=\"data row49 col7\" >['time_series_data']</td>\n", - " <td id=\"T_0502a_row49_col8\" class=\"data row49 col8\" 
>['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row50_col0\" class=\"data row50 col0\" >validmind.data_validation.TimeSeriesOutliers</td>\n", - " <td id=\"T_0502a_row50_col1\" class=\"data row50 col1\" >Time Series Outliers</td>\n", - " <td id=\"T_0502a_row50_col2\" class=\"data row50 col2\" >Identifies and visualizes outliers in time-series data using the z-score method....</td>\n", - " <td id=\"T_0502a_row50_col3\" class=\"data row50 col3\" >False</td>\n", - " <td id=\"T_0502a_row50_col4\" class=\"data row50 col4\" >True</td>\n", - " <td id=\"T_0502a_row50_col5\" class=\"data row50 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row50_col6\" class=\"data row50 col6\" >{'zscore_threshold': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_0502a_row50_col7\" class=\"data row50 col7\" >['time_series_data']</td>\n", - " <td id=\"T_0502a_row50_col8\" class=\"data row50 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row51_col0\" class=\"data row51 col0\" >validmind.data_validation.TooManyZeroValues</td>\n", - " <td id=\"T_0502a_row51_col1\" class=\"data row51 col1\" >Too Many Zero Values</td>\n", - " <td id=\"T_0502a_row51_col2\" class=\"data row51 col2\" >Identifies numerical columns in a dataset that contain an excessive number of zero values, defined by a threshold...</td>\n", - " <td id=\"T_0502a_row51_col3\" class=\"data row51 col3\" >False</td>\n", - " <td id=\"T_0502a_row51_col4\" class=\"data row51 col4\" >True</td>\n", - " <td id=\"T_0502a_row51_col5\" class=\"data row51 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row51_col6\" class=\"data row51 col6\" >{'max_percent_threshold': {'type': 'float', 'default': 0.03}}</td>\n", - " <td id=\"T_0502a_row51_col7\" class=\"data row51 col7\" >['tabular_data']</td>\n", - " <td id=\"T_0502a_row51_col8\" class=\"data row51 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row52_col0\" class=\"data row52 
col0\" >validmind.data_validation.UniqueRows</td>\n", - " <td id=\"T_0502a_row52_col1\" class=\"data row52 col1\" >Unique Rows</td>\n", - " <td id=\"T_0502a_row52_col2\" class=\"data row52 col2\" >Verifies the diversity of the dataset by ensuring that the count of unique rows exceeds a prescribed threshold....</td>\n", - " <td id=\"T_0502a_row52_col3\" class=\"data row52 col3\" >False</td>\n", - " <td id=\"T_0502a_row52_col4\" class=\"data row52 col4\" >True</td>\n", - " <td id=\"T_0502a_row52_col5\" class=\"data row52 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row52_col6\" class=\"data row52 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 1}}</td>\n", - " <td id=\"T_0502a_row52_col7\" class=\"data row52 col7\" >['tabular_data']</td>\n", - " <td id=\"T_0502a_row52_col8\" class=\"data row52 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row53_col0\" class=\"data row53 col0\" >validmind.data_validation.WOEBinPlots</td>\n", - " <td id=\"T_0502a_row53_col1\" class=\"data row53 col1\" >WOE Bin Plots</td>\n", - " <td id=\"T_0502a_row53_col2\" class=\"data row53 col2\" >Generates visualizations of Weight of Evidence (WoE) and Information Value (IV) for understanding predictive power...</td>\n", - " <td id=\"T_0502a_row53_col3\" class=\"data row53 col3\" >True</td>\n", - " <td id=\"T_0502a_row53_col4\" class=\"data row53 col4\" >False</td>\n", - " <td id=\"T_0502a_row53_col5\" class=\"data row53 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row53_col6\" class=\"data row53 col6\" >{'breaks_adj': {'type': 'list', 'default': None}, 'fig_height': {'type': 'int', 'default': 600}, 'fig_width': {'type': 'int', 'default': 500}}</td>\n", - " <td id=\"T_0502a_row53_col7\" class=\"data row53 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", - " <td id=\"T_0502a_row53_col8\" class=\"data row53 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row54_col0\" 
class=\"data row54 col0\" >validmind.data_validation.WOEBinTable</td>\n", - " <td id=\"T_0502a_row54_col1\" class=\"data row54 col1\" >WOE Bin Table</td>\n", - " <td id=\"T_0502a_row54_col2\" class=\"data row54 col2\" >Assesses the Weight of Evidence (WoE) and Information Value (IV) of each feature to evaluate its predictive power...</td>\n", - " <td id=\"T_0502a_row54_col3\" class=\"data row54 col3\" >False</td>\n", - " <td id=\"T_0502a_row54_col4\" class=\"data row54 col4\" >True</td>\n", - " <td id=\"T_0502a_row54_col5\" class=\"data row54 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row54_col6\" class=\"data row54 col6\" >{'breaks_adj': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_0502a_row54_col7\" class=\"data row54 col7\" >['tabular_data', 'categorical_data']</td>\n", - " <td id=\"T_0502a_row54_col8\" class=\"data row54 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row55_col0\" class=\"data row55 col0\" >validmind.data_validation.ZivotAndrewsArch</td>\n", - " <td id=\"T_0502a_row55_col1\" class=\"data row55 col1\" >Zivot Andrews Arch</td>\n", - " <td id=\"T_0502a_row55_col2\" class=\"data row55 col2\" >Evaluates the order of integration and stationarity of time series data using the Zivot-Andrews unit root test....</td>\n", - " <td id=\"T_0502a_row55_col3\" class=\"data row55 col3\" >False</td>\n", - " <td id=\"T_0502a_row55_col4\" class=\"data row55 col4\" >True</td>\n", - " <td id=\"T_0502a_row55_col5\" class=\"data row55 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row55_col6\" class=\"data row55 col6\" >{}</td>\n", - " <td id=\"T_0502a_row55_col7\" class=\"data row55 col7\" >['time_series_data', 'stationarity', 'unit_root_test']</td>\n", - " <td id=\"T_0502a_row55_col8\" class=\"data row55 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row56_col0\" class=\"data row56 col0\" >validmind.data_validation.nlp.CommonWords</td>\n", - " <td id=\"T_0502a_row56_col1\" 
class=\"data row56 col1\" >Common Words</td>\n", - " <td id=\"T_0502a_row56_col2\" class=\"data row56 col2\" >Assesses the most frequent non-stopwords in a text column for identifying prevalent language patterns....</td>\n", - " <td id=\"T_0502a_row56_col3\" class=\"data row56 col3\" >True</td>\n", - " <td id=\"T_0502a_row56_col4\" class=\"data row56 col4\" >False</td>\n", - " <td id=\"T_0502a_row56_col5\" class=\"data row56 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row56_col6\" class=\"data row56 col6\" >{}</td>\n", - " <td id=\"T_0502a_row56_col7\" class=\"data row56 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", - " <td id=\"T_0502a_row56_col8\" class=\"data row56 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row57_col0\" class=\"data row57 col0\" >validmind.data_validation.nlp.Hashtags</td>\n", - " <td id=\"T_0502a_row57_col1\" class=\"data row57 col1\" >Hashtags</td>\n", - " <td id=\"T_0502a_row57_col2\" class=\"data row57 col2\" >Assesses hashtag frequency in a text column, highlighting usage trends and potential dataset bias or spam....</td>\n", - " <td id=\"T_0502a_row57_col3\" class=\"data row57 col3\" >True</td>\n", - " <td id=\"T_0502a_row57_col4\" class=\"data row57 col4\" >False</td>\n", - " <td id=\"T_0502a_row57_col5\" class=\"data row57 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row57_col6\" class=\"data row57 col6\" >{'top_hashtags': {'type': 'int', 'default': 25}}</td>\n", - " <td id=\"T_0502a_row57_col7\" class=\"data row57 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", - " <td id=\"T_0502a_row57_col8\" class=\"data row57 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row58_col0\" class=\"data row58 col0\" >validmind.data_validation.nlp.LanguageDetection</td>\n", - " <td id=\"T_0502a_row58_col1\" class=\"data row58 col1\" >Language 
Detection</td>\n", - " <td id=\"T_0502a_row58_col2\" class=\"data row58 col2\" >Assesses the diversity of languages in a textual dataset by detecting and visualizing the distribution of languages....</td>\n", - " <td id=\"T_0502a_row58_col3\" class=\"data row58 col3\" >True</td>\n", - " <td id=\"T_0502a_row58_col4\" class=\"data row58 col4\" >False</td>\n", - " <td id=\"T_0502a_row58_col5\" class=\"data row58 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row58_col6\" class=\"data row58 col6\" >{}</td>\n", - " <td id=\"T_0502a_row58_col7\" class=\"data row58 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row58_col8\" class=\"data row58 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row59_col0\" class=\"data row59 col0\" >validmind.data_validation.nlp.Mentions</td>\n", - " <td id=\"T_0502a_row59_col1\" class=\"data row59 col1\" >Mentions</td>\n", - " <td id=\"T_0502a_row59_col2\" class=\"data row59 col2\" >Calculates and visualizes frequencies of '@' prefixed mentions in a text-based dataset for NLP model analysis....</td>\n", - " <td id=\"T_0502a_row59_col3\" class=\"data row59 col3\" >True</td>\n", - " <td id=\"T_0502a_row59_col4\" class=\"data row59 col4\" >False</td>\n", - " <td id=\"T_0502a_row59_col5\" class=\"data row59 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row59_col6\" class=\"data row59 col6\" >{'top_mentions': {'type': 'int', 'default': 25}}</td>\n", - " <td id=\"T_0502a_row59_col7\" class=\"data row59 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", - " <td id=\"T_0502a_row59_col8\" class=\"data row59 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row60_col0\" class=\"data row60 col0\" >validmind.data_validation.nlp.PolarityAndSubjectivity</td>\n", - " <td id=\"T_0502a_row60_col1\" class=\"data row60 col1\" >Polarity And Subjectivity</td>\n", - " <td 
id=\"T_0502a_row60_col2\" class=\"data row60 col2\" >Analyzes the polarity and subjectivity of text data within a given dataset to visualize the sentiment distribution....</td>\n", - " <td id=\"T_0502a_row60_col3\" class=\"data row60 col3\" >True</td>\n", - " <td id=\"T_0502a_row60_col4\" class=\"data row60 col4\" >True</td>\n", - " <td id=\"T_0502a_row60_col5\" class=\"data row60 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row60_col6\" class=\"data row60 col6\" >{'threshold_subjectivity': {'type': '_empty', 'default': 0.5}, 'threshold_polarity': {'type': '_empty', 'default': 0}}</td>\n", - " <td id=\"T_0502a_row60_col7\" class=\"data row60 col7\" >['nlp', 'text_data', 'data_validation']</td>\n", - " <td id=\"T_0502a_row60_col8\" class=\"data row60 col8\" >['nlp']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row61_col0\" class=\"data row61 col0\" >validmind.data_validation.nlp.Punctuations</td>\n", - " <td id=\"T_0502a_row61_col1\" class=\"data row61 col1\" >Punctuations</td>\n", - " <td id=\"T_0502a_row61_col2\" class=\"data row61 col2\" >Analyzes and visualizes the frequency distribution of punctuation usage in a given text dataset....</td>\n", - " <td id=\"T_0502a_row61_col3\" class=\"data row61 col3\" >True</td>\n", - " <td id=\"T_0502a_row61_col4\" class=\"data row61 col4\" >False</td>\n", - " <td id=\"T_0502a_row61_col5\" class=\"data row61 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row61_col6\" class=\"data row61 col6\" >{'count_mode': {'type': '_empty', 'default': 'token'}}</td>\n", - " <td id=\"T_0502a_row61_col7\" class=\"data row61 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", - " <td id=\"T_0502a_row61_col8\" class=\"data row61 col8\" >['text_classification', 'text_summarization', 'nlp']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row62_col0\" class=\"data row62 col0\" >validmind.data_validation.nlp.Sentiment</td>\n", - " <td id=\"T_0502a_row62_col1\" class=\"data row62 col1\" 
>Sentiment</td>\n", - " <td id=\"T_0502a_row62_col2\" class=\"data row62 col2\" >Analyzes the sentiment of text data within a dataset using the VADER sentiment analysis tool....</td>\n", - " <td id=\"T_0502a_row62_col3\" class=\"data row62 col3\" >True</td>\n", - " <td id=\"T_0502a_row62_col4\" class=\"data row62 col4\" >False</td>\n", - " <td id=\"T_0502a_row62_col5\" class=\"data row62 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row62_col6\" class=\"data row62 col6\" >{}</td>\n", - " <td id=\"T_0502a_row62_col7\" class=\"data row62 col7\" >['nlp', 'text_data', 'data_validation']</td>\n", - " <td id=\"T_0502a_row62_col8\" class=\"data row62 col8\" >['nlp']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row63_col0\" class=\"data row63 col0\" >validmind.data_validation.nlp.StopWords</td>\n", - " <td id=\"T_0502a_row63_col1\" class=\"data row63 col1\" >Stop Words</td>\n", - " <td id=\"T_0502a_row63_col2\" class=\"data row63 col2\" >Evaluates and visualizes the frequency of English stop words in a text dataset against a defined threshold....</td>\n", - " <td id=\"T_0502a_row63_col3\" class=\"data row63 col3\" >True</td>\n", - " <td id=\"T_0502a_row63_col4\" class=\"data row63 col4\" >True</td>\n", - " <td id=\"T_0502a_row63_col5\" class=\"data row63 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row63_col6\" class=\"data row63 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 0.5}, 'num_words': {'type': 'int', 'default': 25}}</td>\n", - " <td id=\"T_0502a_row63_col7\" class=\"data row63 col7\" >['nlp', 'text_data', 'frequency_analysis', 'visualization']</td>\n", - " <td id=\"T_0502a_row63_col8\" class=\"data row63 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row64_col0\" class=\"data row64 col0\" >validmind.data_validation.nlp.TextDescription</td>\n", - " <td id=\"T_0502a_row64_col1\" class=\"data row64 col1\" >Text Description</td>\n", - " <td id=\"T_0502a_row64_col2\" 
class=\"data row64 col2\" >Conducts comprehensive textual analysis on a dataset using NLTK to evaluate various parameters and generate...</td>\n", - " <td id=\"T_0502a_row64_col3\" class=\"data row64 col3\" >True</td>\n", - " <td id=\"T_0502a_row64_col4\" class=\"data row64 col4\" >False</td>\n", - " <td id=\"T_0502a_row64_col5\" class=\"data row64 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row64_col6\" class=\"data row64 col6\" >{'unwanted_tokens': {'type': 'set', 'default': {'s', 'mrs', 'us', \"''\", ' ', 'ms', 'dr', 'dollar', '``', 'mr', \"'s\", \"s'\"}}, 'lang': {'type': 'str', 'default': 'english'}}</td>\n", - " <td id=\"T_0502a_row64_col7\" class=\"data row64 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row64_col8\" class=\"data row64 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row65_col0\" class=\"data row65 col0\" >validmind.data_validation.nlp.Toxicity</td>\n", - " <td id=\"T_0502a_row65_col1\" class=\"data row65 col1\" >Toxicity</td>\n", - " <td id=\"T_0502a_row65_col2\" class=\"data row65 col2\" >Assesses the toxicity of text data within a dataset to visualize the distribution of toxicity scores....</td>\n", - " <td id=\"T_0502a_row65_col3\" class=\"data row65 col3\" >True</td>\n", - " <td id=\"T_0502a_row65_col4\" class=\"data row65 col4\" >False</td>\n", - " <td id=\"T_0502a_row65_col5\" class=\"data row65 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row65_col6\" class=\"data row65 col6\" >{}</td>\n", - " <td id=\"T_0502a_row65_col7\" class=\"data row65 col7\" >['nlp', 'text_data', 'data_validation']</td>\n", - " <td id=\"T_0502a_row65_col8\" class=\"data row65 col8\" >['nlp']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row66_col0\" class=\"data row66 col0\" >validmind.model_validation.BertScore</td>\n", - " <td id=\"T_0502a_row66_col1\" class=\"data row66 col1\" >Bert Score</td>\n", - " <td id=\"T_0502a_row66_col2\" class=\"data 
row66 col2\" >Assesses the quality of machine-generated text using BERTScore metrics and visualizes results through histograms...</td>\n", - " <td id=\"T_0502a_row66_col3\" class=\"data row66 col3\" >True</td>\n", - " <td id=\"T_0502a_row66_col4\" class=\"data row66 col4\" >True</td>\n", - " <td id=\"T_0502a_row66_col5\" class=\"data row66 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row66_col6\" class=\"data row66 col6\" >{'evaluation_model': {'type': '_empty', 'default': 'distilbert-base-uncased'}}</td>\n", - " <td id=\"T_0502a_row66_col7\" class=\"data row66 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row66_col8\" class=\"data row66 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row67_col0\" class=\"data row67 col0\" >validmind.model_validation.BleuScore</td>\n", - " <td id=\"T_0502a_row67_col1\" class=\"data row67 col1\" >Bleu Score</td>\n", - " <td id=\"T_0502a_row67_col2\" class=\"data row67 col2\" >Evaluates the quality of machine-generated text using BLEU metrics and visualizes the results through histograms...</td>\n", - " <td id=\"T_0502a_row67_col3\" class=\"data row67 col3\" >True</td>\n", - " <td id=\"T_0502a_row67_col4\" class=\"data row67 col4\" >True</td>\n", - " <td id=\"T_0502a_row67_col5\" class=\"data row67 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row67_col6\" class=\"data row67 col6\" >{}</td>\n", - " <td id=\"T_0502a_row67_col7\" class=\"data row67 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row67_col8\" class=\"data row67 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row68_col0\" class=\"data row68 col0\" >validmind.model_validation.ClusterSizeDistribution</td>\n", - " <td id=\"T_0502a_row68_col1\" class=\"data row68 col1\" >Cluster Size Distribution</td>\n", - " <td id=\"T_0502a_row68_col2\" class=\"data row68 col2\" 
>Assesses the performance of clustering models by comparing the distribution of cluster sizes in model predictions...</td>\n", - " <td id=\"T_0502a_row68_col3\" class=\"data row68 col3\" >True</td>\n", - " <td id=\"T_0502a_row68_col4\" class=\"data row68 col4\" >False</td>\n", - " <td id=\"T_0502a_row68_col5\" class=\"data row68 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row68_col6\" class=\"data row68 col6\" >{}</td>\n", - " <td id=\"T_0502a_row68_col7\" class=\"data row68 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row68_col8\" class=\"data row68 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row69_col0\" class=\"data row69 col0\" >validmind.model_validation.ContextualRecall</td>\n", - " <td id=\"T_0502a_row69_col1\" class=\"data row69 col1\" >Contextual Recall</td>\n", - " <td id=\"T_0502a_row69_col2\" class=\"data row69 col2\" >Evaluates a Natural Language Generation model's ability to generate contextually relevant and factually correct...</td>\n", - " <td id=\"T_0502a_row69_col3\" class=\"data row69 col3\" >True</td>\n", - " <td id=\"T_0502a_row69_col4\" class=\"data row69 col4\" >True</td>\n", - " <td id=\"T_0502a_row69_col5\" class=\"data row69 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row69_col6\" class=\"data row69 col6\" >{}</td>\n", - " <td id=\"T_0502a_row69_col7\" class=\"data row69 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row69_col8\" class=\"data row69 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row70_col0\" class=\"data row70 col0\" >validmind.model_validation.FeaturesAUC</td>\n", - " <td id=\"T_0502a_row70_col1\" class=\"data row70 col1\" >Features AUC</td>\n", - " <td id=\"T_0502a_row70_col2\" class=\"data row70 col2\" >Evaluates the discriminatory power of each individual feature within a binary classification model by calculating...</td>\n", - " <td 
id=\"T_0502a_row70_col3\" class=\"data row70 col3\" >True</td>\n", - " <td id=\"T_0502a_row70_col4\" class=\"data row70 col4\" >False</td>\n", - " <td id=\"T_0502a_row70_col5\" class=\"data row70 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row70_col6\" class=\"data row70 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", - " <td id=\"T_0502a_row70_col7\" class=\"data row70 col7\" >['feature_importance', 'AUC', 'visualization']</td>\n", - " <td id=\"T_0502a_row70_col8\" class=\"data row70 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row71_col0\" class=\"data row71 col0\" >validmind.model_validation.MeteorScore</td>\n", - " <td id=\"T_0502a_row71_col1\" class=\"data row71 col1\" >Meteor Score</td>\n", - " <td id=\"T_0502a_row71_col2\" class=\"data row71 col2\" >Assesses the quality of machine-generated translations by comparing them to human-produced references using the...</td>\n", - " <td id=\"T_0502a_row71_col3\" class=\"data row71 col3\" >True</td>\n", - " <td id=\"T_0502a_row71_col4\" class=\"data row71 col4\" >True</td>\n", - " <td id=\"T_0502a_row71_col5\" class=\"data row71 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row71_col6\" class=\"data row71 col6\" >{}</td>\n", - " <td id=\"T_0502a_row71_col7\" class=\"data row71 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row71_col8\" class=\"data row71 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row72_col0\" class=\"data row72 col0\" >validmind.model_validation.ModelMetadata</td>\n", - " <td id=\"T_0502a_row72_col1\" class=\"data row72 col1\" >Model Metadata</td>\n", - " <td id=\"T_0502a_row72_col2\" class=\"data row72 col2\" >Compare metadata of different models and generate a summary table with the results....</td>\n", - " <td id=\"T_0502a_row72_col3\" class=\"data row72 col3\" >False</td>\n", - " 
<td id=\"T_0502a_row72_col4\" class=\"data row72 col4\" >True</td>\n", - " <td id=\"T_0502a_row72_col5\" class=\"data row72 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row72_col6\" class=\"data row72 col6\" >{}</td>\n", - " <td id=\"T_0502a_row72_col7\" class=\"data row72 col7\" >['model_training', 'metadata']</td>\n", - " <td id=\"T_0502a_row72_col8\" class=\"data row72 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row73_col0\" class=\"data row73 col0\" >validmind.model_validation.ModelPredictionResiduals</td>\n", - " <td id=\"T_0502a_row73_col1\" class=\"data row73 col1\" >Model Prediction Residuals</td>\n", - " <td id=\"T_0502a_row73_col2\" class=\"data row73 col2\" >Assesses normality and behavior of residuals in regression models through visualization and statistical tests....</td>\n", - " <td id=\"T_0502a_row73_col3\" class=\"data row73 col3\" >True</td>\n", - " <td id=\"T_0502a_row73_col4\" class=\"data row73 col4\" >True</td>\n", - " <td id=\"T_0502a_row73_col5\" class=\"data row73 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row73_col6\" class=\"data row73 col6\" >{'nbins': {'type': 'int', 'default': 100}, 'p_value_threshold': {'type': 'float', 'default': 0.05}, 'start_date': {'type': None, 'default': None}, 'end_date': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row73_col7\" class=\"data row73 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row73_col8\" class=\"data row73 col8\" >['residual_analysis', 'visualization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row74_col0\" class=\"data row74 col0\" >validmind.model_validation.RegardScore</td>\n", - " <td id=\"T_0502a_row74_col1\" class=\"data row74 col1\" >Regard Score</td>\n", - " <td id=\"T_0502a_row74_col2\" class=\"data row74 col2\" >Assesses the sentiment and potential biases in text generated by NLP models by computing and visualizing regard...</td>\n", - " <td 
id=\"T_0502a_row74_col3\" class=\"data row74 col3\" >True</td>\n", - " <td id=\"T_0502a_row74_col4\" class=\"data row74 col4\" >True</td>\n", - " <td id=\"T_0502a_row74_col5\" class=\"data row74 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row74_col6\" class=\"data row74 col6\" >{}</td>\n", - " <td id=\"T_0502a_row74_col7\" class=\"data row74 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row74_col8\" class=\"data row74 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row75_col0\" class=\"data row75 col0\" >validmind.model_validation.RegressionResidualsPlot</td>\n", - " <td id=\"T_0502a_row75_col1\" class=\"data row75 col1\" >Regression Residuals Plot</td>\n", - " <td id=\"T_0502a_row75_col2\" class=\"data row75 col2\" >Evaluates regression model performance using residual distribution and actual vs. predicted plots....</td>\n", - " <td id=\"T_0502a_row75_col3\" class=\"data row75 col3\" >True</td>\n", - " <td id=\"T_0502a_row75_col4\" class=\"data row75 col4\" >False</td>\n", - " <td id=\"T_0502a_row75_col5\" class=\"data row75 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row75_col6\" class=\"data row75 col6\" >{'bin_size': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_0502a_row75_col7\" class=\"data row75 col7\" >['model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row75_col8\" class=\"data row75 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row76_col0\" class=\"data row76 col0\" >validmind.model_validation.RougeScore</td>\n", - " <td id=\"T_0502a_row76_col1\" class=\"data row76 col1\" >Rouge Score</td>\n", - " <td id=\"T_0502a_row76_col2\" class=\"data row76 col2\" >Assesses the quality of machine-generated text using ROUGE metrics and visualizes the results to provide...</td>\n", - " <td id=\"T_0502a_row76_col3\" class=\"data row76 col3\" >True</td>\n", - " <td 
id=\"T_0502a_row76_col4\" class=\"data row76 col4\" >True</td>\n", - " <td id=\"T_0502a_row76_col5\" class=\"data row76 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row76_col6\" class=\"data row76 col6\" >{'metric': {'type': 'str', 'default': 'rouge-1'}}</td>\n", - " <td id=\"T_0502a_row76_col7\" class=\"data row76 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row76_col8\" class=\"data row76 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row77_col0\" class=\"data row77 col0\" >validmind.model_validation.TimeSeriesPredictionWithCI</td>\n", - " <td id=\"T_0502a_row77_col1\" class=\"data row77 col1\" >Time Series Prediction With CI</td>\n", - " <td id=\"T_0502a_row77_col2\" class=\"data row77 col2\" >Assesses predictive accuracy and uncertainty in time series models, highlighting breaches beyond confidence...</td>\n", - " <td id=\"T_0502a_row77_col3\" class=\"data row77 col3\" >True</td>\n", - " <td id=\"T_0502a_row77_col4\" class=\"data row77 col4\" >True</td>\n", - " <td id=\"T_0502a_row77_col5\" class=\"data row77 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row77_col6\" class=\"data row77 col6\" >{'confidence': {'type': 'float', 'default': 0.95}}</td>\n", - " <td id=\"T_0502a_row77_col7\" class=\"data row77 col7\" >['model_predictions', 'visualization']</td>\n", - " <td id=\"T_0502a_row77_col8\" class=\"data row77 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row78_col0\" class=\"data row78 col0\" >validmind.model_validation.TimeSeriesPredictionsPlot</td>\n", - " <td id=\"T_0502a_row78_col1\" class=\"data row78 col1\" >Time Series Predictions Plot</td>\n", - " <td id=\"T_0502a_row78_col2\" class=\"data row78 col2\" >Plot actual vs predicted values for time series data and generate a visual comparison for the model....</td>\n", - " <td id=\"T_0502a_row78_col3\" class=\"data row78 
col3\" >True</td>\n", - " <td id=\"T_0502a_row78_col4\" class=\"data row78 col4\" >False</td>\n", - " <td id=\"T_0502a_row78_col5\" class=\"data row78 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row78_col6\" class=\"data row78 col6\" >{}</td>\n", - " <td id=\"T_0502a_row78_col7\" class=\"data row78 col7\" >['model_predictions', 'visualization']</td>\n", - " <td id=\"T_0502a_row78_col8\" class=\"data row78 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row79_col0\" class=\"data row79 col0\" >validmind.model_validation.TimeSeriesR2SquareBySegments</td>\n", - " <td id=\"T_0502a_row79_col1\" class=\"data row79 col1\" >Time Series R2 Square By Segments</td>\n", - " <td id=\"T_0502a_row79_col2\" class=\"data row79 col2\" >Evaluates the R-Squared values of regression models over specified time segments in time series data to assess...</td>\n", - " <td id=\"T_0502a_row79_col3\" class=\"data row79 col3\" >True</td>\n", - " <td id=\"T_0502a_row79_col4\" class=\"data row79 col4\" >True</td>\n", - " <td id=\"T_0502a_row79_col5\" class=\"data row79 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row79_col6\" class=\"data row79 col6\" >{'segments': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row79_col7\" class=\"data row79 col7\" >['model_performance', 'sklearn']</td>\n", - " <td id=\"T_0502a_row79_col8\" class=\"data row79 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row80_col0\" class=\"data row80 col0\" >validmind.model_validation.TokenDisparity</td>\n", - " <td id=\"T_0502a_row80_col1\" class=\"data row80 col1\" >Token Disparity</td>\n", - " <td id=\"T_0502a_row80_col2\" class=\"data row80 col2\" >Evaluates the token disparity between reference and generated texts, visualizing the results through histograms and...</td>\n", - " <td id=\"T_0502a_row80_col3\" class=\"data row80 col3\" >True</td>\n", - " <td 
id=\"T_0502a_row80_col4\" class=\"data row80 col4\" >True</td>\n", - " <td id=\"T_0502a_row80_col5\" class=\"data row80 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row80_col6\" class=\"data row80 col6\" >{}</td>\n", - " <td id=\"T_0502a_row80_col7\" class=\"data row80 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row80_col8\" class=\"data row80 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row81_col0\" class=\"data row81 col0\" >validmind.model_validation.ToxicityScore</td>\n", - " <td id=\"T_0502a_row81_col1\" class=\"data row81 col1\" >Toxicity Score</td>\n", - " <td id=\"T_0502a_row81_col2\" class=\"data row81 col2\" >Assesses the toxicity levels of texts generated by NLP models to identify and mitigate harmful or offensive content....</td>\n", - " <td id=\"T_0502a_row81_col3\" class=\"data row81 col3\" >True</td>\n", - " <td id=\"T_0502a_row81_col4\" class=\"data row81 col4\" >True</td>\n", - " <td id=\"T_0502a_row81_col5\" class=\"data row81 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row81_col6\" class=\"data row81 col6\" >{}</td>\n", - " <td id=\"T_0502a_row81_col7\" class=\"data row81 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row81_col8\" class=\"data row81 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row82_col0\" class=\"data row82 col0\" >validmind.model_validation.embeddings.ClusterDistribution</td>\n", - " <td id=\"T_0502a_row82_col1\" class=\"data row82 col1\" >Cluster Distribution</td>\n", - " <td id=\"T_0502a_row82_col2\" class=\"data row82 col2\" >Assesses the distribution of text embeddings across clusters produced by a model using KMeans clustering....</td>\n", - " <td id=\"T_0502a_row82_col3\" class=\"data row82 col3\" >True</td>\n", - " <td id=\"T_0502a_row82_col4\" class=\"data row82 col4\" >False</td>\n", - " <td 
id=\"T_0502a_row82_col5\" class=\"data row82 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row82_col6\" class=\"data row82 col6\" >{'num_clusters': {'type': 'int', 'default': 5}}</td>\n", - " <td id=\"T_0502a_row82_col7\" class=\"data row82 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row82_col8\" class=\"data row82 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row83_col0\" class=\"data row83 col0\" >validmind.model_validation.embeddings.CosineSimilarityComparison</td>\n", - " <td id=\"T_0502a_row83_col1\" class=\"data row83 col1\" >Cosine Similarity Comparison</td>\n", - " <td id=\"T_0502a_row83_col2\" class=\"data row83 col2\" >Assesses the similarity between embeddings generated by different models using Cosine Similarity, providing both...</td>\n", - " <td id=\"T_0502a_row83_col3\" class=\"data row83 col3\" >True</td>\n", - " <td id=\"T_0502a_row83_col4\" class=\"data row83 col4\" >True</td>\n", - " <td id=\"T_0502a_row83_col5\" class=\"data row83 col5\" >['dataset', 'models']</td>\n", - " <td id=\"T_0502a_row83_col6\" class=\"data row83 col6\" >{}</td>\n", - " <td id=\"T_0502a_row83_col7\" class=\"data row83 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row83_col8\" class=\"data row83 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row84_col0\" class=\"data row84 col0\" >validmind.model_validation.embeddings.CosineSimilarityDistribution</td>\n", - " <td id=\"T_0502a_row84_col1\" class=\"data row84 col1\" >Cosine Similarity Distribution</td>\n", - " <td id=\"T_0502a_row84_col2\" class=\"data row84 col2\" >Assesses the similarity between predicted text embeddings from a model using a Cosine Similarity distribution...</td>\n", - " <td id=\"T_0502a_row84_col3\" class=\"data row84 col3\" >True</td>\n", - " <td id=\"T_0502a_row84_col4\" 
class=\"data row84 col4\" >False</td>\n", - " <td id=\"T_0502a_row84_col5\" class=\"data row84 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row84_col6\" class=\"data row84 col6\" >{}</td>\n", - " <td id=\"T_0502a_row84_col7\" class=\"data row84 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row84_col8\" class=\"data row84 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row85_col0\" class=\"data row85 col0\" >validmind.model_validation.embeddings.CosineSimilarityHeatmap</td>\n", - " <td id=\"T_0502a_row85_col1\" class=\"data row85 col1\" >Cosine Similarity Heatmap</td>\n", - " <td id=\"T_0502a_row85_col2\" class=\"data row85 col2\" >Generates an interactive heatmap to visualize the cosine similarities among embeddings derived from a given model....</td>\n", - " <td id=\"T_0502a_row85_col3\" class=\"data row85 col3\" >True</td>\n", - " <td id=\"T_0502a_row85_col4\" class=\"data row85 col4\" >False</td>\n", - " <td id=\"T_0502a_row85_col5\" class=\"data row85 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row85_col6\" class=\"data row85 col6\" >{'title': {'type': '_empty', 'default': 'Cosine Similarity Matrix'}, 'color': {'type': '_empty', 'default': 'Cosine Similarity'}, 'xaxis_title': {'type': '_empty', 'default': 'Index'}, 'yaxis_title': {'type': '_empty', 'default': 'Index'}, 'color_scale': {'type': '_empty', 'default': 'Blues'}}</td>\n", - " <td id=\"T_0502a_row85_col7\" class=\"data row85 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row85_col8\" class=\"data row85 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row86_col0\" class=\"data row86 col0\" >validmind.model_validation.embeddings.DescriptiveAnalytics</td>\n", - " <td id=\"T_0502a_row86_col1\" class=\"data row86 col1\" >Descriptive Analytics</td>\n", - " <td 
id=\"T_0502a_row86_col2\" class=\"data row86 col2\" >Evaluates statistical properties of text embeddings in an ML model via mean, median, and standard deviation...</td>\n", - " <td id=\"T_0502a_row86_col3\" class=\"data row86 col3\" >True</td>\n", - " <td id=\"T_0502a_row86_col4\" class=\"data row86 col4\" >False</td>\n", - " <td id=\"T_0502a_row86_col5\" class=\"data row86 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row86_col6\" class=\"data row86 col6\" >{}</td>\n", - " <td id=\"T_0502a_row86_col7\" class=\"data row86 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row86_col8\" class=\"data row86 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row87_col0\" class=\"data row87 col0\" >validmind.model_validation.embeddings.EmbeddingsVisualization2D</td>\n", - " <td id=\"T_0502a_row87_col1\" class=\"data row87 col1\" >Embeddings Visualization2 D</td>\n", - " <td id=\"T_0502a_row87_col2\" class=\"data row87 col2\" >Visualizes 2D representation of text embeddings generated by a model using t-SNE technique....</td>\n", - " <td id=\"T_0502a_row87_col3\" class=\"data row87 col3\" >True</td>\n", - " <td id=\"T_0502a_row87_col4\" class=\"data row87 col4\" >False</td>\n", - " <td id=\"T_0502a_row87_col5\" class=\"data row87 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row87_col6\" class=\"data row87 col6\" >{'cluster_column': {'type': None, 'default': None}, 'perplexity': {'type': 'int', 'default': 30}}</td>\n", - " <td id=\"T_0502a_row87_col7\" class=\"data row87 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row87_col8\" class=\"data row87 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row88_col0\" class=\"data row88 col0\" >validmind.model_validation.embeddings.EuclideanDistanceComparison</td>\n", - " <td id=\"T_0502a_row88_col1\" class=\"data row88 col1\" >Euclidean Distance 
Comparison</td>\n", - " <td id=\"T_0502a_row88_col2\" class=\"data row88 col2\" >Assesses and visualizes the dissimilarity between model embeddings using Euclidean distance, providing insights...</td>\n", - " <td id=\"T_0502a_row88_col3\" class=\"data row88 col3\" >True</td>\n", - " <td id=\"T_0502a_row88_col4\" class=\"data row88 col4\" >True</td>\n", - " <td id=\"T_0502a_row88_col5\" class=\"data row88 col5\" >['dataset', 'models']</td>\n", - " <td id=\"T_0502a_row88_col6\" class=\"data row88 col6\" >{}</td>\n", - " <td id=\"T_0502a_row88_col7\" class=\"data row88 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row88_col8\" class=\"data row88 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row89_col0\" class=\"data row89 col0\" >validmind.model_validation.embeddings.EuclideanDistanceHeatmap</td>\n", - " <td id=\"T_0502a_row89_col1\" class=\"data row89 col1\" >Euclidean Distance Heatmap</td>\n", - " <td id=\"T_0502a_row89_col2\" class=\"data row89 col2\" >Generates an interactive heatmap to visualize the Euclidean distances among embeddings derived from a given model....</td>\n", - " <td id=\"T_0502a_row89_col3\" class=\"data row89 col3\" >True</td>\n", - " <td id=\"T_0502a_row89_col4\" class=\"data row89 col4\" >False</td>\n", - " <td id=\"T_0502a_row89_col5\" class=\"data row89 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row89_col6\" class=\"data row89 col6\" >{'title': {'type': '_empty', 'default': 'Euclidean Distance Matrix'}, 'color': {'type': '_empty', 'default': 'Euclidean Distance'}, 'xaxis_title': {'type': '_empty', 'default': 'Index'}, 'yaxis_title': {'type': '_empty', 'default': 'Index'}, 'color_scale': {'type': '_empty', 'default': 'Blues'}}</td>\n", - " <td id=\"T_0502a_row89_col7\" class=\"data row89 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row89_col8\" 
class=\"data row89 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row90_col0\" class=\"data row90 col0\" >validmind.model_validation.embeddings.PCAComponentsPairwisePlots</td>\n", - " <td id=\"T_0502a_row90_col1\" class=\"data row90 col1\" >PCA Components Pairwise Plots</td>\n", - " <td id=\"T_0502a_row90_col2\" class=\"data row90 col2\" >Generates scatter plots for pairwise combinations of principal component analysis (PCA) components of model...</td>\n", - " <td id=\"T_0502a_row90_col3\" class=\"data row90 col3\" >True</td>\n", - " <td id=\"T_0502a_row90_col4\" class=\"data row90 col4\" >False</td>\n", - " <td id=\"T_0502a_row90_col5\" class=\"data row90 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row90_col6\" class=\"data row90 col6\" >{'n_components': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_0502a_row90_col7\" class=\"data row90 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row90_col8\" class=\"data row90 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row91_col0\" class=\"data row91 col0\" >validmind.model_validation.embeddings.StabilityAnalysisKeyword</td>\n", - " <td id=\"T_0502a_row91_col1\" class=\"data row91 col1\" >Stability Analysis Keyword</td>\n", - " <td id=\"T_0502a_row91_col2\" class=\"data row91 col2\" >Evaluates robustness of embedding models to keyword swaps in the test dataset....</td>\n", - " <td id=\"T_0502a_row91_col3\" class=\"data row91 col3\" >True</td>\n", - " <td id=\"T_0502a_row91_col4\" class=\"data row91 col4\" >True</td>\n", - " <td id=\"T_0502a_row91_col5\" class=\"data row91 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row91_col6\" class=\"data row91 col6\" >{'keyword_dict': {'type': None, 'default': None}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td 
id=\"T_0502a_row91_col7\" class=\"data row91 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row91_col8\" class=\"data row91 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row92_col0\" class=\"data row92 col0\" >validmind.model_validation.embeddings.StabilityAnalysisRandomNoise</td>\n", - " <td id=\"T_0502a_row92_col1\" class=\"data row92 col1\" >Stability Analysis Random Noise</td>\n", - " <td id=\"T_0502a_row92_col2\" class=\"data row92 col2\" >Assesses the robustness of text embeddings models to random noise introduced via text perturbations....</td>\n", - " <td id=\"T_0502a_row92_col3\" class=\"data row92 col3\" >True</td>\n", - " <td id=\"T_0502a_row92_col4\" class=\"data row92 col4\" >True</td>\n", - " <td id=\"T_0502a_row92_col5\" class=\"data row92 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row92_col6\" class=\"data row92 col6\" >{'probability': {'type': 'float', 'default': 0.02}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_0502a_row92_col7\" class=\"data row92 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row92_col8\" class=\"data row92 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row93_col0\" class=\"data row93 col0\" >validmind.model_validation.embeddings.StabilityAnalysisSynonyms</td>\n", - " <td id=\"T_0502a_row93_col1\" class=\"data row93 col1\" >Stability Analysis Synonyms</td>\n", - " <td id=\"T_0502a_row93_col2\" class=\"data row93 col2\" >Evaluates the stability of text embeddings models when words in test data are replaced by their synonyms randomly....</td>\n", - " <td id=\"T_0502a_row93_col3\" class=\"data row93 col3\" >True</td>\n", - " <td id=\"T_0502a_row93_col4\" class=\"data row93 col4\" >True</td>\n", - " <td id=\"T_0502a_row93_col5\" class=\"data row93 col5\" >['dataset', 'model']</td>\n", - " <td 
id=\"T_0502a_row93_col6\" class=\"data row93 col6\" >{'probability': {'type': 'float', 'default': 0.02}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_0502a_row93_col7\" class=\"data row93 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row93_col8\" class=\"data row93 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row94_col0\" class=\"data row94 col0\" >validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", - " <td id=\"T_0502a_row94_col1\" class=\"data row94 col1\" >Stability Analysis Translation</td>\n", - " <td id=\"T_0502a_row94_col2\" class=\"data row94 col2\" >Evaluates robustness of text embeddings models to noise introduced by translating the original text to another...</td>\n", - " <td id=\"T_0502a_row94_col3\" class=\"data row94 col3\" >True</td>\n", - " <td id=\"T_0502a_row94_col4\" class=\"data row94 col4\" >True</td>\n", - " <td id=\"T_0502a_row94_col5\" class=\"data row94 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row94_col6\" class=\"data row94 col6\" >{'source_lang': {'type': 'str', 'default': 'en'}, 'target_lang': {'type': 'str', 'default': 'fr'}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_0502a_row94_col7\" class=\"data row94 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row94_col8\" class=\"data row94 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row95_col0\" class=\"data row95 col0\" >validmind.model_validation.embeddings.TSNEComponentsPairwisePlots</td>\n", - " <td id=\"T_0502a_row95_col1\" class=\"data row95 col1\" >TSNE Components Pairwise Plots</td>\n", - " <td id=\"T_0502a_row95_col2\" class=\"data row95 col2\" >Creates scatter plots for pairwise combinations of t-SNE components to visualize embeddings and highlight potential...</td>\n", - " <td 
id=\"T_0502a_row95_col3\" class=\"data row95 col3\" >True</td>\n", - " <td id=\"T_0502a_row95_col4\" class=\"data row95 col4\" >False</td>\n", - " <td id=\"T_0502a_row95_col5\" class=\"data row95 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row95_col6\" class=\"data row95 col6\" >{'n_components': {'type': 'int', 'default': 2}, 'perplexity': {'type': 'int', 'default': 30}, 'title': {'type': 'str', 'default': 't-SNE'}}</td>\n", - " <td id=\"T_0502a_row95_col7\" class=\"data row95 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row95_col8\" class=\"data row95 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row96_col0\" class=\"data row96 col0\" >validmind.model_validation.ragas.AnswerCorrectness</td>\n", - " <td id=\"T_0502a_row96_col1\" class=\"data row96 col1\" >Answer Correctness</td>\n", - " <td id=\"T_0502a_row96_col2\" class=\"data row96 col2\" >Evaluates the correctness of answers in a dataset with respect to the provided ground...</td>\n", - " <td id=\"T_0502a_row96_col3\" class=\"data row96 col3\" >True</td>\n", - " <td id=\"T_0502a_row96_col4\" class=\"data row96 col4\" >True</td>\n", - " <td id=\"T_0502a_row96_col5\" class=\"data row96 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row96_col6\" class=\"data row96 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row96_col7\" class=\"data row96 col7\" >['ragas', 'llm']</td>\n", - " <td id=\"T_0502a_row96_col8\" class=\"data row96 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row97_col0\" class=\"data row97 col0\" 
>validmind.model_validation.ragas.AspectCritic</td>\n", - " <td id=\"T_0502a_row97_col1\" class=\"data row97 col1\" >Aspect Critic</td>\n", - " <td id=\"T_0502a_row97_col2\" class=\"data row97 col2\" >Evaluates generations against the following aspects: harmfulness, maliciousness,...</td>\n", - " <td id=\"T_0502a_row97_col3\" class=\"data row97 col3\" >True</td>\n", - " <td id=\"T_0502a_row97_col4\" class=\"data row97 col4\" >True</td>\n", - " <td id=\"T_0502a_row97_col5\" class=\"data row97 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row97_col6\" class=\"data row97 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': None, 'default': None}, 'aspects': {'type': None, 'default': ['coherence', 'conciseness', 'correctness', 'harmfulness', 'maliciousness']}, 'additional_aspects': {'type': None, 'default': None}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row97_col7\" class=\"data row97 col7\" >['ragas', 'llm', 'qualitative']</td>\n", - " <td id=\"T_0502a_row97_col8\" class=\"data row97 col8\" >['text_summarization', 'text_generation', 'text_qa']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row98_col0\" class=\"data row98 col0\" >validmind.model_validation.ragas.ContextEntityRecall</td>\n", - " <td id=\"T_0502a_row98_col1\" class=\"data row98 col1\" >Context Entity Recall</td>\n", - " <td id=\"T_0502a_row98_col2\" class=\"data row98 col2\" >Evaluates the context entity recall for dataset entries and visualizes the results....</td>\n", - " <td id=\"T_0502a_row98_col3\" class=\"data row98 col3\" >True</td>\n", - " <td id=\"T_0502a_row98_col4\" class=\"data row98 col4\" >True</td>\n", - " <td id=\"T_0502a_row98_col5\" class=\"data row98 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row98_col6\" class=\"data row98 col6\" 
>{'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row98_col7\" class=\"data row98 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", - " <td id=\"T_0502a_row98_col8\" class=\"data row98 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row99_col0\" class=\"data row99 col0\" >validmind.model_validation.ragas.ContextPrecision</td>\n", - " <td id=\"T_0502a_row99_col1\" class=\"data row99 col1\" >Context Precision</td>\n", - " <td id=\"T_0502a_row99_col2\" class=\"data row99 col2\" >Context Precision is a metric that evaluates whether all of the ground-truth...</td>\n", - " <td id=\"T_0502a_row99_col3\" class=\"data row99 col3\" >True</td>\n", - " <td id=\"T_0502a_row99_col4\" class=\"data row99 col4\" >True</td>\n", - " <td id=\"T_0502a_row99_col5\" class=\"data row99 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row99_col6\" class=\"data row99 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row99_col7\" class=\"data row99 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", - " <td id=\"T_0502a_row99_col8\" class=\"data row99 col8\" >['text_qa', 'text_generation', 'text_summarization', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row100_col0\" class=\"data row100 col0\" >validmind.model_validation.ragas.ContextPrecisionWithoutReference</td>\n", - " <td id=\"T_0502a_row100_col1\" class=\"data row100 col1\" >Context Precision 
Without Reference</td>\n", - " <td id=\"T_0502a_row100_col2\" class=\"data row100 col2\" >Context Precision Without Reference is a metric used to evaluate the relevance of...</td>\n", - " <td id=\"T_0502a_row100_col3\" class=\"data row100 col3\" >True</td>\n", - " <td id=\"T_0502a_row100_col4\" class=\"data row100 col4\" >True</td>\n", - " <td id=\"T_0502a_row100_col5\" class=\"data row100 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row100_col6\" class=\"data row100 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'response_column': {'type': 'str', 'default': 'response'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row100_col7\" class=\"data row100 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", - " <td id=\"T_0502a_row100_col8\" class=\"data row100 col8\" >['text_qa', 'text_generation', 'text_summarization', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row101_col0\" class=\"data row101 col0\" >validmind.model_validation.ragas.ContextRecall</td>\n", - " <td id=\"T_0502a_row101_col1\" class=\"data row101 col1\" >Context Recall</td>\n", - " <td id=\"T_0502a_row101_col2\" class=\"data row101 col2\" >Context recall measures the extent to which the retrieved context aligns with the...</td>\n", - " <td id=\"T_0502a_row101_col3\" class=\"data row101 col3\" >True</td>\n", - " <td id=\"T_0502a_row101_col4\" class=\"data row101 col4\" >True</td>\n", - " <td id=\"T_0502a_row101_col5\" class=\"data row101 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row101_col6\" class=\"data row101 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 
'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row101_col7\" class=\"data row101 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", - " <td id=\"T_0502a_row101_col8\" class=\"data row101 col8\" >['text_qa', 'text_generation', 'text_summarization', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row102_col0\" class=\"data row102 col0\" >validmind.model_validation.ragas.Faithfulness</td>\n", - " <td id=\"T_0502a_row102_col1\" class=\"data row102 col1\" >Faithfulness</td>\n", - " <td id=\"T_0502a_row102_col2\" class=\"data row102 col2\" >Evaluates the faithfulness of the generated answers with respect to retrieved contexts....</td>\n", - " <td id=\"T_0502a_row102_col3\" class=\"data row102 col3\" >True</td>\n", - " <td id=\"T_0502a_row102_col4\" class=\"data row102 col4\" >True</td>\n", - " <td id=\"T_0502a_row102_col5\" class=\"data row102 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row102_col6\" class=\"data row102 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row102_col7\" class=\"data row102 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", - " <td id=\"T_0502a_row102_col8\" class=\"data row102 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row103_col0\" class=\"data row103 col0\" >validmind.model_validation.ragas.NoiseSensitivity</td>\n", - " <td id=\"T_0502a_row103_col1\" class=\"data row103 col1\" >Noise Sensitivity</td>\n", - " <td id=\"T_0502a_row103_col2\" class=\"data row103 col2\" >Assesses the sensitivity of a Large Language Model (LLM) to noise in retrieved context by measuring how 
often it...</td>\n", - " <td id=\"T_0502a_row103_col3\" class=\"data row103 col3\" >True</td>\n", - " <td id=\"T_0502a_row103_col4\" class=\"data row103 col4\" >True</td>\n", - " <td id=\"T_0502a_row103_col5\" class=\"data row103 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row103_col6\" class=\"data row103 col6\" >{'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'focus': {'type': 'str', 'default': 'relevant'}, 'user_input_column': {'type': 'str', 'default': 'user_input'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row103_col7\" class=\"data row103 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", - " <td id=\"T_0502a_row103_col8\" class=\"data row103 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row104_col0\" class=\"data row104 col0\" >validmind.model_validation.ragas.ResponseRelevancy</td>\n", - " <td id=\"T_0502a_row104_col1\" class=\"data row104 col1\" >Response Relevancy</td>\n", - " <td id=\"T_0502a_row104_col2\" class=\"data row104 col2\" >Assesses how pertinent the generated answer is to the given prompt....</td>\n", - " <td id=\"T_0502a_row104_col3\" class=\"data row104 col3\" >True</td>\n", - " <td id=\"T_0502a_row104_col4\" class=\"data row104 col4\" >True</td>\n", - " <td id=\"T_0502a_row104_col5\" class=\"data row104 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row104_col6\" class=\"data row104 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': None}, 'response_column': {'type': 'str', 'default': 'response'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td 
id=\"T_0502a_row104_col7\" class=\"data row104 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", - " <td id=\"T_0502a_row104_col8\" class=\"data row104 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row105_col0\" class=\"data row105 col0\" >validmind.model_validation.ragas.SemanticSimilarity</td>\n", - " <td id=\"T_0502a_row105_col1\" class=\"data row105 col1\" >Semantic Similarity</td>\n", - " <td id=\"T_0502a_row105_col2\" class=\"data row105 col2\" >Calculates the semantic similarity between generated responses and ground truths...</td>\n", - " <td id=\"T_0502a_row105_col3\" class=\"data row105 col3\" >True</td>\n", - " <td id=\"T_0502a_row105_col4\" class=\"data row105 col4\" >True</td>\n", - " <td id=\"T_0502a_row105_col5\" class=\"data row105 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row105_col6\" class=\"data row105 col6\" >{'response_column': {'type': 'str', 'default': 'response'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row105_col7\" class=\"data row105 col7\" >['ragas', 'llm']</td>\n", - " <td id=\"T_0502a_row105_col8\" class=\"data row105 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row106_col0\" class=\"data row106 col0\" >validmind.model_validation.sklearn.AdjustedMutualInformation</td>\n", - " <td id=\"T_0502a_row106_col1\" class=\"data row106 col1\" >Adjusted Mutual Information</td>\n", - " <td id=\"T_0502a_row106_col2\" class=\"data row106 col2\" >Evaluates clustering model performance by measuring mutual information between true and predicted labels, adjusting...</td>\n", - " <td id=\"T_0502a_row106_col3\" class=\"data row106 col3\" >False</td>\n", - " <td id=\"T_0502a_row106_col4\" class=\"data row106 col4\" >True</td>\n", - " <td 
id=\"T_0502a_row106_col5\" class=\"data row106 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row106_col6\" class=\"data row106 col6\" >{}</td>\n", - " <td id=\"T_0502a_row106_col7\" class=\"data row106 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_0502a_row106_col8\" class=\"data row106 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row107_col0\" class=\"data row107 col0\" >validmind.model_validation.sklearn.AdjustedRandIndex</td>\n", - " <td id=\"T_0502a_row107_col1\" class=\"data row107 col1\" >Adjusted Rand Index</td>\n", - " <td id=\"T_0502a_row107_col2\" class=\"data row107 col2\" >Measures the similarity between two data clusters using the Adjusted Rand Index (ARI) metric in clustering machine...</td>\n", - " <td id=\"T_0502a_row107_col3\" class=\"data row107 col3\" >False</td>\n", - " <td id=\"T_0502a_row107_col4\" class=\"data row107 col4\" >True</td>\n", - " <td id=\"T_0502a_row107_col5\" class=\"data row107 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row107_col6\" class=\"data row107 col6\" >{}</td>\n", - " <td id=\"T_0502a_row107_col7\" class=\"data row107 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_0502a_row107_col8\" class=\"data row107 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row108_col0\" class=\"data row108 col0\" >validmind.model_validation.sklearn.CalibrationCurve</td>\n", - " <td id=\"T_0502a_row108_col1\" class=\"data row108 col1\" >Calibration Curve</td>\n", - " <td id=\"T_0502a_row108_col2\" class=\"data row108 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", - " <td id=\"T_0502a_row108_col3\" class=\"data row108 col3\" >True</td>\n", - " <td id=\"T_0502a_row108_col4\" class=\"data row108 col4\" >False</td>\n", - " <td id=\"T_0502a_row108_col5\" class=\"data row108 col5\" >['model', 'dataset']</td>\n", - 
" <td id=\"T_0502a_row108_col6\" class=\"data row108 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_0502a_row108_col7\" class=\"data row108 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", - " <td id=\"T_0502a_row108_col8\" class=\"data row108 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row109_col0\" class=\"data row109 col0\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", - " <td id=\"T_0502a_row109_col1\" class=\"data row109 col1\" >Classifier Performance</td>\n", - " <td id=\"T_0502a_row109_col2\" class=\"data row109 col2\" >Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", - " <td id=\"T_0502a_row109_col3\" class=\"data row109 col3\" >False</td>\n", - " <td id=\"T_0502a_row109_col4\" class=\"data row109 col4\" >True</td>\n", - " <td id=\"T_0502a_row109_col5\" class=\"data row109 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row109_col6\" class=\"data row109 col6\" >{'average': {'type': 'str', 'default': 'macro'}}</td>\n", - " <td id=\"T_0502a_row109_col7\" class=\"data row109 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row109_col8\" class=\"data row109 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row110_col0\" class=\"data row110 col0\" >validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", - " <td id=\"T_0502a_row110_col1\" class=\"data row110 col1\" >Classifier Threshold Optimization</td>\n", - " <td id=\"T_0502a_row110_col2\" class=\"data row110 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification models....</td>\n", - " <td id=\"T_0502a_row110_col3\" class=\"data row110 col3\" >False</td>\n", - " <td id=\"T_0502a_row110_col4\" class=\"data row110 col4\" 
>True</td>\n", - " <td id=\"T_0502a_row110_col5\" class=\"data row110 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row110_col6\" class=\"data row110 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row110_col7\" class=\"data row110 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", - " <td id=\"T_0502a_row110_col8\" class=\"data row110 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row111_col0\" class=\"data row111 col0\" >validmind.model_validation.sklearn.ClusterCosineSimilarity</td>\n", - " <td id=\"T_0502a_row111_col1\" class=\"data row111 col1\" >Cluster Cosine Similarity</td>\n", - " <td id=\"T_0502a_row111_col2\" class=\"data row111 col2\" >Measures the intra-cluster similarity of a clustering model using cosine similarity....</td>\n", - " <td id=\"T_0502a_row111_col3\" class=\"data row111 col3\" >False</td>\n", - " <td id=\"T_0502a_row111_col4\" class=\"data row111 col4\" >True</td>\n", - " <td id=\"T_0502a_row111_col5\" class=\"data row111 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row111_col6\" class=\"data row111 col6\" >{}</td>\n", - " <td id=\"T_0502a_row111_col7\" class=\"data row111 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_0502a_row111_col8\" class=\"data row111 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row112_col0\" class=\"data row112 col0\" >validmind.model_validation.sklearn.ClusterPerformanceMetrics</td>\n", - " <td id=\"T_0502a_row112_col1\" class=\"data row112 col1\" >Cluster Performance Metrics</td>\n", - " <td id=\"T_0502a_row112_col2\" class=\"data row112 col2\" >Evaluates the performance of clustering machine learning models using multiple established metrics....</td>\n", - " <td id=\"T_0502a_row112_col3\" class=\"data row112 col3\" >False</td>\n", - " <td 
id=\"T_0502a_row112_col4\" class=\"data row112 col4\" >True</td>\n", - " <td id=\"T_0502a_row112_col5\" class=\"data row112 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row112_col6\" class=\"data row112 col6\" >{}</td>\n", - " <td id=\"T_0502a_row112_col7\" class=\"data row112 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_0502a_row112_col8\" class=\"data row112 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row113_col0\" class=\"data row113 col0\" >validmind.model_validation.sklearn.CompletenessScore</td>\n", - " <td id=\"T_0502a_row113_col1\" class=\"data row113 col1\" >Completeness Score</td>\n", - " <td id=\"T_0502a_row113_col2\" class=\"data row113 col2\" >Evaluates a clustering model's capacity to categorize instances from a single class into the same cluster....</td>\n", - " <td id=\"T_0502a_row113_col3\" class=\"data row113 col3\" >False</td>\n", - " <td id=\"T_0502a_row113_col4\" class=\"data row113 col4\" >True</td>\n", - " <td id=\"T_0502a_row113_col5\" class=\"data row113 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row113_col6\" class=\"data row113 col6\" >{}</td>\n", - " <td id=\"T_0502a_row113_col7\" class=\"data row113 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_0502a_row113_col8\" class=\"data row113 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row114_col0\" class=\"data row114 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td id=\"T_0502a_row114_col1\" class=\"data row114 col1\" >Confusion Matrix</td>\n", - " <td id=\"T_0502a_row114_col2\" class=\"data row114 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", - " <td id=\"T_0502a_row114_col3\" class=\"data row114 col3\" >True</td>\n", - " <td id=\"T_0502a_row114_col4\" class=\"data row114 col4\" >False</td>\n", - " <td 
id=\"T_0502a_row114_col5\" class=\"data row114 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row114_col6\" class=\"data row114 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_0502a_row114_col7\" class=\"data row114 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row114_col8\" class=\"data row114 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row115_col0\" class=\"data row115 col0\" >validmind.model_validation.sklearn.FeatureImportance</td>\n", - " <td id=\"T_0502a_row115_col1\" class=\"data row115 col1\" >Feature Importance</td>\n", - " <td id=\"T_0502a_row115_col2\" class=\"data row115 col2\" >Compute feature importance scores for a given model and generate a summary table...</td>\n", - " <td id=\"T_0502a_row115_col3\" class=\"data row115 col3\" >False</td>\n", - " <td id=\"T_0502a_row115_col4\" class=\"data row115 col4\" >True</td>\n", - " <td id=\"T_0502a_row115_col5\" class=\"data row115 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row115_col6\" class=\"data row115 col6\" >{'num_features': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_0502a_row115_col7\" class=\"data row115 col7\" >['model_explainability', 'sklearn']</td>\n", - " <td id=\"T_0502a_row115_col8\" class=\"data row115 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row116_col0\" class=\"data row116 col0\" >validmind.model_validation.sklearn.FowlkesMallowsScore</td>\n", - " <td id=\"T_0502a_row116_col1\" class=\"data row116 col1\" >Fowlkes Mallows Score</td>\n", - " <td id=\"T_0502a_row116_col2\" class=\"data row116 col2\" >Evaluates the similarity between predicted and actual cluster assignments in a model using the Fowlkes-Mallows...</td>\n", - " <td id=\"T_0502a_row116_col3\" class=\"data row116 col3\" >False</td>\n", - 
" <td id=\"T_0502a_row116_col4\" class=\"data row116 col4\" >True</td>\n", - " <td id=\"T_0502a_row116_col5\" class=\"data row116 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row116_col6\" class=\"data row116 col6\" >{}</td>\n", - " <td id=\"T_0502a_row116_col7\" class=\"data row116 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row116_col8\" class=\"data row116 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row117_col0\" class=\"data row117 col0\" >validmind.model_validation.sklearn.HomogeneityScore</td>\n", - " <td id=\"T_0502a_row117_col1\" class=\"data row117 col1\" >Homogeneity Score</td>\n", - " <td id=\"T_0502a_row117_col2\" class=\"data row117 col2\" >Assesses clustering homogeneity by comparing true and predicted labels, scoring from 0 (heterogeneous) to 1...</td>\n", - " <td id=\"T_0502a_row117_col3\" class=\"data row117 col3\" >False</td>\n", - " <td id=\"T_0502a_row117_col4\" class=\"data row117 col4\" >True</td>\n", - " <td id=\"T_0502a_row117_col5\" class=\"data row117 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row117_col6\" class=\"data row117 col6\" >{}</td>\n", - " <td id=\"T_0502a_row117_col7\" class=\"data row117 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row117_col8\" class=\"data row117 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row118_col0\" class=\"data row118 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", - " <td id=\"T_0502a_row118_col1\" class=\"data row118 col1\" >Hyper Parameters Tuning</td>\n", - " <td id=\"T_0502a_row118_col2\" class=\"data row118 col2\" >Performs exhaustive grid search over specified parameter ranges to find optimal model configurations...</td>\n", - " <td id=\"T_0502a_row118_col3\" class=\"data row118 col3\" >False</td>\n", - " <td id=\"T_0502a_row118_col4\" class=\"data row118 col4\" >True</td>\n", - " <td id=\"T_0502a_row118_col5\" 
class=\"data row118 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row118_col6\" class=\"data row118 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", - " <td id=\"T_0502a_row118_col7\" class=\"data row118 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row118_col8\" class=\"data row118 col8\" >['clustering', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row119_col0\" class=\"data row119 col0\" >validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", - " <td id=\"T_0502a_row119_col1\" class=\"data row119 col1\" >K Means Clusters Optimization</td>\n", - " <td id=\"T_0502a_row119_col2\" class=\"data row119 col2\" >Optimizes the number of clusters in K-means models using Elbow and Silhouette methods....</td>\n", - " <td id=\"T_0502a_row119_col3\" class=\"data row119 col3\" >True</td>\n", - " <td id=\"T_0502a_row119_col4\" class=\"data row119 col4\" >False</td>\n", - " <td id=\"T_0502a_row119_col5\" class=\"data row119 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row119_col6\" class=\"data row119 col6\" >{'n_clusters': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row119_col7\" class=\"data row119 col7\" >['sklearn', 'model_performance', 'kmeans']</td>\n", - " <td id=\"T_0502a_row119_col8\" class=\"data row119 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row120_col0\" class=\"data row120 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", - " <td id=\"T_0502a_row120_col1\" class=\"data row120 col1\" >Minimum Accuracy</td>\n", - " <td id=\"T_0502a_row120_col2\" class=\"data row120 col2\" >Checks if the model's prediction accuracy meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_0502a_row120_col3\" class=\"data row120 col3\" 
>False</td>\n", - " <td id=\"T_0502a_row120_col4\" class=\"data row120 col4\" >True</td>\n", - " <td id=\"T_0502a_row120_col5\" class=\"data row120 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row120_col6\" class=\"data row120 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_0502a_row120_col7\" class=\"data row120 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row120_col8\" class=\"data row120 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row121_col0\" class=\"data row121 col0\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", - " <td id=\"T_0502a_row121_col1\" class=\"data row121 col1\" >Minimum F1 Score</td>\n", - " <td id=\"T_0502a_row121_col2\" class=\"data row121 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", - " <td id=\"T_0502a_row121_col3\" class=\"data row121 col3\" >False</td>\n", - " <td id=\"T_0502a_row121_col4\" class=\"data row121 col4\" >True</td>\n", - " <td id=\"T_0502a_row121_col5\" class=\"data row121 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row121_col6\" class=\"data row121 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_0502a_row121_col7\" class=\"data row121 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row121_col8\" class=\"data row121 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row122_col0\" class=\"data row122 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", - " <td id=\"T_0502a_row122_col1\" class=\"data row122 col1\" >Minimum ROCAUC Score</td>\n", - " <td id=\"T_0502a_row122_col2\" class=\"data row122 col2\" >Validates model by checking if the 
ROC AUC score meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_0502a_row122_col3\" class=\"data row122 col3\" >False</td>\n", - " <td id=\"T_0502a_row122_col4\" class=\"data row122 col4\" >True</td>\n", - " <td id=\"T_0502a_row122_col5\" class=\"data row122 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row122_col6\" class=\"data row122 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_0502a_row122_col7\" class=\"data row122 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row122_col8\" class=\"data row122 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row123_col0\" class=\"data row123 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", - " <td id=\"T_0502a_row123_col1\" class=\"data row123 col1\" >Model Parameters</td>\n", - " <td id=\"T_0502a_row123_col2\" class=\"data row123 col2\" >Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", - " <td id=\"T_0502a_row123_col3\" class=\"data row123 col3\" >False</td>\n", - " <td id=\"T_0502a_row123_col4\" class=\"data row123 col4\" >True</td>\n", - " <td id=\"T_0502a_row123_col5\" class=\"data row123 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row123_col6\" class=\"data row123 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row123_col7\" class=\"data row123 col7\" >['model_training', 'metadata']</td>\n", - " <td id=\"T_0502a_row123_col8\" class=\"data row123 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row124_col0\" class=\"data row124 col0\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", - " <td id=\"T_0502a_row124_col1\" class=\"data row124 col1\" >Models Performance Comparison</td>\n", - " <td id=\"T_0502a_row124_col2\" 
class=\"data row124 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", - " <td id=\"T_0502a_row124_col3\" class=\"data row124 col3\" >False</td>\n", - " <td id=\"T_0502a_row124_col4\" class=\"data row124 col4\" >True</td>\n", - " <td id=\"T_0502a_row124_col5\" class=\"data row124 col5\" >['dataset', 'models']</td>\n", - " <td id=\"T_0502a_row124_col6\" class=\"data row124 col6\" >{}</td>\n", - " <td id=\"T_0502a_row124_col7\" class=\"data row124 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", - " <td id=\"T_0502a_row124_col8\" class=\"data row124 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row125_col0\" class=\"data row125 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", - " <td id=\"T_0502a_row125_col1\" class=\"data row125 col1\" >Overfit Diagnosis</td>\n", - " <td id=\"T_0502a_row125_col2\" class=\"data row125 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", - " <td id=\"T_0502a_row125_col3\" class=\"data row125 col3\" >True</td>\n", - " <td id=\"T_0502a_row125_col4\" class=\"data row125 col4\" >True</td>\n", - " <td id=\"T_0502a_row125_col5\" class=\"data row125 col5\" >['model', 'datasets']</td>\n", - " <td id=\"T_0502a_row125_col6\" class=\"data row125 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': {'type': 'float', 'default': 0.04}}</td>\n", - " <td id=\"T_0502a_row125_col7\" class=\"data row125 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", - " <td id=\"T_0502a_row125_col8\" class=\"data row125 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row126_col0\" class=\"data row126 col0\" 
>validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", - " <td id=\"T_0502a_row126_col1\" class=\"data row126 col1\" >Permutation Feature Importance</td>\n", - " <td id=\"T_0502a_row126_col2\" class=\"data row126 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", - " <td id=\"T_0502a_row126_col3\" class=\"data row126 col3\" >True</td>\n", - " <td id=\"T_0502a_row126_col4\" class=\"data row126 col4\" >False</td>\n", - " <td id=\"T_0502a_row126_col5\" class=\"data row126 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row126_col6\" class=\"data row126 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row126_col7\" class=\"data row126 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_0502a_row126_col8\" class=\"data row126 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row127_col0\" class=\"data row127 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", - " <td id=\"T_0502a_row127_col1\" class=\"data row127 col1\" >Population Stability Index</td>\n", - " <td id=\"T_0502a_row127_col2\" class=\"data row127 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's predictions across...</td>\n", - " <td id=\"T_0502a_row127_col3\" class=\"data row127 col3\" >True</td>\n", - " <td id=\"T_0502a_row127_col4\" class=\"data row127 col4\" >True</td>\n", - " <td id=\"T_0502a_row127_col5\" class=\"data row127 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row127_col6\" class=\"data row127 col6\" >{'num_bins': {'type': 'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", - " <td id=\"T_0502a_row127_col7\" class=\"data row127 col7\" >['sklearn', 
'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row127_col8\" class=\"data row127 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row128_col0\" class=\"data row128 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_0502a_row128_col1\" class=\"data row128 col1\" >Precision Recall Curve</td>\n", - " <td id=\"T_0502a_row128_col2\" class=\"data row128 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", - " <td id=\"T_0502a_row128_col3\" class=\"data row128 col3\" >True</td>\n", - " <td id=\"T_0502a_row128_col4\" class=\"data row128 col4\" >False</td>\n", - " <td id=\"T_0502a_row128_col5\" class=\"data row128 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row128_col6\" class=\"data row128 col6\" >{}</td>\n", - " <td id=\"T_0502a_row128_col7\" class=\"data row128 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row128_col8\" class=\"data row128 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row129_col0\" class=\"data row129 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_0502a_row129_col1\" class=\"data row129 col1\" >ROC Curve</td>\n", - " <td id=\"T_0502a_row129_col2\" class=\"data row129 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", - " <td id=\"T_0502a_row129_col3\" class=\"data row129 col3\" >True</td>\n", - " <td id=\"T_0502a_row129_col4\" class=\"data row129 col4\" >False</td>\n", - " <td id=\"T_0502a_row129_col5\" class=\"data row129 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row129_col6\" class=\"data row129 col6\" >{}</td>\n", - " <td id=\"T_0502a_row129_col7\" 
class=\"data row129 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row129_col8\" class=\"data row129 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row130_col0\" class=\"data row130 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", - " <td id=\"T_0502a_row130_col1\" class=\"data row130 col1\" >Regression Errors</td>\n", - " <td id=\"T_0502a_row130_col2\" class=\"data row130 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", - " <td id=\"T_0502a_row130_col3\" class=\"data row130 col3\" >False</td>\n", - " <td id=\"T_0502a_row130_col4\" class=\"data row130 col4\" >True</td>\n", - " <td id=\"T_0502a_row130_col5\" class=\"data row130 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row130_col6\" class=\"data row130 col6\" >{}</td>\n", - " <td id=\"T_0502a_row130_col7\" class=\"data row130 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row130_col8\" class=\"data row130 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row131_col0\" class=\"data row131 col0\" >validmind.model_validation.sklearn.RegressionErrorsComparison</td>\n", - " <td id=\"T_0502a_row131_col1\" class=\"data row131 col1\" >Regression Errors Comparison</td>\n", - " <td id=\"T_0502a_row131_col2\" class=\"data row131 col2\" >Assesses multiple regression error metrics to compare model performance across different datasets, emphasizing...</td>\n", - " <td id=\"T_0502a_row131_col3\" class=\"data row131 col3\" >False</td>\n", - " <td id=\"T_0502a_row131_col4\" class=\"data row131 col4\" >True</td>\n", - " <td id=\"T_0502a_row131_col5\" class=\"data row131 col5\" >['datasets', 'models']</td>\n", - " <td id=\"T_0502a_row131_col6\" class=\"data row131 col6\" >{}</td>\n", - " <td 
id=\"T_0502a_row131_col7\" class=\"data row131 col7\" >['model_performance', 'sklearn']</td>\n", - " <td id=\"T_0502a_row131_col8\" class=\"data row131 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row132_col0\" class=\"data row132 col0\" >validmind.model_validation.sklearn.RegressionPerformance</td>\n", - " <td id=\"T_0502a_row132_col1\" class=\"data row132 col1\" >Regression Performance</td>\n", - " <td id=\"T_0502a_row132_col2\" class=\"data row132 col2\" >Evaluates the performance of a regression model using five different metrics: MAE, MSE, RMSE, MAPE, and MBD....</td>\n", - " <td id=\"T_0502a_row132_col3\" class=\"data row132 col3\" >False</td>\n", - " <td id=\"T_0502a_row132_col4\" class=\"data row132 col4\" >True</td>\n", - " <td id=\"T_0502a_row132_col5\" class=\"data row132 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row132_col6\" class=\"data row132 col6\" >{}</td>\n", - " <td id=\"T_0502a_row132_col7\" class=\"data row132 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row132_col8\" class=\"data row132 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row133_col0\" class=\"data row133 col0\" >validmind.model_validation.sklearn.RegressionR2Square</td>\n", - " <td id=\"T_0502a_row133_col1\" class=\"data row133 col1\" >Regression R2 Square</td>\n", - " <td id=\"T_0502a_row133_col2\" class=\"data row133 col2\" >Assesses the overall goodness-of-fit of a regression model by evaluating R-squared (R2) and Adjusted R-squared (Adj...</td>\n", - " <td id=\"T_0502a_row133_col3\" class=\"data row133 col3\" >False</td>\n", - " <td id=\"T_0502a_row133_col4\" class=\"data row133 col4\" >True</td>\n", - " <td id=\"T_0502a_row133_col5\" class=\"data row133 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row133_col6\" class=\"data row133 col6\" >{}</td>\n", - " <td id=\"T_0502a_row133_col7\" class=\"data row133 col7\" >['sklearn', 
'model_performance']</td>\n", - " <td id=\"T_0502a_row133_col8\" class=\"data row133 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row134_col0\" class=\"data row134 col0\" >validmind.model_validation.sklearn.RegressionR2SquareComparison</td>\n", - " <td id=\"T_0502a_row134_col1\" class=\"data row134 col1\" >Regression R2 Square Comparison</td>\n", - " <td id=\"T_0502a_row134_col2\" class=\"data row134 col2\" >Compares R-Squared and Adjusted R-Squared values for different regression models across multiple datasets to assess...</td>\n", - " <td id=\"T_0502a_row134_col3\" class=\"data row134 col3\" >False</td>\n", - " <td id=\"T_0502a_row134_col4\" class=\"data row134 col4\" >True</td>\n", - " <td id=\"T_0502a_row134_col5\" class=\"data row134 col5\" >['datasets', 'models']</td>\n", - " <td id=\"T_0502a_row134_col6\" class=\"data row134 col6\" >{}</td>\n", - " <td id=\"T_0502a_row134_col7\" class=\"data row134 col7\" >['model_performance', 'sklearn']</td>\n", - " <td id=\"T_0502a_row134_col8\" class=\"data row134 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row135_col0\" class=\"data row135 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " <td id=\"T_0502a_row135_col1\" class=\"data row135 col1\" >Robustness Diagnosis</td>\n", - " <td id=\"T_0502a_row135_col2\" class=\"data row135 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", - " <td id=\"T_0502a_row135_col3\" class=\"data row135 col3\" >True</td>\n", - " <td id=\"T_0502a_row135_col4\" class=\"data row135 col4\" >True</td>\n", - " <td id=\"T_0502a_row135_col5\" class=\"data row135 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row135_col6\" class=\"data row135 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 
'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_0502a_row135_col7\" class=\"data row135 col7\" >['sklearn', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_0502a_row135_col8\" class=\"data row135 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row136_col0\" class=\"data row136 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", - " <td id=\"T_0502a_row136_col1\" class=\"data row136 col1\" >SHAP Global Importance</td>\n", - " <td id=\"T_0502a_row136_col2\" class=\"data row136 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", - " <td id=\"T_0502a_row136_col3\" class=\"data row136 col3\" >False</td>\n", - " <td id=\"T_0502a_row136_col4\" class=\"data row136 col4\" >True</td>\n", - " <td id=\"T_0502a_row136_col5\" class=\"data row136 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row136_col6\" class=\"data row136 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row136_col7\" class=\"data row136 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_0502a_row136_col8\" class=\"data row136 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row137_col0\" class=\"data row137 col0\" >validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", - " <td id=\"T_0502a_row137_col1\" class=\"data row137 col1\" >Score Probability Alignment</td>\n", - " <td id=\"T_0502a_row137_col2\" class=\"data row137 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", - " <td id=\"T_0502a_row137_col3\" class=\"data 
row137 col3\" >True</td>\n", - " <td id=\"T_0502a_row137_col4\" class=\"data row137 col4\" >True</td>\n", - " <td id=\"T_0502a_row137_col5\" class=\"data row137 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row137_col6\" class=\"data row137 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_0502a_row137_col7\" class=\"data row137 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", - " <td id=\"T_0502a_row137_col8\" class=\"data row137 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row138_col0\" class=\"data row138 col0\" >validmind.model_validation.sklearn.SilhouettePlot</td>\n", - " <td id=\"T_0502a_row138_col1\" class=\"data row138 col1\" >Silhouette Plot</td>\n", - " <td id=\"T_0502a_row138_col2\" class=\"data row138 col2\" >Calculates and visualizes Silhouette Score, assessing the degree of data point suitability to its cluster in ML...</td>\n", - " <td id=\"T_0502a_row138_col3\" class=\"data row138 col3\" >True</td>\n", - " <td id=\"T_0502a_row138_col4\" class=\"data row138 col4\" >True</td>\n", - " <td id=\"T_0502a_row138_col5\" class=\"data row138 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row138_col6\" class=\"data row138 col6\" >{}</td>\n", - " <td id=\"T_0502a_row138_col7\" class=\"data row138 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row138_col8\" class=\"data row138 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row139_col0\" class=\"data row139 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_0502a_row139_col1\" class=\"data row139 col1\" >Training Test Degradation</td>\n", - " <td id=\"T_0502a_row139_col2\" class=\"data row139 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", - " <td id=\"T_0502a_row139_col3\" 
class=\"data row139 col3\" >False</td>\n", - " <td id=\"T_0502a_row139_col4\" class=\"data row139 col4\" >True</td>\n", - " <td id=\"T_0502a_row139_col5\" class=\"data row139 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row139_col6\" class=\"data row139 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_0502a_row139_col7\" class=\"data row139 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row139_col8\" class=\"data row139 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row140_col0\" class=\"data row140 col0\" >validmind.model_validation.sklearn.VMeasure</td>\n", - " <td id=\"T_0502a_row140_col1\" class=\"data row140 col1\" >V Measure</td>\n", - " <td id=\"T_0502a_row140_col2\" class=\"data row140 col2\" >Evaluates homogeneity and completeness of a clustering model using the V Measure Score....</td>\n", - " <td id=\"T_0502a_row140_col3\" class=\"data row140 col3\" >False</td>\n", - " <td id=\"T_0502a_row140_col4\" class=\"data row140 col4\" >True</td>\n", - " <td id=\"T_0502a_row140_col5\" class=\"data row140 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row140_col6\" class=\"data row140 col6\" >{}</td>\n", - " <td id=\"T_0502a_row140_col7\" class=\"data row140 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row140_col8\" class=\"data row140 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row141_col0\" class=\"data row141 col0\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", - " <td id=\"T_0502a_row141_col1\" class=\"data row141 col1\" >Weakspots Diagnosis</td>\n", - " <td id=\"T_0502a_row141_col2\" class=\"data row141 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", - " <td id=\"T_0502a_row141_col3\" 
class=\"data row141 col3\" >True</td>\n", - " <td id=\"T_0502a_row141_col4\" class=\"data row141 col4\" >True</td>\n", - " <td id=\"T_0502a_row141_col5\" class=\"data row141 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row141_col6\" class=\"data row141 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row141_col7\" class=\"data row141 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_0502a_row141_col8\" class=\"data row141 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row142_col0\" class=\"data row142 col0\" >validmind.model_validation.statsmodels.AutoARIMA</td>\n", - " <td id=\"T_0502a_row142_col1\" class=\"data row142 col1\" >Auto ARIMA</td>\n", - " <td id=\"T_0502a_row142_col2\" class=\"data row142 col2\" >Evaluates ARIMA models for time-series forecasting, ranking them using Bayesian and Akaike Information Criteria....</td>\n", - " <td id=\"T_0502a_row142_col3\" class=\"data row142 col3\" >False</td>\n", - " <td id=\"T_0502a_row142_col4\" class=\"data row142 col4\" >True</td>\n", - " <td id=\"T_0502a_row142_col5\" class=\"data row142 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row142_col6\" class=\"data row142 col6\" >{}</td>\n", - " <td id=\"T_0502a_row142_col7\" class=\"data row142 col7\" >['time_series_data', 'forecasting', 'model_selection', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row142_col8\" class=\"data row142 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row143_col0\" class=\"data row143 col0\" >validmind.model_validation.statsmodels.CumulativePredictionProbabilities</td>\n", - " <td id=\"T_0502a_row143_col1\" class=\"data row143 col1\" >Cumulative Prediction Probabilities</td>\n", - " <td 
id=\"T_0502a_row143_col2\" class=\"data row143 col2\" >Visualizes cumulative probabilities of positive and negative classes for both training and testing in classification models....</td>\n", - " <td id=\"T_0502a_row143_col3\" class=\"data row143 col3\" >True</td>\n", - " <td id=\"T_0502a_row143_col4\" class=\"data row143 col4\" >False</td>\n", - " <td id=\"T_0502a_row143_col5\" class=\"data row143 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row143_col6\" class=\"data row143 col6\" >{'title': {'type': 'str', 'default': 'Cumulative Probabilities'}}</td>\n", - " <td id=\"T_0502a_row143_col7\" class=\"data row143 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_0502a_row143_col8\" class=\"data row143 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row144_col0\" class=\"data row144 col0\" >validmind.model_validation.statsmodels.DurbinWatsonTest</td>\n", - " <td id=\"T_0502a_row144_col1\" class=\"data row144 col1\" >Durbin Watson Test</td>\n", - " <td id=\"T_0502a_row144_col2\" class=\"data row144 col2\" >Assesses autocorrelation in time series data features using the Durbin-Watson statistic....</td>\n", - " <td id=\"T_0502a_row144_col3\" class=\"data row144 col3\" >False</td>\n", - " <td id=\"T_0502a_row144_col4\" class=\"data row144 col4\" >True</td>\n", - " <td id=\"T_0502a_row144_col5\" class=\"data row144 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row144_col6\" class=\"data row144 col6\" >{'threshold': {'type': None, 'default': [1.5, 2.5]}}</td>\n", - " <td id=\"T_0502a_row144_col7\" class=\"data row144 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row144_col8\" class=\"data row144 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row145_col0\" class=\"data row145 col0\" >validmind.model_validation.statsmodels.GINITable</td>\n", - " <td id=\"T_0502a_row145_col1\" class=\"data row145 col1\" 
>GINI Table</td>\n", - " <td id=\"T_0502a_row145_col2\" class=\"data row145 col2\" >Evaluates classification model performance using AUC, GINI, and KS metrics for training and test datasets....</td>\n", - " <td id=\"T_0502a_row145_col3\" class=\"data row145 col3\" >False</td>\n", - " <td id=\"T_0502a_row145_col4\" class=\"data row145 col4\" >True</td>\n", - " <td id=\"T_0502a_row145_col5\" class=\"data row145 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row145_col6\" class=\"data row145 col6\" >{}</td>\n", - " <td id=\"T_0502a_row145_col7\" class=\"data row145 col7\" >['model_performance']</td>\n", - " <td id=\"T_0502a_row145_col8\" class=\"data row145 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row146_col0\" class=\"data row146 col0\" >validmind.model_validation.statsmodels.KolmogorovSmirnov</td>\n", - " <td id=\"T_0502a_row146_col1\" class=\"data row146 col1\" >Kolmogorov Smirnov</td>\n", - " <td id=\"T_0502a_row146_col2\" class=\"data row146 col2\" >Assesses whether each feature in the dataset aligns with a normal distribution using the Kolmogorov-Smirnov test....</td>\n", - " <td id=\"T_0502a_row146_col3\" class=\"data row146 col3\" >False</td>\n", - " <td id=\"T_0502a_row146_col4\" class=\"data row146 col4\" >True</td>\n", - " <td id=\"T_0502a_row146_col5\" class=\"data row146 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row146_col6\" class=\"data row146 col6\" >{'dist': {'type': 'str', 'default': 'norm'}}</td>\n", - " <td id=\"T_0502a_row146_col7\" class=\"data row146 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row146_col8\" class=\"data row146 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row147_col0\" class=\"data row147 col0\" >validmind.model_validation.statsmodels.Lilliefors</td>\n", - " <td id=\"T_0502a_row147_col1\" class=\"data row147 col1\" >Lilliefors</td>\n", - " 
<td id=\"T_0502a_row147_col2\" class=\"data row147 col2\" >Assesses the normality of feature distributions in an ML model's training dataset using the Lilliefors test....</td>\n", - " <td id=\"T_0502a_row147_col3\" class=\"data row147 col3\" >False</td>\n", - " <td id=\"T_0502a_row147_col4\" class=\"data row147 col4\" >True</td>\n", - " <td id=\"T_0502a_row147_col5\" class=\"data row147 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row147_col6\" class=\"data row147 col6\" >{}</td>\n", - " <td id=\"T_0502a_row147_col7\" class=\"data row147 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row147_col8\" class=\"data row147 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row148_col0\" class=\"data row148 col0\" >validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram</td>\n", - " <td id=\"T_0502a_row148_col1\" class=\"data row148 col1\" >Prediction Probabilities Histogram</td>\n", - " <td id=\"T_0502a_row148_col2\" class=\"data row148 col2\" >Assesses the predictive probability distribution for binary classification to evaluate model performance and...</td>\n", - " <td id=\"T_0502a_row148_col3\" class=\"data row148 col3\" >True</td>\n", - " <td id=\"T_0502a_row148_col4\" class=\"data row148 col4\" >False</td>\n", - " <td id=\"T_0502a_row148_col5\" class=\"data row148 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row148_col6\" class=\"data row148 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Predictive Probabilities'}}</td>\n", - " <td id=\"T_0502a_row148_col7\" class=\"data row148 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_0502a_row148_col8\" class=\"data row148 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row149_col0\" class=\"data row149 col0\" >validmind.model_validation.statsmodels.RegressionCoeffs</td>\n", - " <td id=\"T_0502a_row149_col1\" 
class=\"data row149 col1\" >Regression Coeffs</td>\n", - " <td id=\"T_0502a_row149_col2\" class=\"data row149 col2\" >Assesses the significance and uncertainty of predictor variables in a regression model through visualization of...</td>\n", - " <td id=\"T_0502a_row149_col3\" class=\"data row149 col3\" >True</td>\n", - " <td id=\"T_0502a_row149_col4\" class=\"data row149 col4\" >True</td>\n", - " <td id=\"T_0502a_row149_col5\" class=\"data row149 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row149_col6\" class=\"data row149 col6\" >{}</td>\n", - " <td id=\"T_0502a_row149_col7\" class=\"data row149 col7\" >['tabular_data', 'visualization', 'model_training']</td>\n", - " <td id=\"T_0502a_row149_col8\" class=\"data row149 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row150_col0\" class=\"data row150 col0\" >validmind.model_validation.statsmodels.RegressionFeatureSignificance</td>\n", - " <td id=\"T_0502a_row150_col1\" class=\"data row150 col1\" >Regression Feature Significance</td>\n", - " <td id=\"T_0502a_row150_col2\" class=\"data row150 col2\" >Assesses and visualizes the statistical significance of features in a regression model....</td>\n", - " <td id=\"T_0502a_row150_col3\" class=\"data row150 col3\" >True</td>\n", - " <td id=\"T_0502a_row150_col4\" class=\"data row150 col4\" >False</td>\n", - " <td id=\"T_0502a_row150_col5\" class=\"data row150 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row150_col6\" class=\"data row150 col6\" >{'fontsize': {'type': 'int', 'default': 10}, 'p_threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_0502a_row150_col7\" class=\"data row150 col7\" >['statistical_test', 'model_interpretation', 'visualization', 'feature_importance']</td>\n", - " <td id=\"T_0502a_row150_col8\" class=\"data row150 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row151_col0\" class=\"data row151 col0\" 
>validmind.model_validation.statsmodels.RegressionModelForecastPlot</td>\n", - " <td id=\"T_0502a_row151_col1\" class=\"data row151 col1\" >Regression Model Forecast Plot</td>\n", - " <td id=\"T_0502a_row151_col2\" class=\"data row151 col2\" >Generates plots to visually compare the forecasted outcomes of a regression model against actual observed values over...</td>\n", - " <td id=\"T_0502a_row151_col3\" class=\"data row151 col3\" >True</td>\n", - " <td id=\"T_0502a_row151_col4\" class=\"data row151 col4\" >False</td>\n", - " <td id=\"T_0502a_row151_col5\" class=\"data row151 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row151_col6\" class=\"data row151 col6\" >{'start_date': {'type': None, 'default': None}, 'end_date': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row151_col7\" class=\"data row151 col7\" >['time_series_data', 'forecasting', 'visualization']</td>\n", - " <td id=\"T_0502a_row151_col8\" class=\"data row151 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row152_col0\" class=\"data row152 col0\" >validmind.model_validation.statsmodels.RegressionModelForecastPlotLevels</td>\n", - " <td id=\"T_0502a_row152_col1\" class=\"data row152 col1\" >Regression Model Forecast Plot Levels</td>\n", - " <td id=\"T_0502a_row152_col2\" class=\"data row152 col2\" >Assesses the alignment between forecasted and observed values in regression models through visual plots...</td>\n", - " <td id=\"T_0502a_row152_col3\" class=\"data row152 col3\" >True</td>\n", - " <td id=\"T_0502a_row152_col4\" class=\"data row152 col4\" >False</td>\n", - " <td id=\"T_0502a_row152_col5\" class=\"data row152 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row152_col6\" class=\"data row152 col6\" >{}</td>\n", - " <td id=\"T_0502a_row152_col7\" class=\"data row152 col7\" >['time_series_data', 'forecasting', 'visualization']</td>\n", - " <td id=\"T_0502a_row152_col8\" class=\"data row152 col8\" >['regression']</td>\n", - " 
</tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row153_col0\" class=\"data row153 col0\" >validmind.model_validation.statsmodels.RegressionModelSensitivityPlot</td>\n", - " <td id=\"T_0502a_row153_col1\" class=\"data row153 col1\" >Regression Model Sensitivity Plot</td>\n", - " <td id=\"T_0502a_row153_col2\" class=\"data row153 col2\" >Assesses the sensitivity of a regression model to changes in independent variables by applying shocks and...</td>\n", - " <td id=\"T_0502a_row153_col3\" class=\"data row153 col3\" >True</td>\n", - " <td id=\"T_0502a_row153_col4\" class=\"data row153 col4\" >False</td>\n", - " <td id=\"T_0502a_row153_col5\" class=\"data row153 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row153_col6\" class=\"data row153 col6\" >{'shocks': {'type': None, 'default': [0.1]}, 'transformation': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row153_col7\" class=\"data row153 col7\" >['senstivity_analysis', 'visualization']</td>\n", - " <td id=\"T_0502a_row153_col8\" class=\"data row153 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row154_col0\" class=\"data row154 col0\" >validmind.model_validation.statsmodels.RegressionModelSummary</td>\n", - " <td id=\"T_0502a_row154_col1\" class=\"data row154 col1\" >Regression Model Summary</td>\n", - " <td id=\"T_0502a_row154_col2\" class=\"data row154 col2\" >Evaluates regression model performance using metrics including R-Squared, Adjusted R-Squared, MSE, and RMSE....</td>\n", - " <td id=\"T_0502a_row154_col3\" class=\"data row154 col3\" >False</td>\n", - " <td id=\"T_0502a_row154_col4\" class=\"data row154 col4\" >True</td>\n", - " <td id=\"T_0502a_row154_col5\" class=\"data row154 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row154_col6\" class=\"data row154 col6\" >{}</td>\n", - " <td id=\"T_0502a_row154_col7\" class=\"data row154 col7\" >['model_performance', 'regression']</td>\n", - " <td id=\"T_0502a_row154_col8\" class=\"data row154 
col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row155_col0\" class=\"data row155 col0\" >validmind.model_validation.statsmodels.RegressionPermutationFeatureImportance</td>\n", - " <td id=\"T_0502a_row155_col1\" class=\"data row155 col1\" >Regression Permutation Feature Importance</td>\n", - " <td id=\"T_0502a_row155_col2\" class=\"data row155 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", - " <td id=\"T_0502a_row155_col3\" class=\"data row155 col3\" >True</td>\n", - " <td id=\"T_0502a_row155_col4\" class=\"data row155 col4\" >False</td>\n", - " <td id=\"T_0502a_row155_col5\" class=\"data row155 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row155_col6\" class=\"data row155 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", - " <td id=\"T_0502a_row155_col7\" class=\"data row155 col7\" >['statsmodels', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_0502a_row155_col8\" class=\"data row155 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row156_col0\" class=\"data row156 col0\" >validmind.model_validation.statsmodels.ScorecardHistogram</td>\n", - " <td id=\"T_0502a_row156_col1\" class=\"data row156 col1\" >Scorecard Histogram</td>\n", - " <td id=\"T_0502a_row156_col2\" class=\"data row156 col2\" >The Scorecard Histogram test evaluates the distribution of credit scores between default and non-default instances,...</td>\n", - " <td id=\"T_0502a_row156_col3\" class=\"data row156 col3\" >True</td>\n", - " <td id=\"T_0502a_row156_col4\" class=\"data row156 col4\" >False</td>\n", - " <td id=\"T_0502a_row156_col5\" class=\"data row156 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row156_col6\" class=\"data row156 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Scores'}, 'score_column': {'type': 'str', 'default': 
'score'}}</td>\n", - " <td id=\"T_0502a_row156_col7\" class=\"data row156 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", - " <td id=\"T_0502a_row156_col8\" class=\"data row156 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row157_col0\" class=\"data row157 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", - " <td id=\"T_0502a_row157_col1\" class=\"data row157 col1\" >Calibration Curve Drift</td>\n", - " <td id=\"T_0502a_row157_col2\" class=\"data row157 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row157_col3\" class=\"data row157 col3\" >True</td>\n", - " <td id=\"T_0502a_row157_col4\" class=\"data row157 col4\" >True</td>\n", - " <td id=\"T_0502a_row157_col5\" class=\"data row157 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row157_col6\" class=\"data row157 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_0502a_row157_col7\" class=\"data row157 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row157_col8\" class=\"data row157 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row158_col0\" class=\"data row158 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", - " <td id=\"T_0502a_row158_col1\" class=\"data row158 col1\" >Class Discrimination Drift</td>\n", - " <td id=\"T_0502a_row158_col2\" class=\"data row158 col2\" >Compares classification discrimination metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row158_col3\" class=\"data row158 col3\" >False</td>\n", - " <td id=\"T_0502a_row158_col4\" class=\"data row158 col4\" >True</td>\n", - " <td id=\"T_0502a_row158_col5\" class=\"data row158 col5\" >['datasets', 'model']</td>\n", - " 
<td id=\"T_0502a_row158_col6\" class=\"data row158 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_0502a_row158_col7\" class=\"data row158 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row158_col8\" class=\"data row158 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row159_col0\" class=\"data row159 col0\" >validmind.ongoing_monitoring.ClassImbalanceDrift</td>\n", - " <td id=\"T_0502a_row159_col1\" class=\"data row159 col1\" >Class Imbalance Drift</td>\n", - " <td id=\"T_0502a_row159_col2\" class=\"data row159 col2\" >Evaluates drift in class distribution between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row159_col3\" class=\"data row159 col3\" >True</td>\n", - " <td id=\"T_0502a_row159_col4\" class=\"data row159 col4\" >True</td>\n", - " <td id=\"T_0502a_row159_col5\" class=\"data row159 col5\" >['datasets']</td>\n", - " <td id=\"T_0502a_row159_col6\" class=\"data row159 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 5.0}, 'title': {'type': 'str', 'default': 'Class Distribution Drift'}}</td>\n", - " <td id=\"T_0502a_row159_col7\" class=\"data row159 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification']</td>\n", - " <td id=\"T_0502a_row159_col8\" class=\"data row159 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row160_col0\" class=\"data row160 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", - " <td id=\"T_0502a_row160_col1\" class=\"data row160 col1\" >Classification Accuracy Drift</td>\n", - " <td id=\"T_0502a_row160_col2\" class=\"data row160 col2\" >Compares classification accuracy metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row160_col3\" class=\"data row160 col3\" >False</td>\n", - " <td id=\"T_0502a_row160_col4\" 
class=\"data row160 col4\" >True</td>\n", - " <td id=\"T_0502a_row160_col5\" class=\"data row160 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row160_col6\" class=\"data row160 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_0502a_row160_col7\" class=\"data row160 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row160_col8\" class=\"data row160 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row161_col0\" class=\"data row161 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", - " <td id=\"T_0502a_row161_col1\" class=\"data row161 col1\" >Confusion Matrix Drift</td>\n", - " <td id=\"T_0502a_row161_col2\" class=\"data row161 col2\" >Compares confusion matrix metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row161_col3\" class=\"data row161 col3\" >False</td>\n", - " <td id=\"T_0502a_row161_col4\" class=\"data row161 col4\" >True</td>\n", - " <td id=\"T_0502a_row161_col5\" class=\"data row161 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row161_col6\" class=\"data row161 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_0502a_row161_col7\" class=\"data row161 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row161_col8\" class=\"data row161 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row162_col0\" class=\"data row162 col0\" >validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift</td>\n", - " <td id=\"T_0502a_row162_col1\" class=\"data row162 col1\" >Cumulative Prediction Probabilities Drift</td>\n", - " <td id=\"T_0502a_row162_col2\" class=\"data row162 col2\" >Compares cumulative prediction probability distributions between 
reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row162_col3\" class=\"data row162 col3\" >True</td>\n", - " <td id=\"T_0502a_row162_col4\" class=\"data row162 col4\" >False</td>\n", - " <td id=\"T_0502a_row162_col5\" class=\"data row162 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row162_col6\" class=\"data row162 col6\" >{}</td>\n", - " <td id=\"T_0502a_row162_col7\" class=\"data row162 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_0502a_row162_col8\" class=\"data row162 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row163_col0\" class=\"data row163 col0\" >validmind.ongoing_monitoring.FeatureDrift</td>\n", - " <td id=\"T_0502a_row163_col1\" class=\"data row163 col1\" >Feature Drift</td>\n", - " <td id=\"T_0502a_row163_col2\" class=\"data row163 col2\" >Evaluates changes in feature distribution over time to identify potential model drift....</td>\n", - " <td id=\"T_0502a_row163_col3\" class=\"data row163 col3\" >True</td>\n", - " <td id=\"T_0502a_row163_col4\" class=\"data row163 col4\" >True</td>\n", - " <td id=\"T_0502a_row163_col5\" class=\"data row163 col5\" >['datasets']</td>\n", - " <td id=\"T_0502a_row163_col6\" class=\"data row163 col6\" >{'bins': {'type': '_empty', 'default': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]}, 'feature_columns': {'type': '_empty', 'default': None}, 'psi_threshold': {'type': '_empty', 'default': 0.2}}</td>\n", - " <td id=\"T_0502a_row163_col7\" class=\"data row163 col7\" >['visualization']</td>\n", - " <td id=\"T_0502a_row163_col8\" class=\"data row163 col8\" >['monitoring']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row164_col0\" class=\"data row164 col0\" >validmind.ongoing_monitoring.PredictionAcrossEachFeature</td>\n", - " <td id=\"T_0502a_row164_col1\" class=\"data row164 col1\" >Prediction Across Each Feature</td>\n", - " <td id=\"T_0502a_row164_col2\" class=\"data row164 col2\" >Assesses differences in model 
predictions across individual features between reference and monitoring datasets...</td>\n", - " <td id=\"T_0502a_row164_col3\" class=\"data row164 col3\" >True</td>\n", - " <td id=\"T_0502a_row164_col4\" class=\"data row164 col4\" >False</td>\n", - " <td id=\"T_0502a_row164_col5\" class=\"data row164 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row164_col6\" class=\"data row164 col6\" >{}</td>\n", - " <td id=\"T_0502a_row164_col7\" class=\"data row164 col7\" >['visualization']</td>\n", - " <td id=\"T_0502a_row164_col8\" class=\"data row164 col8\" >['monitoring']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row165_col0\" class=\"data row165 col0\" >validmind.ongoing_monitoring.PredictionCorrelation</td>\n", - " <td id=\"T_0502a_row165_col1\" class=\"data row165 col1\" >Prediction Correlation</td>\n", - " <td id=\"T_0502a_row165_col2\" class=\"data row165 col2\" >Assesses correlation changes between model predictions from reference and monitoring datasets to detect potential...</td>\n", - " <td id=\"T_0502a_row165_col3\" class=\"data row165 col3\" >True</td>\n", - " <td id=\"T_0502a_row165_col4\" class=\"data row165 col4\" >True</td>\n", - " <td id=\"T_0502a_row165_col5\" class=\"data row165 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row165_col6\" class=\"data row165 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_0502a_row165_col7\" class=\"data row165 col7\" >['visualization']</td>\n", - " <td id=\"T_0502a_row165_col8\" class=\"data row165 col8\" >['monitoring']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row166_col0\" class=\"data row166 col0\" >validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift</td>\n", - " <td id=\"T_0502a_row166_col1\" class=\"data row166 col1\" >Prediction Probabilities Histogram Drift</td>\n", - " <td id=\"T_0502a_row166_col2\" class=\"data row166 col2\" >Compares prediction probability distributions between reference and 
monitoring datasets....</td>\n", - " <td id=\"T_0502a_row166_col3\" class=\"data row166 col3\" >True</td>\n", - " <td id=\"T_0502a_row166_col4\" class=\"data row166 col4\" >True</td>\n", - " <td id=\"T_0502a_row166_col5\" class=\"data row166 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row166_col6\" class=\"data row166 col6\" >{'title': {'type': '_empty', 'default': 'Prediction Probabilities Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_0502a_row166_col7\" class=\"data row166 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_0502a_row166_col8\" class=\"data row166 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row167_col0\" class=\"data row167 col0\" >validmind.ongoing_monitoring.PredictionQuantilesAcrossFeatures</td>\n", - " <td id=\"T_0502a_row167_col1\" class=\"data row167 col1\" >Prediction Quantiles Across Features</td>\n", - " <td id=\"T_0502a_row167_col2\" class=\"data row167 col2\" >Assesses differences in model prediction distributions across individual features between reference...</td>\n", - " <td id=\"T_0502a_row167_col3\" class=\"data row167 col3\" >True</td>\n", - " <td id=\"T_0502a_row167_col4\" class=\"data row167 col4\" >False</td>\n", - " <td id=\"T_0502a_row167_col5\" class=\"data row167 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row167_col6\" class=\"data row167 col6\" >{}</td>\n", - " <td id=\"T_0502a_row167_col7\" class=\"data row167 col7\" >['visualization']</td>\n", - " <td id=\"T_0502a_row167_col8\" class=\"data row167 col8\" >['monitoring']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row168_col0\" class=\"data row168 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", - " <td id=\"T_0502a_row168_col1\" class=\"data row168 col1\" >ROC Curve Drift</td>\n", - " <td id=\"T_0502a_row168_col2\" class=\"data row168 col2\" >Compares ROC curves between reference and monitoring 
datasets....</td>\n", - " <td id=\"T_0502a_row168_col3\" class=\"data row168 col3\" >True</td>\n", - " <td id=\"T_0502a_row168_col4\" class=\"data row168 col4\" >False</td>\n", - " <td id=\"T_0502a_row168_col5\" class=\"data row168 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row168_col6\" class=\"data row168 col6\" >{}</td>\n", - " <td id=\"T_0502a_row168_col7\" class=\"data row168 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row168_col8\" class=\"data row168 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row169_col0\" class=\"data row169 col0\" >validmind.ongoing_monitoring.ScoreBandsDrift</td>\n", - " <td id=\"T_0502a_row169_col1\" class=\"data row169 col1\" >Score Bands Drift</td>\n", - " <td id=\"T_0502a_row169_col2\" class=\"data row169 col2\" >Analyzes drift in population distribution and default rates across score bands....</td>\n", - " <td id=\"T_0502a_row169_col3\" class=\"data row169 col3\" >False</td>\n", - " <td id=\"T_0502a_row169_col4\" class=\"data row169 col4\" >True</td>\n", - " <td id=\"T_0502a_row169_col5\" class=\"data row169 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row169_col6\" class=\"data row169 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}, 'drift_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_0502a_row169_col7\" class=\"data row169 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", - " <td id=\"T_0502a_row169_col8\" class=\"data row169 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row170_col0\" class=\"data row170 col0\" >validmind.ongoing_monitoring.ScorecardHistogramDrift</td>\n", - " <td id=\"T_0502a_row170_col1\" class=\"data row170 col1\" >Scorecard Histogram Drift</td>\n", - " <td id=\"T_0502a_row170_col2\" class=\"data row170 col2\" 
>Compares score distributions between reference and monitoring datasets for each class....</td>\n", - " <td id=\"T_0502a_row170_col3\" class=\"data row170 col3\" >True</td>\n", - " <td id=\"T_0502a_row170_col4\" class=\"data row170 col4\" >True</td>\n", - " <td id=\"T_0502a_row170_col5\" class=\"data row170 col5\" >['datasets']</td>\n", - " <td id=\"T_0502a_row170_col6\" class=\"data row170 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'title': {'type': 'str', 'default': 'Scorecard Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_0502a_row170_col7\" class=\"data row170 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", - " <td id=\"T_0502a_row170_col8\" class=\"data row170 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row171_col0\" class=\"data row171 col0\" >validmind.ongoing_monitoring.TargetPredictionDistributionPlot</td>\n", - " <td id=\"T_0502a_row171_col1\" class=\"data row171 col1\" >Target Prediction Distribution Plot</td>\n", - " <td id=\"T_0502a_row171_col2\" class=\"data row171 col2\" >Assesses differences in prediction distributions between a reference dataset and a monitoring dataset to identify...</td>\n", - " <td id=\"T_0502a_row171_col3\" class=\"data row171 col3\" >True</td>\n", - " <td id=\"T_0502a_row171_col4\" class=\"data row171 col4\" >True</td>\n", - " <td id=\"T_0502a_row171_col5\" class=\"data row171 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row171_col6\" class=\"data row171 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_0502a_row171_col7\" class=\"data row171 col7\" >['visualization']</td>\n", - " <td id=\"T_0502a_row171_col8\" class=\"data row171 col8\" >['monitoring']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row172_col0\" class=\"data row172 col0\" >validmind.prompt_validation.Bias</td>\n", - " <td id=\"T_0502a_row172_col1\" 
class=\"data row172 col1\" >Bias</td>\n", - " <td id=\"T_0502a_row172_col2\" class=\"data row172 col2\" >Assesses potential bias in a Large Language Model by analyzing the distribution and order of exemplars in the...</td>\n", - " <td id=\"T_0502a_row172_col3\" class=\"data row172 col3\" >False</td>\n", - " <td id=\"T_0502a_row172_col4\" class=\"data row172 col4\" >True</td>\n", - " <td id=\"T_0502a_row172_col5\" class=\"data row172 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row172_col6\" class=\"data row172 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row172_col7\" class=\"data row172 col7\" >['llm', 'few_shot']</td>\n", - " <td id=\"T_0502a_row172_col8\" class=\"data row172 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row173_col0\" class=\"data row173 col0\" >validmind.prompt_validation.Clarity</td>\n", - " <td id=\"T_0502a_row173_col1\" class=\"data row173 col1\" >Clarity</td>\n", - " <td id=\"T_0502a_row173_col2\" class=\"data row173 col2\" >Evaluates and scores the clarity of prompts in a Large Language Model based on specified guidelines....</td>\n", - " <td id=\"T_0502a_row173_col3\" class=\"data row173 col3\" >False</td>\n", - " <td id=\"T_0502a_row173_col4\" class=\"data row173 col4\" >True</td>\n", - " <td id=\"T_0502a_row173_col5\" class=\"data row173 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row173_col6\" class=\"data row173 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row173_col7\" class=\"data row173 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row173_col8\" class=\"data row173 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row174_col0\" class=\"data row174 col0\" 
>validmind.prompt_validation.Conciseness</td>\n", - " <td id=\"T_0502a_row174_col1\" class=\"data row174 col1\" >Conciseness</td>\n", - " <td id=\"T_0502a_row174_col2\" class=\"data row174 col2\" >Analyzes and grades the conciseness of prompts provided to a Large Language Model....</td>\n", - " <td id=\"T_0502a_row174_col3\" class=\"data row174 col3\" >False</td>\n", - " <td id=\"T_0502a_row174_col4\" class=\"data row174 col4\" >True</td>\n", - " <td id=\"T_0502a_row174_col5\" class=\"data row174 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row174_col6\" class=\"data row174 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row174_col7\" class=\"data row174 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row174_col8\" class=\"data row174 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row175_col0\" class=\"data row175 col0\" >validmind.prompt_validation.Delimitation</td>\n", - " <td id=\"T_0502a_row175_col1\" class=\"data row175 col1\" >Delimitation</td>\n", - " <td id=\"T_0502a_row175_col2\" class=\"data row175 col2\" >Evaluates the proper use of delimiters in prompts provided to Large Language Models....</td>\n", - " <td id=\"T_0502a_row175_col3\" class=\"data row175 col3\" >False</td>\n", - " <td id=\"T_0502a_row175_col4\" class=\"data row175 col4\" >True</td>\n", - " <td id=\"T_0502a_row175_col5\" class=\"data row175 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row175_col6\" class=\"data row175 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row175_col7\" class=\"data row175 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row175_col8\" class=\"data row175 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td 
id=\"T_0502a_row176_col0\" class=\"data row176 col0\" >validmind.prompt_validation.NegativeInstruction</td>\n", - " <td id=\"T_0502a_row176_col1\" class=\"data row176 col1\" >Negative Instruction</td>\n", - " <td id=\"T_0502a_row176_col2\" class=\"data row176 col2\" >Evaluates and grades the use of affirmative, proactive language over negative instructions in LLM prompts....</td>\n", - " <td id=\"T_0502a_row176_col3\" class=\"data row176 col3\" >False</td>\n", - " <td id=\"T_0502a_row176_col4\" class=\"data row176 col4\" >True</td>\n", - " <td id=\"T_0502a_row176_col5\" class=\"data row176 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row176_col6\" class=\"data row176 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row176_col7\" class=\"data row176 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row176_col8\" class=\"data row176 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row177_col0\" class=\"data row177 col0\" >validmind.prompt_validation.Robustness</td>\n", - " <td id=\"T_0502a_row177_col1\" class=\"data row177 col1\" >Robustness</td>\n", - " <td id=\"T_0502a_row177_col2\" class=\"data row177 col2\" >Assesses the robustness of prompts provided to a Large Language Model under varying conditions and contexts. 
This test...</td>\n", - " <td id=\"T_0502a_row177_col3\" class=\"data row177 col3\" >False</td>\n", - " <td id=\"T_0502a_row177_col4\" class=\"data row177 col4\" >True</td>\n", - " <td id=\"T_0502a_row177_col5\" class=\"data row177 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row177_col6\" class=\"data row177 col6\" >{'num_tests': {'type': '_empty', 'default': 10}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row177_col7\" class=\"data row177 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row177_col8\" class=\"data row177 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row178_col0\" class=\"data row178 col0\" >validmind.prompt_validation.Specificity</td>\n", - " <td id=\"T_0502a_row178_col1\" class=\"data row178 col1\" >Specificity</td>\n", - " <td id=\"T_0502a_row178_col2\" class=\"data row178 col2\" >Evaluates and scores the specificity of prompts provided to a Large Language Model (LLM), based on clarity, detail,...</td>\n", - " <td id=\"T_0502a_row178_col3\" class=\"data row178 col3\" >False</td>\n", - " <td id=\"T_0502a_row178_col4\" class=\"data row178 col4\" >True</td>\n", - " <td id=\"T_0502a_row178_col5\" class=\"data row178 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row178_col6\" class=\"data row178 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row178_col7\" class=\"data row178 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row178_col8\" class=\"data row178 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row179_col0\" class=\"data row179 col0\" >validmind.unit_metrics.classification.Accuracy</td>\n", - " <td id=\"T_0502a_row179_col1\" class=\"data row179 col1\" >Accuracy</td>\n", - " <td id=\"T_0502a_row179_col2\" class=\"data row179 
col2\" >Calculates the accuracy of a model</td>\n", - " <td id=\"T_0502a_row179_col3\" class=\"data row179 col3\" >False</td>\n", - " <td id=\"T_0502a_row179_col4\" class=\"data row179 col4\" >False</td>\n", - " <td id=\"T_0502a_row179_col5\" class=\"data row179 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row179_col6\" class=\"data row179 col6\" >{}</td>\n", - " <td id=\"T_0502a_row179_col7\" class=\"data row179 col7\" >['classification']</td>\n", - " <td id=\"T_0502a_row179_col8\" class=\"data row179 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row180_col0\" class=\"data row180 col0\" >validmind.unit_metrics.classification.F1</td>\n", - " <td id=\"T_0502a_row180_col1\" class=\"data row180 col1\" >F1</td>\n", - " <td id=\"T_0502a_row180_col2\" class=\"data row180 col2\" >Calculates the F1 score for a classification model.</td>\n", - " <td id=\"T_0502a_row180_col3\" class=\"data row180 col3\" >False</td>\n", - " <td id=\"T_0502a_row180_col4\" class=\"data row180 col4\" >False</td>\n", - " <td id=\"T_0502a_row180_col5\" class=\"data row180 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row180_col6\" class=\"data row180 col6\" >{}</td>\n", - " <td id=\"T_0502a_row180_col7\" class=\"data row180 col7\" >['classification']</td>\n", - " <td id=\"T_0502a_row180_col8\" class=\"data row180 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row181_col0\" class=\"data row181 col0\" >validmind.unit_metrics.classification.Precision</td>\n", - " <td id=\"T_0502a_row181_col1\" class=\"data row181 col1\" >Precision</td>\n", - " <td id=\"T_0502a_row181_col2\" class=\"data row181 col2\" >Calculates the precision for a classification model.</td>\n", - " <td id=\"T_0502a_row181_col3\" class=\"data row181 col3\" >False</td>\n", - " <td id=\"T_0502a_row181_col4\" class=\"data row181 col4\" >False</td>\n", - " <td id=\"T_0502a_row181_col5\" class=\"data row181 col5\" >['model', 
'dataset']</td>\n", - " <td id=\"T_0502a_row181_col6\" class=\"data row181 col6\" >{}</td>\n", - " <td id=\"T_0502a_row181_col7\" class=\"data row181 col7\" >['classification']</td>\n", - " <td id=\"T_0502a_row181_col8\" class=\"data row181 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row182_col0\" class=\"data row182 col0\" >validmind.unit_metrics.classification.ROC_AUC</td>\n", - " <td id=\"T_0502a_row182_col1\" class=\"data row182 col1\" >ROC AUC</td>\n", - " <td id=\"T_0502a_row182_col2\" class=\"data row182 col2\" >Calculates the ROC AUC for a classification model.</td>\n", - " <td id=\"T_0502a_row182_col3\" class=\"data row182 col3\" >False</td>\n", - " <td id=\"T_0502a_row182_col4\" class=\"data row182 col4\" >False</td>\n", - " <td id=\"T_0502a_row182_col5\" class=\"data row182 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row182_col6\" class=\"data row182 col6\" >{}</td>\n", - " <td id=\"T_0502a_row182_col7\" class=\"data row182 col7\" >['classification']</td>\n", - " <td id=\"T_0502a_row182_col8\" class=\"data row182 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row183_col0\" class=\"data row183 col0\" >validmind.unit_metrics.classification.Recall</td>\n", - " <td id=\"T_0502a_row183_col1\" class=\"data row183 col1\" >Recall</td>\n", - " <td id=\"T_0502a_row183_col2\" class=\"data row183 col2\" >Calculates the recall for a classification model.</td>\n", - " <td id=\"T_0502a_row183_col3\" class=\"data row183 col3\" >False</td>\n", - " <td id=\"T_0502a_row183_col4\" class=\"data row183 col4\" >False</td>\n", - " <td id=\"T_0502a_row183_col5\" class=\"data row183 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row183_col6\" class=\"data row183 col6\" >{}</td>\n", - " <td id=\"T_0502a_row183_col7\" class=\"data row183 col7\" >['classification']</td>\n", - " <td id=\"T_0502a_row183_col8\" class=\"data row183 col8\" >['classification']</td>\n", - " </tr>\n", - " 
<tr>\n", - " <td id=\"T_0502a_row184_col0\" class=\"data row184 col0\" >validmind.unit_metrics.regression.AdjustedRSquaredScore</td>\n", - " <td id=\"T_0502a_row184_col1\" class=\"data row184 col1\" >Adjusted R Squared Score</td>\n", - " <td id=\"T_0502a_row184_col2\" class=\"data row184 col2\" >Calculates the adjusted R-squared score for a regression model.</td>\n", - " <td id=\"T_0502a_row184_col3\" class=\"data row184 col3\" >False</td>\n", - " <td id=\"T_0502a_row184_col4\" class=\"data row184 col4\" >False</td>\n", - " <td id=\"T_0502a_row184_col5\" class=\"data row184 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row184_col6\" class=\"data row184 col6\" >{}</td>\n", - " <td id=\"T_0502a_row184_col7\" class=\"data row184 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row184_col8\" class=\"data row184 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row185_col0\" class=\"data row185 col0\" >validmind.unit_metrics.regression.GiniCoefficient</td>\n", - " <td id=\"T_0502a_row185_col1\" class=\"data row185 col1\" >Gini Coefficient</td>\n", - " <td id=\"T_0502a_row185_col2\" class=\"data row185 col2\" >Calculates the Gini coefficient for a regression model.</td>\n", - " <td id=\"T_0502a_row185_col3\" class=\"data row185 col3\" >False</td>\n", - " <td id=\"T_0502a_row185_col4\" class=\"data row185 col4\" >False</td>\n", - " <td id=\"T_0502a_row185_col5\" class=\"data row185 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row185_col6\" class=\"data row185 col6\" >{}</td>\n", - " <td id=\"T_0502a_row185_col7\" class=\"data row185 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row185_col8\" class=\"data row185 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row186_col0\" class=\"data row186 col0\" >validmind.unit_metrics.regression.HuberLoss</td>\n", - " <td id=\"T_0502a_row186_col1\" class=\"data row186 col1\" >Huber Loss</td>\n", - " <td id=\"T_0502a_row186_col2\" 
class=\"data row186 col2\" >Calculates the Huber loss for a regression model.</td>\n", - " <td id=\"T_0502a_row186_col3\" class=\"data row186 col3\" >False</td>\n", - " <td id=\"T_0502a_row186_col4\" class=\"data row186 col4\" >False</td>\n", - " <td id=\"T_0502a_row186_col5\" class=\"data row186 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row186_col6\" class=\"data row186 col6\" >{}</td>\n", - " <td id=\"T_0502a_row186_col7\" class=\"data row186 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row186_col8\" class=\"data row186 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row187_col0\" class=\"data row187 col0\" >validmind.unit_metrics.regression.KolmogorovSmirnovStatistic</td>\n", - " <td id=\"T_0502a_row187_col1\" class=\"data row187 col1\" >Kolmogorov Smirnov Statistic</td>\n", - " <td id=\"T_0502a_row187_col2\" class=\"data row187 col2\" >Calculates the Kolmogorov-Smirnov statistic for a regression model.</td>\n", - " <td id=\"T_0502a_row187_col3\" class=\"data row187 col3\" >False</td>\n", - " <td id=\"T_0502a_row187_col4\" class=\"data row187 col4\" >False</td>\n", - " <td id=\"T_0502a_row187_col5\" class=\"data row187 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row187_col6\" class=\"data row187 col6\" >{}</td>\n", - " <td id=\"T_0502a_row187_col7\" class=\"data row187 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row187_col8\" class=\"data row187 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row188_col0\" class=\"data row188 col0\" >validmind.unit_metrics.regression.MeanAbsoluteError</td>\n", - " <td id=\"T_0502a_row188_col1\" class=\"data row188 col1\" >Mean Absolute Error</td>\n", - " <td id=\"T_0502a_row188_col2\" class=\"data row188 col2\" >Calculates the mean absolute error for a regression model.</td>\n", - " <td id=\"T_0502a_row188_col3\" class=\"data row188 col3\" >False</td>\n", - " <td id=\"T_0502a_row188_col4\" class=\"data row188 col4\" 
>False</td>\n", - " <td id=\"T_0502a_row188_col5\" class=\"data row188 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row188_col6\" class=\"data row188 col6\" >{}</td>\n", - " <td id=\"T_0502a_row188_col7\" class=\"data row188 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row188_col8\" class=\"data row188 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row189_col0\" class=\"data row189 col0\" >validmind.unit_metrics.regression.MeanAbsolutePercentageError</td>\n", - " <td id=\"T_0502a_row189_col1\" class=\"data row189 col1\" >Mean Absolute Percentage Error</td>\n", - " <td id=\"T_0502a_row189_col2\" class=\"data row189 col2\" >Calculates the mean absolute percentage error for a regression model.</td>\n", - " <td id=\"T_0502a_row189_col3\" class=\"data row189 col3\" >False</td>\n", - " <td id=\"T_0502a_row189_col4\" class=\"data row189 col4\" >False</td>\n", - " <td id=\"T_0502a_row189_col5\" class=\"data row189 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row189_col6\" class=\"data row189 col6\" >{}</td>\n", - " <td id=\"T_0502a_row189_col7\" class=\"data row189 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row189_col8\" class=\"data row189 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row190_col0\" class=\"data row190 col0\" >validmind.unit_metrics.regression.MeanBiasDeviation</td>\n", - " <td id=\"T_0502a_row190_col1\" class=\"data row190 col1\" >Mean Bias Deviation</td>\n", - " <td id=\"T_0502a_row190_col2\" class=\"data row190 col2\" >Calculates the mean bias deviation for a regression model.</td>\n", - " <td id=\"T_0502a_row190_col3\" class=\"data row190 col3\" >False</td>\n", - " <td id=\"T_0502a_row190_col4\" class=\"data row190 col4\" >False</td>\n", - " <td id=\"T_0502a_row190_col5\" class=\"data row190 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row190_col6\" class=\"data row190 col6\" >{}</td>\n", - " <td id=\"T_0502a_row190_col7\" 
class=\"data row190 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row190_col8\" class=\"data row190 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row191_col0\" class=\"data row191 col0\" >validmind.unit_metrics.regression.MeanSquaredError</td>\n", - " <td id=\"T_0502a_row191_col1\" class=\"data row191 col1\" >Mean Squared Error</td>\n", - " <td id=\"T_0502a_row191_col2\" class=\"data row191 col2\" >Calculates the mean squared error for a regression model.</td>\n", - " <td id=\"T_0502a_row191_col3\" class=\"data row191 col3\" >False</td>\n", - " <td id=\"T_0502a_row191_col4\" class=\"data row191 col4\" >False</td>\n", - " <td id=\"T_0502a_row191_col5\" class=\"data row191 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row191_col6\" class=\"data row191 col6\" >{}</td>\n", - " <td id=\"T_0502a_row191_col7\" class=\"data row191 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row191_col8\" class=\"data row191 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row192_col0\" class=\"data row192 col0\" >validmind.unit_metrics.regression.QuantileLoss</td>\n", - " <td id=\"T_0502a_row192_col1\" class=\"data row192 col1\" >Quantile Loss</td>\n", - " <td id=\"T_0502a_row192_col2\" class=\"data row192 col2\" >Calculates the quantile loss for a regression model.</td>\n", - " <td id=\"T_0502a_row192_col3\" class=\"data row192 col3\" >False</td>\n", - " <td id=\"T_0502a_row192_col4\" class=\"data row192 col4\" >False</td>\n", - " <td id=\"T_0502a_row192_col5\" class=\"data row192 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row192_col6\" class=\"data row192 col6\" >{'quantile': {'type': '_empty', 'default': 0.5}}</td>\n", - " <td id=\"T_0502a_row192_col7\" class=\"data row192 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row192_col8\" class=\"data row192 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row193_col0\" class=\"data row193 col0\" 
>validmind.unit_metrics.regression.RSquaredScore</td>\n", - " <td id=\"T_0502a_row193_col1\" class=\"data row193 col1\" >R Squared Score</td>\n", - " <td id=\"T_0502a_row193_col2\" class=\"data row193 col2\" >Calculates the R-squared score for a regression model.</td>\n", - " <td id=\"T_0502a_row193_col3\" class=\"data row193 col3\" >False</td>\n", - " <td id=\"T_0502a_row193_col4\" class=\"data row193 col4\" >False</td>\n", - " <td id=\"T_0502a_row193_col5\" class=\"data row193 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row193_col6\" class=\"data row193 col6\" >{}</td>\n", - " <td id=\"T_0502a_row193_col7\" class=\"data row193 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row193_col8\" class=\"data row193 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row194_col0\" class=\"data row194 col0\" >validmind.unit_metrics.regression.RootMeanSquaredError</td>\n", - " <td id=\"T_0502a_row194_col1\" class=\"data row194 col1\" >Root Mean Squared Error</td>\n", - " <td id=\"T_0502a_row194_col2\" class=\"data row194 col2\" >Calculates the root mean squared error for a regression model.</td>\n", - " <td id=\"T_0502a_row194_col3\" class=\"data row194 col3\" >False</td>\n", - " <td id=\"T_0502a_row194_col4\" class=\"data row194 col4\" >False</td>\n", - " <td id=\"T_0502a_row194_col5\" class=\"data row194 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row194_col6\" class=\"data row194 col6\" >{}</td>\n", - " <td id=\"T_0502a_row194_col7\" class=\"data row194 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row194_col8\" class=\"data row194 col8\" >['regression']</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Explore tests\n", + "\n", + "Explore the individual out-of-the-box tests available in the ValidMind Library, and identify which tests to run to evaluate different aspects of your model. 
Browse available tests, view their descriptions, and filter by tags or task type to find tests relevant to your use case." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Install the ValidMind Library](#toc2__) \n", + "- [List all available tests](#toc3__) \n", + "- [Understand tags and task types](#toc4__) \n", + "- [Filter tests by tags and task types](#toc5__) \n", + "- [Store test sets for use](#toc6__) \n", + "- [Next steps](#toc7__) \n", + " - [Discover more learning resources](#toc7_1__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. 
If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x38000a670>" + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## List all available tests" ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tests()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Understand tags and task types\n", - "\n", - "Use [list_tasks()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks) to view all unique task types used to classify tests in the ValidMind Library.\n", - "\n", - "Understanding `task` types helps you filter tests that match your record's (such as a model) objective. For example:\n", - "\n", - "- **classification:** Works with Classification Models and Datasets.\n", - "- **regression:** Works with Regression Models and Datasets.\n", - "- **text classification:** Works with Text Classification Models and Datasets.\n", - "- **text summarization:** Works with Text Summarization Models and Datasets." 
- ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ + }, { - "data": { - "text/plain": [ - "['text_qa',\n", - " 'classification',\n", - " 'data_validation',\n", - " 'text_classification',\n", - " 'feature_extraction',\n", - " 'regression',\n", - " 'visualization',\n", - " 'clustering',\n", - " 'time_series_forecasting',\n", - " 'text_summarization',\n", - " 'nlp',\n", - " 'residual_analysis',\n", - " 'monitoring',\n", - " 'text_generation']" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Start by importing the functions from the [validmind.tests](https://docs.validmind.ai/validmind/validmind/tests.html) module for listing tests, listing tasks, listing tags, and listing tasks and tags to access these functions in the rest of this notebook:" ] - }, - "execution_count": 3, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tasks()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Use [list_tags()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tags) to view all unique tags used to describe tests in the ValidMind Library.\n", - "\n", - "`Tags` describe what a test applies to and help you filter tests for your use case. Examples include:\n", - "\n", - "- **llm:** Tests that work with Large Language Models.\n", - "- **nlp:** Tests relevant for natural language processing.\n", - "- **binary_classification:** Tests for binary classification tasks.\n", - "- **forecasting:** Tests for forecasting and time-series analysis.\n", - "- **tabular_data:** Tests for tabular data like CSVs and Excel spreadsheets." 
- ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.tests import (\n", + " list_tests,\n", + " list_tasks,\n", + " list_tags,\n", + " list_tasks_and_tags,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, { - "data": { - "text/plain": [ - "['senstivity_analysis',\n", - " 'calibration',\n", - " 'clustering',\n", - " 'anomaly_detection',\n", - " 'nlp',\n", - " 'classification_metrics',\n", - " 'dimensionality_reduction',\n", - " 'tabular_data',\n", - " 'time_series_data',\n", - " 'model_predictions',\n", - " 'feature_selection',\n", - " 'correlation',\n", - " 'frequency_analysis',\n", - " 'embeddings',\n", - " 'regression',\n", - " 'llm',\n", - " 'statsmodels',\n", - " 'ragas',\n", - " 'model_performance',\n", - " 'model_validation',\n", - " 'rag_performance',\n", - " 'model_training',\n", - " 'qualitative',\n", - " 'classification',\n", - " 'kmeans',\n", - " 'multiclass_classification',\n", - " 'linear_regression',\n", - " 'data_quality',\n", - " 'text_data',\n", - " 'binary_classification',\n", - " 'threshold_optimization',\n", - " 'stationarity',\n", - " 'bias_and_fairness',\n", - " 'scorecard',\n", - " 'model_explainability',\n", - " 'model_comparison',\n", - " 'numerical_data',\n", - " 'sklearn',\n", - " 'model_selection',\n", - " 'retrieval_performance',\n", - " 'zero_shot',\n", - " 'statistical_test',\n", - " 'descriptive_statistics',\n", - " 'seasonality',\n", - " 'analysis',\n", - " 'data_validation',\n", - " 'data_distribution',\n", - " 'feature_importance',\n", - " 'metadata',\n", - " 'few_shot',\n", - " 'visualization',\n", - " 'credit_risk',\n", - " 'forecasting',\n", - " 'AUC',\n", - " 'logistic_regression',\n", - " 'model_diagnosis',\n", - " 'model_interpretation',\n", - " 'unit_root_test',\n", - " 'categorical_data',\n", - " 'data_analysis']" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use 
[list_tests()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to retrieve all available ValidMind tests, which returns a DataFrame with the following columns:\n", + "\n", + "- **ID** – A unique identifier for each test.\n", + "- **Name** – The test’s name.\n", + "- **Description** – A short summary of what the test evaluates.\n", + "- **Tags** – Keywords that describe what the test does or applies to.\n", + "- **Tasks** – The type of modeling task the test supports." ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tags()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally, to match each task type with its related tags, use the [list_tasks_and_tags()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) function:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_ac294 th {\n", - " text-align: left;\n", - "}\n", - "#T_ac294_row0_col0, #T_ac294_row0_col1, #T_ac294_row1_col0, #T_ac294_row1_col1, #T_ac294_row2_col0, #T_ac294_row2_col1, #T_ac294_row3_col0, #T_ac294_row3_col1, #T_ac294_row4_col0, #T_ac294_row4_col1, #T_ac294_row5_col0, #T_ac294_row5_col1, #T_ac294_row6_col0, #T_ac294_row6_col1, #T_ac294_row7_col0, #T_ac294_row7_col1, #T_ac294_row8_col0, #T_ac294_row8_col1, #T_ac294_row9_col0, #T_ac294_row9_col1, #T_ac294_row10_col0, #T_ac294_row10_col1, #T_ac294_row11_col0, #T_ac294_row11_col1, #T_ac294_row12_col0, #T_ac294_row12_col1, #T_ac294_row13_col0, #T_ac294_row13_col1 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_ac294\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_ac294_level0_col0\" class=\"col_heading level0 col0\" >Task</th>\n", - " <th id=\"T_ac294_level0_col1\" class=\"col_heading level0 col1\" >Tags</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " 
<tr>\n", - " <td id=\"T_ac294_row0_col0\" class=\"data row0 col0\" >regression</td>\n", - " <td id=\"T_ac294_row0_col1\" class=\"data row0 col1\" >senstivity_analysis, tabular_data, time_series_data, model_predictions, feature_selection, correlation, regression, statsmodels, model_performance, model_training, multiclass_classification, linear_regression, data_quality, text_data, model_explainability, binary_classification, stationarity, bias_and_fairness, numerical_data, sklearn, model_selection, statistical_test, descriptive_statistics, seasonality, analysis, data_validation, data_distribution, metadata, feature_importance, visualization, forecasting, model_diagnosis, model_interpretation, unit_root_test, categorical_data, data_analysis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row1_col0\" class=\"data row1 col0\" >classification</td>\n", - " <td id=\"T_ac294_row1_col1\" class=\"data row1 col1\" >calibration, anomaly_detection, classification_metrics, tabular_data, time_series_data, feature_selection, correlation, statsmodels, model_performance, model_validation, model_training, classification, multiclass_classification, linear_regression, data_quality, text_data, binary_classification, threshold_optimization, bias_and_fairness, scorecard, model_comparison, numerical_data, sklearn, statistical_test, descriptive_statistics, feature_importance, data_distribution, metadata, visualization, credit_risk, AUC, logistic_regression, model_diagnosis, categorical_data, data_analysis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row2_col0\" class=\"data row2 col0\" >text_classification</td>\n", - " <td id=\"T_ac294_row2_col1\" class=\"data row2 col1\" >model_performance, feature_importance, multiclass_classification, few_shot, frequency_analysis, zero_shot, text_data, visualization, llm, binary_classification, ragas, model_diagnosis, model_comparison, sklearn, nlp, retrieval_performance, tabular_data, time_series_data</td>\n", - " </tr>\n", - " 
<tr>\n", - " <td id=\"T_ac294_row3_col0\" class=\"data row3 col0\" >text_summarization</td>\n", - " <td id=\"T_ac294_row3_col1\" class=\"data row3 col1\" >qualitative, few_shot, frequency_analysis, embeddings, zero_shot, text_data, visualization, llm, rag_performance, ragas, retrieval_performance, nlp, dimensionality_reduction, tabular_data, time_series_data</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row4_col0\" class=\"data row4 col0\" >data_validation</td>\n", - " <td id=\"T_ac294_row4_col1\" class=\"data row4 col1\" >stationarity, statsmodels, unit_root_test, time_series_data</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row5_col0\" class=\"data row5 col0\" >time_series_forecasting</td>\n", - " <td id=\"T_ac294_row5_col1\" class=\"data row5 col1\" >model_training, data_validation, metadata, visualization, model_explainability, sklearn, model_performance, model_predictions, time_series_data</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row6_col0\" class=\"data row6 col0\" >nlp</td>\n", - " <td id=\"T_ac294_row6_col1\" class=\"data row6 col1\" >data_validation, frequency_analysis, text_data, visualization, nlp</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row7_col0\" class=\"data row7 col0\" >clustering</td>\n", - " <td id=\"T_ac294_row7_col1\" class=\"data row7 col1\" >clustering, model_performance, kmeans, sklearn</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row8_col0\" class=\"data row8 col0\" >residual_analysis</td>\n", - " <td id=\"T_ac294_row8_col1\" class=\"data row8 col1\" >regression</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row9_col0\" class=\"data row9 col0\" >visualization</td>\n", - " <td id=\"T_ac294_row9_col1\" class=\"data row9 col1\" >regression</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row10_col0\" class=\"data row10 col0\" >feature_extraction</td>\n", - " <td id=\"T_ac294_row10_col1\" class=\"data row10 col1\" >embeddings, text_data, visualization, 
llm</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row11_col0\" class=\"data row11 col0\" >text_qa</td>\n", - " <td id=\"T_ac294_row11_col1\" class=\"data row11 col1\" >qualitative, embeddings, visualization, llm, rag_performance, ragas, dimensionality_reduction, retrieval_performance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row12_col0\" class=\"data row12 col0\" >text_generation</td>\n", - " <td id=\"T_ac294_row12_col1\" class=\"data row12 col1\" >qualitative, embeddings, visualization, llm, rag_performance, ragas, dimensionality_reduction, retrieval_performance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row13_col0\" class=\"data row13 col0\" >monitoring</td>\n", - " <td id=\"T_ac294_row13_col1\" class=\"data row13 col1\" >visualization</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "code", + "metadata": {}, + "source": [ + "list_tests()" ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x38000adc0>" + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_0502a th {\n", + " text-align: left;\n", + "}\n", + "#T_0502a_row0_col0, #T_0502a_row0_col1, #T_0502a_row0_col2, #T_0502a_row0_col3, #T_0502a_row0_col4, #T_0502a_row0_col5, #T_0502a_row0_col6, #T_0502a_row0_col7, #T_0502a_row0_col8, #T_0502a_row1_col0, #T_0502a_row1_col1, #T_0502a_row1_col2, #T_0502a_row1_col3, #T_0502a_row1_col4, #T_0502a_row1_col5, #T_0502a_row1_col6, #T_0502a_row1_col7, #T_0502a_row1_col8, #T_0502a_row2_col0, #T_0502a_row2_col1, #T_0502a_row2_col2, #T_0502a_row2_col3, #T_0502a_row2_col4, #T_0502a_row2_col5, #T_0502a_row2_col6, #T_0502a_row2_col7, #T_0502a_row2_col8, #T_0502a_row3_col0, #T_0502a_row3_col1, #T_0502a_row3_col2, #T_0502a_row3_col3, #T_0502a_row3_col4, #T_0502a_row3_col5, #T_0502a_row3_col6, #T_0502a_row3_col7, #T_0502a_row3_col8, #T_0502a_row4_col0, #T_0502a_row4_col1, #T_0502a_row4_col2, #T_0502a_row4_col3, 
#T_0502a_row4_col4, #T_0502a_row4_col5, #T_0502a_row4_col6, #T_0502a_row4_col7, #T_0502a_row4_col8, #T_0502a_row5_col0, #T_0502a_row5_col1, #T_0502a_row5_col2, #T_0502a_row5_col3, #T_0502a_row5_col4, #T_0502a_row5_col5, #T_0502a_row5_col6, #T_0502a_row5_col7, #T_0502a_row5_col8, #T_0502a_row6_col0, #T_0502a_row6_col1, #T_0502a_row6_col2, #T_0502a_row6_col3, #T_0502a_row6_col4, #T_0502a_row6_col5, #T_0502a_row6_col6, #T_0502a_row6_col7, #T_0502a_row6_col8, #T_0502a_row7_col0, #T_0502a_row7_col1, #T_0502a_row7_col2, #T_0502a_row7_col3, #T_0502a_row7_col4, #T_0502a_row7_col5, #T_0502a_row7_col6, #T_0502a_row7_col7, #T_0502a_row7_col8, #T_0502a_row8_col0, #T_0502a_row8_col1, #T_0502a_row8_col2, #T_0502a_row8_col3, #T_0502a_row8_col4, #T_0502a_row8_col5, #T_0502a_row8_col6, #T_0502a_row8_col7, #T_0502a_row8_col8, #T_0502a_row9_col0, #T_0502a_row9_col1, #T_0502a_row9_col2, #T_0502a_row9_col3, #T_0502a_row9_col4, #T_0502a_row9_col5, #T_0502a_row9_col6, #T_0502a_row9_col7, #T_0502a_row9_col8, #T_0502a_row10_col0, #T_0502a_row10_col1, #T_0502a_row10_col2, #T_0502a_row10_col3, #T_0502a_row10_col4, #T_0502a_row10_col5, #T_0502a_row10_col6, #T_0502a_row10_col7, #T_0502a_row10_col8, #T_0502a_row11_col0, #T_0502a_row11_col1, #T_0502a_row11_col2, #T_0502a_row11_col3, #T_0502a_row11_col4, #T_0502a_row11_col5, #T_0502a_row11_col6, #T_0502a_row11_col7, #T_0502a_row11_col8, #T_0502a_row12_col0, #T_0502a_row12_col1, #T_0502a_row12_col2, #T_0502a_row12_col3, #T_0502a_row12_col4, #T_0502a_row12_col5, #T_0502a_row12_col6, #T_0502a_row12_col7, #T_0502a_row12_col8, #T_0502a_row13_col0, #T_0502a_row13_col1, #T_0502a_row13_col2, #T_0502a_row13_col3, #T_0502a_row13_col4, #T_0502a_row13_col5, #T_0502a_row13_col6, #T_0502a_row13_col7, #T_0502a_row13_col8, #T_0502a_row14_col0, #T_0502a_row14_col1, #T_0502a_row14_col2, #T_0502a_row14_col3, #T_0502a_row14_col4, #T_0502a_row14_col5, #T_0502a_row14_col6, #T_0502a_row14_col7, #T_0502a_row14_col8, #T_0502a_row15_col0, #T_0502a_row15_col1, 
#T_0502a_row15_col2, #T_0502a_row15_col3, #T_0502a_row15_col4, #T_0502a_row15_col5, #T_0502a_row15_col6, #T_0502a_row15_col7, #T_0502a_row15_col8, #T_0502a_row16_col0, #T_0502a_row16_col1, #T_0502a_row16_col2, #T_0502a_row16_col3, #T_0502a_row16_col4, #T_0502a_row16_col5, #T_0502a_row16_col6, #T_0502a_row16_col7, #T_0502a_row16_col8, #T_0502a_row17_col0, #T_0502a_row17_col1, #T_0502a_row17_col2, #T_0502a_row17_col3, #T_0502a_row17_col4, #T_0502a_row17_col5, #T_0502a_row17_col6, #T_0502a_row17_col7, #T_0502a_row17_col8, #T_0502a_row18_col0, #T_0502a_row18_col1, #T_0502a_row18_col2, #T_0502a_row18_col3, #T_0502a_row18_col4, #T_0502a_row18_col5, #T_0502a_row18_col6, #T_0502a_row18_col7, #T_0502a_row18_col8, #T_0502a_row19_col0, #T_0502a_row19_col1, #T_0502a_row19_col2, #T_0502a_row19_col3, #T_0502a_row19_col4, #T_0502a_row19_col5, #T_0502a_row19_col6, #T_0502a_row19_col7, #T_0502a_row19_col8, #T_0502a_row20_col0, #T_0502a_row20_col1, #T_0502a_row20_col2, #T_0502a_row20_col3, #T_0502a_row20_col4, #T_0502a_row20_col5, #T_0502a_row20_col6, #T_0502a_row20_col7, #T_0502a_row20_col8, #T_0502a_row21_col0, #T_0502a_row21_col1, #T_0502a_row21_col2, #T_0502a_row21_col3, #T_0502a_row21_col4, #T_0502a_row21_col5, #T_0502a_row21_col6, #T_0502a_row21_col7, #T_0502a_row21_col8, #T_0502a_row22_col0, #T_0502a_row22_col1, #T_0502a_row22_col2, #T_0502a_row22_col3, #T_0502a_row22_col4, #T_0502a_row22_col5, #T_0502a_row22_col6, #T_0502a_row22_col7, #T_0502a_row22_col8, #T_0502a_row23_col0, #T_0502a_row23_col1, #T_0502a_row23_col2, #T_0502a_row23_col3, #T_0502a_row23_col4, #T_0502a_row23_col5, #T_0502a_row23_col6, #T_0502a_row23_col7, #T_0502a_row23_col8, #T_0502a_row24_col0, #T_0502a_row24_col1, #T_0502a_row24_col2, #T_0502a_row24_col3, #T_0502a_row24_col4, #T_0502a_row24_col5, #T_0502a_row24_col6, #T_0502a_row24_col7, #T_0502a_row24_col8, #T_0502a_row25_col0, #T_0502a_row25_col1, #T_0502a_row25_col2, #T_0502a_row25_col3, #T_0502a_row25_col4, #T_0502a_row25_col5, #T_0502a_row25_col6, 
#T_0502a_row25_col7, #T_0502a_row25_col8, #T_0502a_row26_col0, #T_0502a_row26_col1, #T_0502a_row26_col2, #T_0502a_row26_col3, #T_0502a_row26_col4, #T_0502a_row26_col5, #T_0502a_row26_col6, #T_0502a_row26_col7, #T_0502a_row26_col8, #T_0502a_row27_col0, #T_0502a_row27_col1, #T_0502a_row27_col2, #T_0502a_row27_col3, #T_0502a_row27_col4, #T_0502a_row27_col5, #T_0502a_row27_col6, #T_0502a_row27_col7, #T_0502a_row27_col8, #T_0502a_row28_col0, #T_0502a_row28_col1, #T_0502a_row28_col2, #T_0502a_row28_col3, #T_0502a_row28_col4, #T_0502a_row28_col5, #T_0502a_row28_col6, #T_0502a_row28_col7, #T_0502a_row28_col8, #T_0502a_row29_col0, #T_0502a_row29_col1, #T_0502a_row29_col2, #T_0502a_row29_col3, #T_0502a_row29_col4, #T_0502a_row29_col5, #T_0502a_row29_col6, #T_0502a_row29_col7, #T_0502a_row29_col8, #T_0502a_row30_col0, #T_0502a_row30_col1, #T_0502a_row30_col2, #T_0502a_row30_col3, #T_0502a_row30_col4, #T_0502a_row30_col5, #T_0502a_row30_col6, #T_0502a_row30_col7, #T_0502a_row30_col8, #T_0502a_row31_col0, #T_0502a_row31_col1, #T_0502a_row31_col2, #T_0502a_row31_col3, #T_0502a_row31_col4, #T_0502a_row31_col5, #T_0502a_row31_col6, #T_0502a_row31_col7, #T_0502a_row31_col8, #T_0502a_row32_col0, #T_0502a_row32_col1, #T_0502a_row32_col2, #T_0502a_row32_col3, #T_0502a_row32_col4, #T_0502a_row32_col5, #T_0502a_row32_col6, #T_0502a_row32_col7, #T_0502a_row32_col8, #T_0502a_row33_col0, #T_0502a_row33_col1, #T_0502a_row33_col2, #T_0502a_row33_col3, #T_0502a_row33_col4, #T_0502a_row33_col5, #T_0502a_row33_col6, #T_0502a_row33_col7, #T_0502a_row33_col8, #T_0502a_row34_col0, #T_0502a_row34_col1, #T_0502a_row34_col2, #T_0502a_row34_col3, #T_0502a_row34_col4, #T_0502a_row34_col5, #T_0502a_row34_col6, #T_0502a_row34_col7, #T_0502a_row34_col8, #T_0502a_row35_col0, #T_0502a_row35_col1, #T_0502a_row35_col2, #T_0502a_row35_col3, #T_0502a_row35_col4, #T_0502a_row35_col5, #T_0502a_row35_col6, #T_0502a_row35_col7, #T_0502a_row35_col8, #T_0502a_row36_col0, #T_0502a_row36_col1, #T_0502a_row36_col2, 
#T_0502a_row36_col3, #T_0502a_row36_col4, #T_0502a_row36_col5, #T_0502a_row36_col6, #T_0502a_row36_col7, #T_0502a_row36_col8, #T_0502a_row37_col0, #T_0502a_row37_col1, #T_0502a_row37_col2, #T_0502a_row37_col3, #T_0502a_row37_col4, #T_0502a_row37_col5, #T_0502a_row37_col6, #T_0502a_row37_col7, #T_0502a_row37_col8, #T_0502a_row38_col0, #T_0502a_row38_col1, #T_0502a_row38_col2, #T_0502a_row38_col3, #T_0502a_row38_col4, #T_0502a_row38_col5, #T_0502a_row38_col6, #T_0502a_row38_col7, #T_0502a_row38_col8, #T_0502a_row39_col0, #T_0502a_row39_col1, #T_0502a_row39_col2, #T_0502a_row39_col3, #T_0502a_row39_col4, #T_0502a_row39_col5, #T_0502a_row39_col6, #T_0502a_row39_col7, #T_0502a_row39_col8, #T_0502a_row40_col0, #T_0502a_row40_col1, #T_0502a_row40_col2, #T_0502a_row40_col3, #T_0502a_row40_col4, #T_0502a_row40_col5, #T_0502a_row40_col6, #T_0502a_row40_col7, #T_0502a_row40_col8, #T_0502a_row41_col0, #T_0502a_row41_col1, #T_0502a_row41_col2, #T_0502a_row41_col3, #T_0502a_row41_col4, #T_0502a_row41_col5, #T_0502a_row41_col6, #T_0502a_row41_col7, #T_0502a_row41_col8, #T_0502a_row42_col0, #T_0502a_row42_col1, #T_0502a_row42_col2, #T_0502a_row42_col3, #T_0502a_row42_col4, #T_0502a_row42_col5, #T_0502a_row42_col6, #T_0502a_row42_col7, #T_0502a_row42_col8, #T_0502a_row43_col0, #T_0502a_row43_col1, #T_0502a_row43_col2, #T_0502a_row43_col3, #T_0502a_row43_col4, #T_0502a_row43_col5, #T_0502a_row43_col6, #T_0502a_row43_col7, #T_0502a_row43_col8, #T_0502a_row44_col0, #T_0502a_row44_col1, #T_0502a_row44_col2, #T_0502a_row44_col3, #T_0502a_row44_col4, #T_0502a_row44_col5, #T_0502a_row44_col6, #T_0502a_row44_col7, #T_0502a_row44_col8, #T_0502a_row45_col0, #T_0502a_row45_col1, #T_0502a_row45_col2, #T_0502a_row45_col3, #T_0502a_row45_col4, #T_0502a_row45_col5, #T_0502a_row45_col6, #T_0502a_row45_col7, #T_0502a_row45_col8, #T_0502a_row46_col0, #T_0502a_row46_col1, #T_0502a_row46_col2, #T_0502a_row46_col3, #T_0502a_row46_col4, #T_0502a_row46_col5, #T_0502a_row46_col6, #T_0502a_row46_col7, 
#T_0502a_row46_col8, #T_0502a_row47_col0, #T_0502a_row47_col1, #T_0502a_row47_col2, #T_0502a_row47_col3, #T_0502a_row47_col4, #T_0502a_row47_col5, #T_0502a_row47_col6, #T_0502a_row47_col7, #T_0502a_row47_col8, #T_0502a_row48_col0, #T_0502a_row48_col1, #T_0502a_row48_col2, #T_0502a_row48_col3, #T_0502a_row48_col4, #T_0502a_row48_col5, #T_0502a_row48_col6, #T_0502a_row48_col7, #T_0502a_row48_col8, #T_0502a_row49_col0, #T_0502a_row49_col1, #T_0502a_row49_col2, #T_0502a_row49_col3, #T_0502a_row49_col4, #T_0502a_row49_col5, #T_0502a_row49_col6, #T_0502a_row49_col7, #T_0502a_row49_col8, #T_0502a_row50_col0, #T_0502a_row50_col1, #T_0502a_row50_col2, #T_0502a_row50_col3, #T_0502a_row50_col4, #T_0502a_row50_col5, #T_0502a_row50_col6, #T_0502a_row50_col7, #T_0502a_row50_col8, #T_0502a_row51_col0, #T_0502a_row51_col1, #T_0502a_row51_col2, #T_0502a_row51_col3, #T_0502a_row51_col4, #T_0502a_row51_col5, #T_0502a_row51_col6, #T_0502a_row51_col7, #T_0502a_row51_col8, #T_0502a_row52_col0, #T_0502a_row52_col1, #T_0502a_row52_col2, #T_0502a_row52_col3, #T_0502a_row52_col4, #T_0502a_row52_col5, #T_0502a_row52_col6, #T_0502a_row52_col7, #T_0502a_row52_col8, #T_0502a_row53_col0, #T_0502a_row53_col1, #T_0502a_row53_col2, #T_0502a_row53_col3, #T_0502a_row53_col4, #T_0502a_row53_col5, #T_0502a_row53_col6, #T_0502a_row53_col7, #T_0502a_row53_col8, #T_0502a_row54_col0, #T_0502a_row54_col1, #T_0502a_row54_col2, #T_0502a_row54_col3, #T_0502a_row54_col4, #T_0502a_row54_col5, #T_0502a_row54_col6, #T_0502a_row54_col7, #T_0502a_row54_col8, #T_0502a_row55_col0, #T_0502a_row55_col1, #T_0502a_row55_col2, #T_0502a_row55_col3, #T_0502a_row55_col4, #T_0502a_row55_col5, #T_0502a_row55_col6, #T_0502a_row55_col7, #T_0502a_row55_col8, #T_0502a_row56_col0, #T_0502a_row56_col1, #T_0502a_row56_col2, #T_0502a_row56_col3, #T_0502a_row56_col4, #T_0502a_row56_col5, #T_0502a_row56_col6, #T_0502a_row56_col7, #T_0502a_row56_col8, #T_0502a_row57_col0, #T_0502a_row57_col1, #T_0502a_row57_col2, #T_0502a_row57_col3, 
#T_0502a_row57_col4, #T_0502a_row57_col5, #T_0502a_row57_col6, #T_0502a_row57_col7, #T_0502a_row57_col8, #T_0502a_row58_col0, #T_0502a_row58_col1, #T_0502a_row58_col2, #T_0502a_row58_col3, #T_0502a_row58_col4, #T_0502a_row58_col5, #T_0502a_row58_col6, #T_0502a_row58_col7, #T_0502a_row58_col8, #T_0502a_row59_col0, #T_0502a_row59_col1, #T_0502a_row59_col2, #T_0502a_row59_col3, #T_0502a_row59_col4, #T_0502a_row59_col5, #T_0502a_row59_col6, #T_0502a_row59_col7, #T_0502a_row59_col8, #T_0502a_row60_col0, #T_0502a_row60_col1, #T_0502a_row60_col2, #T_0502a_row60_col3, #T_0502a_row60_col4, #T_0502a_row60_col5, #T_0502a_row60_col6, #T_0502a_row60_col7, #T_0502a_row60_col8, #T_0502a_row61_col0, #T_0502a_row61_col1, #T_0502a_row61_col2, #T_0502a_row61_col3, #T_0502a_row61_col4, #T_0502a_row61_col5, #T_0502a_row61_col6, #T_0502a_row61_col7, #T_0502a_row61_col8, #T_0502a_row62_col0, #T_0502a_row62_col1, #T_0502a_row62_col2, #T_0502a_row62_col3, #T_0502a_row62_col4, #T_0502a_row62_col5, #T_0502a_row62_col6, #T_0502a_row62_col7, #T_0502a_row62_col8, #T_0502a_row63_col0, #T_0502a_row63_col1, #T_0502a_row63_col2, #T_0502a_row63_col3, #T_0502a_row63_col4, #T_0502a_row63_col5, #T_0502a_row63_col6, #T_0502a_row63_col7, #T_0502a_row63_col8, #T_0502a_row64_col0, #T_0502a_row64_col1, #T_0502a_row64_col2, #T_0502a_row64_col3, #T_0502a_row64_col4, #T_0502a_row64_col5, #T_0502a_row64_col6, #T_0502a_row64_col7, #T_0502a_row64_col8, #T_0502a_row65_col0, #T_0502a_row65_col1, #T_0502a_row65_col2, #T_0502a_row65_col3, #T_0502a_row65_col4, #T_0502a_row65_col5, #T_0502a_row65_col6, #T_0502a_row65_col7, #T_0502a_row65_col8, #T_0502a_row66_col0, #T_0502a_row66_col1, #T_0502a_row66_col2, #T_0502a_row66_col3, #T_0502a_row66_col4, #T_0502a_row66_col5, #T_0502a_row66_col6, #T_0502a_row66_col7, #T_0502a_row66_col8, #T_0502a_row67_col0, #T_0502a_row67_col1, #T_0502a_row67_col2, #T_0502a_row67_col3, #T_0502a_row67_col4, #T_0502a_row67_col5, #T_0502a_row67_col6, #T_0502a_row67_col7, #T_0502a_row67_col8, 
#T_0502a_row68_col0, #T_0502a_row68_col1, #T_0502a_row68_col2, #T_0502a_row68_col3, #T_0502a_row68_col4, #T_0502a_row68_col5, #T_0502a_row68_col6, #T_0502a_row68_col7, #T_0502a_row68_col8, #T_0502a_row69_col0, #T_0502a_row69_col1, #T_0502a_row69_col2, #T_0502a_row69_col3, #T_0502a_row69_col4, #T_0502a_row69_col5, #T_0502a_row69_col6, #T_0502a_row69_col7, #T_0502a_row69_col8, #T_0502a_row70_col0, #T_0502a_row70_col1, #T_0502a_row70_col2, #T_0502a_row70_col3, #T_0502a_row70_col4, #T_0502a_row70_col5, #T_0502a_row70_col6, #T_0502a_row70_col7, #T_0502a_row70_col8, #T_0502a_row71_col0, #T_0502a_row71_col1, #T_0502a_row71_col2, #T_0502a_row71_col3, #T_0502a_row71_col4, #T_0502a_row71_col5, #T_0502a_row71_col6, #T_0502a_row71_col7, #T_0502a_row71_col8, #T_0502a_row72_col0, #T_0502a_row72_col1, #T_0502a_row72_col2, #T_0502a_row72_col3, #T_0502a_row72_col4, #T_0502a_row72_col5, #T_0502a_row72_col6, #T_0502a_row72_col7, #T_0502a_row72_col8, #T_0502a_row73_col0, #T_0502a_row73_col1, #T_0502a_row73_col2, #T_0502a_row73_col3, #T_0502a_row73_col4, #T_0502a_row73_col5, #T_0502a_row73_col6, #T_0502a_row73_col7, #T_0502a_row73_col8, #T_0502a_row74_col0, #T_0502a_row74_col1, #T_0502a_row74_col2, #T_0502a_row74_col3, #T_0502a_row74_col4, #T_0502a_row74_col5, #T_0502a_row74_col6, #T_0502a_row74_col7, #T_0502a_row74_col8, #T_0502a_row75_col0, #T_0502a_row75_col1, #T_0502a_row75_col2, #T_0502a_row75_col3, #T_0502a_row75_col4, #T_0502a_row75_col5, #T_0502a_row75_col6, #T_0502a_row75_col7, #T_0502a_row75_col8, #T_0502a_row76_col0, #T_0502a_row76_col1, #T_0502a_row76_col2, #T_0502a_row76_col3, #T_0502a_row76_col4, #T_0502a_row76_col5, #T_0502a_row76_col6, #T_0502a_row76_col7, #T_0502a_row76_col8, #T_0502a_row77_col0, #T_0502a_row77_col1, #T_0502a_row77_col2, #T_0502a_row77_col3, #T_0502a_row77_col4, #T_0502a_row77_col5, #T_0502a_row77_col6, #T_0502a_row77_col7, #T_0502a_row77_col8, #T_0502a_row78_col0, #T_0502a_row78_col1, #T_0502a_row78_col2, #T_0502a_row78_col3, #T_0502a_row78_col4, 
#T_0502a_row78_col5, #T_0502a_row78_col6, #T_0502a_row78_col7, #T_0502a_row78_col8, #T_0502a_row79_col0, #T_0502a_row79_col1, #T_0502a_row79_col2, #T_0502a_row79_col3, #T_0502a_row79_col4, #T_0502a_row79_col5, #T_0502a_row79_col6, #T_0502a_row79_col7, #T_0502a_row79_col8, #T_0502a_row80_col0, #T_0502a_row80_col1, #T_0502a_row80_col2, #T_0502a_row80_col3, #T_0502a_row80_col4, #T_0502a_row80_col5, #T_0502a_row80_col6, #T_0502a_row80_col7, #T_0502a_row80_col8, #T_0502a_row81_col0, #T_0502a_row81_col1, #T_0502a_row81_col2, #T_0502a_row81_col3, #T_0502a_row81_col4, #T_0502a_row81_col5, #T_0502a_row81_col6, #T_0502a_row81_col7, #T_0502a_row81_col8, #T_0502a_row82_col0, #T_0502a_row82_col1, #T_0502a_row82_col2, #T_0502a_row82_col3, #T_0502a_row82_col4, #T_0502a_row82_col5, #T_0502a_row82_col6, #T_0502a_row82_col7, #T_0502a_row82_col8, #T_0502a_row83_col0, #T_0502a_row83_col1, #T_0502a_row83_col2, #T_0502a_row83_col3, #T_0502a_row83_col4, #T_0502a_row83_col5, #T_0502a_row83_col6, #T_0502a_row83_col7, #T_0502a_row83_col8, #T_0502a_row84_col0, #T_0502a_row84_col1, #T_0502a_row84_col2, #T_0502a_row84_col3, #T_0502a_row84_col4, #T_0502a_row84_col5, #T_0502a_row84_col6, #T_0502a_row84_col7, #T_0502a_row84_col8, #T_0502a_row85_col0, #T_0502a_row85_col1, #T_0502a_row85_col2, #T_0502a_row85_col3, #T_0502a_row85_col4, #T_0502a_row85_col5, #T_0502a_row85_col6, #T_0502a_row85_col7, #T_0502a_row85_col8, #T_0502a_row86_col0, #T_0502a_row86_col1, #T_0502a_row86_col2, #T_0502a_row86_col3, #T_0502a_row86_col4, #T_0502a_row86_col5, #T_0502a_row86_col6, #T_0502a_row86_col7, #T_0502a_row86_col8, #T_0502a_row87_col0, #T_0502a_row87_col1, #T_0502a_row87_col2, #T_0502a_row87_col3, #T_0502a_row87_col4, #T_0502a_row87_col5, #T_0502a_row87_col6, #T_0502a_row87_col7, #T_0502a_row87_col8, #T_0502a_row88_col0, #T_0502a_row88_col1, #T_0502a_row88_col2, #T_0502a_row88_col3, #T_0502a_row88_col4, #T_0502a_row88_col5, #T_0502a_row88_col6, #T_0502a_row88_col7, #T_0502a_row88_col8, #T_0502a_row89_col0, 
#T_0502a_row89_col1, #T_0502a_row89_col2, #T_0502a_row89_col3, #T_0502a_row89_col4, #T_0502a_row89_col5, #T_0502a_row89_col6, #T_0502a_row89_col7, #T_0502a_row89_col8, #T_0502a_row90_col0, #T_0502a_row90_col1, #T_0502a_row90_col2, #T_0502a_row90_col3, #T_0502a_row90_col4, #T_0502a_row90_col5, #T_0502a_row90_col6, #T_0502a_row90_col7, #T_0502a_row90_col8, #T_0502a_row91_col0, #T_0502a_row91_col1, #T_0502a_row91_col2, #T_0502a_row91_col3, #T_0502a_row91_col4, #T_0502a_row91_col5, #T_0502a_row91_col6, #T_0502a_row91_col7, #T_0502a_row91_col8, #T_0502a_row92_col0, #T_0502a_row92_col1, #T_0502a_row92_col2, #T_0502a_row92_col3, #T_0502a_row92_col4, #T_0502a_row92_col5, #T_0502a_row92_col6, #T_0502a_row92_col7, #T_0502a_row92_col8, #T_0502a_row93_col0, #T_0502a_row93_col1, #T_0502a_row93_col2, #T_0502a_row93_col3, #T_0502a_row93_col4, #T_0502a_row93_col5, #T_0502a_row93_col6, #T_0502a_row93_col7, #T_0502a_row93_col8, #T_0502a_row94_col0, #T_0502a_row94_col1, #T_0502a_row94_col2, #T_0502a_row94_col3, #T_0502a_row94_col4, #T_0502a_row94_col5, #T_0502a_row94_col6, #T_0502a_row94_col7, #T_0502a_row94_col8, #T_0502a_row95_col0, #T_0502a_row95_col1, #T_0502a_row95_col2, #T_0502a_row95_col3, #T_0502a_row95_col4, #T_0502a_row95_col5, #T_0502a_row95_col6, #T_0502a_row95_col7, #T_0502a_row95_col8, #T_0502a_row96_col0, #T_0502a_row96_col1, #T_0502a_row96_col2, #T_0502a_row96_col3, #T_0502a_row96_col4, #T_0502a_row96_col5, #T_0502a_row96_col6, #T_0502a_row96_col7, #T_0502a_row96_col8, #T_0502a_row97_col0, #T_0502a_row97_col1, #T_0502a_row97_col2, #T_0502a_row97_col3, #T_0502a_row97_col4, #T_0502a_row97_col5, #T_0502a_row97_col6, #T_0502a_row97_col7, #T_0502a_row97_col8, #T_0502a_row98_col0, #T_0502a_row98_col1, #T_0502a_row98_col2, #T_0502a_row98_col3, #T_0502a_row98_col4, #T_0502a_row98_col5, #T_0502a_row98_col6, #T_0502a_row98_col7, #T_0502a_row98_col8, #T_0502a_row99_col0, #T_0502a_row99_col1, #T_0502a_row99_col2, #T_0502a_row99_col3, #T_0502a_row99_col4, #T_0502a_row99_col5, 
#T_0502a_row99_col6, #T_0502a_row99_col7, #T_0502a_row99_col8, #T_0502a_row100_col0, #T_0502a_row100_col1, #T_0502a_row100_col2, #T_0502a_row100_col3, #T_0502a_row100_col4, #T_0502a_row100_col5, #T_0502a_row100_col6, #T_0502a_row100_col7, #T_0502a_row100_col8, #T_0502a_row101_col0, #T_0502a_row101_col1, #T_0502a_row101_col2, #T_0502a_row101_col3, #T_0502a_row101_col4, #T_0502a_row101_col5, #T_0502a_row101_col6, #T_0502a_row101_col7, #T_0502a_row101_col8, #T_0502a_row102_col0, #T_0502a_row102_col1, #T_0502a_row102_col2, #T_0502a_row102_col3, #T_0502a_row102_col4, #T_0502a_row102_col5, #T_0502a_row102_col6, #T_0502a_row102_col7, #T_0502a_row102_col8, #T_0502a_row103_col0, #T_0502a_row103_col1, #T_0502a_row103_col2, #T_0502a_row103_col3, #T_0502a_row103_col4, #T_0502a_row103_col5, #T_0502a_row103_col6, #T_0502a_row103_col7, #T_0502a_row103_col8, #T_0502a_row104_col0, #T_0502a_row104_col1, #T_0502a_row104_col2, #T_0502a_row104_col3, #T_0502a_row104_col4, #T_0502a_row104_col5, #T_0502a_row104_col6, #T_0502a_row104_col7, #T_0502a_row104_col8, #T_0502a_row105_col0, #T_0502a_row105_col1, #T_0502a_row105_col2, #T_0502a_row105_col3, #T_0502a_row105_col4, #T_0502a_row105_col5, #T_0502a_row105_col6, #T_0502a_row105_col7, #T_0502a_row105_col8, #T_0502a_row106_col0, #T_0502a_row106_col1, #T_0502a_row106_col2, #T_0502a_row106_col3, #T_0502a_row106_col4, #T_0502a_row106_col5, #T_0502a_row106_col6, #T_0502a_row106_col7, #T_0502a_row106_col8, #T_0502a_row107_col0, #T_0502a_row107_col1, #T_0502a_row107_col2, #T_0502a_row107_col3, #T_0502a_row107_col4, #T_0502a_row107_col5, #T_0502a_row107_col6, #T_0502a_row107_col7, #T_0502a_row107_col8, #T_0502a_row108_col0, #T_0502a_row108_col1, #T_0502a_row108_col2, #T_0502a_row108_col3, #T_0502a_row108_col4, #T_0502a_row108_col5, #T_0502a_row108_col6, #T_0502a_row108_col7, #T_0502a_row108_col8, #T_0502a_row109_col0, #T_0502a_row109_col1, #T_0502a_row109_col2, #T_0502a_row109_col3, #T_0502a_row109_col4, #T_0502a_row109_col5, #T_0502a_row109_col6, 
#T_0502a_row109_col7, #T_0502a_row109_col8, #T_0502a_row110_col0, #T_0502a_row110_col1, #T_0502a_row110_col2, #T_0502a_row110_col3, #T_0502a_row110_col4, #T_0502a_row110_col5, #T_0502a_row110_col6, #T_0502a_row110_col7, #T_0502a_row110_col8, #T_0502a_row111_col0, #T_0502a_row111_col1, #T_0502a_row111_col2, #T_0502a_row111_col3, #T_0502a_row111_col4, #T_0502a_row111_col5, #T_0502a_row111_col6, #T_0502a_row111_col7, #T_0502a_row111_col8, #T_0502a_row112_col0, #T_0502a_row112_col1, #T_0502a_row112_col2, #T_0502a_row112_col3, #T_0502a_row112_col4, #T_0502a_row112_col5, #T_0502a_row112_col6, #T_0502a_row112_col7, #T_0502a_row112_col8, #T_0502a_row113_col0, #T_0502a_row113_col1, #T_0502a_row113_col2, #T_0502a_row113_col3, #T_0502a_row113_col4, #T_0502a_row113_col5, #T_0502a_row113_col6, #T_0502a_row113_col7, #T_0502a_row113_col8, #T_0502a_row114_col0, #T_0502a_row114_col1, #T_0502a_row114_col2, #T_0502a_row114_col3, #T_0502a_row114_col4, #T_0502a_row114_col5, #T_0502a_row114_col6, #T_0502a_row114_col7, #T_0502a_row114_col8, #T_0502a_row115_col0, #T_0502a_row115_col1, #T_0502a_row115_col2, #T_0502a_row115_col3, #T_0502a_row115_col4, #T_0502a_row115_col5, #T_0502a_row115_col6, #T_0502a_row115_col7, #T_0502a_row115_col8, #T_0502a_row116_col0, #T_0502a_row116_col1, #T_0502a_row116_col2, #T_0502a_row116_col3, #T_0502a_row116_col4, #T_0502a_row116_col5, #T_0502a_row116_col6, #T_0502a_row116_col7, #T_0502a_row116_col8, #T_0502a_row117_col0, #T_0502a_row117_col1, #T_0502a_row117_col2, #T_0502a_row117_col3, #T_0502a_row117_col4, #T_0502a_row117_col5, #T_0502a_row117_col6, #T_0502a_row117_col7, #T_0502a_row117_col8, #T_0502a_row118_col0, #T_0502a_row118_col1, #T_0502a_row118_col2, #T_0502a_row118_col3, #T_0502a_row118_col4, #T_0502a_row118_col5, #T_0502a_row118_col6, #T_0502a_row118_col7, #T_0502a_row118_col8, #T_0502a_row119_col0, #T_0502a_row119_col1, #T_0502a_row119_col2, #T_0502a_row119_col3, #T_0502a_row119_col4, #T_0502a_row119_col5, #T_0502a_row119_col6, 
#T_0502a_row119_col7, #T_0502a_row119_col8, #T_0502a_row120_col0, #T_0502a_row120_col1, #T_0502a_row120_col2, #T_0502a_row120_col3, #T_0502a_row120_col4, #T_0502a_row120_col5, #T_0502a_row120_col6, #T_0502a_row120_col7, #T_0502a_row120_col8, #T_0502a_row121_col0, #T_0502a_row121_col1, #T_0502a_row121_col2, #T_0502a_row121_col3, #T_0502a_row121_col4, #T_0502a_row121_col5, #T_0502a_row121_col6, #T_0502a_row121_col7, #T_0502a_row121_col8, #T_0502a_row122_col0, #T_0502a_row122_col1, #T_0502a_row122_col2, #T_0502a_row122_col3, #T_0502a_row122_col4, #T_0502a_row122_col5, #T_0502a_row122_col6, #T_0502a_row122_col7, #T_0502a_row122_col8, #T_0502a_row123_col0, #T_0502a_row123_col1, #T_0502a_row123_col2, #T_0502a_row123_col3, #T_0502a_row123_col4, #T_0502a_row123_col5, #T_0502a_row123_col6, #T_0502a_row123_col7, #T_0502a_row123_col8, #T_0502a_row124_col0, #T_0502a_row124_col1, #T_0502a_row124_col2, #T_0502a_row124_col3, #T_0502a_row124_col4, #T_0502a_row124_col5, #T_0502a_row124_col6, #T_0502a_row124_col7, #T_0502a_row124_col8, #T_0502a_row125_col0, #T_0502a_row125_col1, #T_0502a_row125_col2, #T_0502a_row125_col3, #T_0502a_row125_col4, #T_0502a_row125_col5, #T_0502a_row125_col6, #T_0502a_row125_col7, #T_0502a_row125_col8, #T_0502a_row126_col0, #T_0502a_row126_col1, #T_0502a_row126_col2, #T_0502a_row126_col3, #T_0502a_row126_col4, #T_0502a_row126_col5, #T_0502a_row126_col6, #T_0502a_row126_col7, #T_0502a_row126_col8, #T_0502a_row127_col0, #T_0502a_row127_col1, #T_0502a_row127_col2, #T_0502a_row127_col3, #T_0502a_row127_col4, #T_0502a_row127_col5, #T_0502a_row127_col6, #T_0502a_row127_col7, #T_0502a_row127_col8, #T_0502a_row128_col0, #T_0502a_row128_col1, #T_0502a_row128_col2, #T_0502a_row128_col3, #T_0502a_row128_col4, #T_0502a_row128_col5, #T_0502a_row128_col6, #T_0502a_row128_col7, #T_0502a_row128_col8, #T_0502a_row129_col0, #T_0502a_row129_col1, #T_0502a_row129_col2, #T_0502a_row129_col3, #T_0502a_row129_col4, #T_0502a_row129_col5, #T_0502a_row129_col6, 
#T_0502a_row129_col7, #T_0502a_row129_col8, #T_0502a_row130_col0, #T_0502a_row130_col1, #T_0502a_row130_col2, #T_0502a_row130_col3, #T_0502a_row130_col4, #T_0502a_row130_col5, #T_0502a_row130_col6, #T_0502a_row130_col7, #T_0502a_row130_col8, #T_0502a_row131_col0, #T_0502a_row131_col1, #T_0502a_row131_col2, #T_0502a_row131_col3, #T_0502a_row131_col4, #T_0502a_row131_col5, #T_0502a_row131_col6, #T_0502a_row131_col7, #T_0502a_row131_col8, #T_0502a_row132_col0, #T_0502a_row132_col1, #T_0502a_row132_col2, #T_0502a_row132_col3, #T_0502a_row132_col4, #T_0502a_row132_col5, #T_0502a_row132_col6, #T_0502a_row132_col7, #T_0502a_row132_col8, #T_0502a_row133_col0, #T_0502a_row133_col1, #T_0502a_row133_col2, #T_0502a_row133_col3, #T_0502a_row133_col4, #T_0502a_row133_col5, #T_0502a_row133_col6, #T_0502a_row133_col7, #T_0502a_row133_col8, #T_0502a_row134_col0, #T_0502a_row134_col1, #T_0502a_row134_col2, #T_0502a_row134_col3, #T_0502a_row134_col4, #T_0502a_row134_col5, #T_0502a_row134_col6, #T_0502a_row134_col7, #T_0502a_row134_col8, #T_0502a_row135_col0, #T_0502a_row135_col1, #T_0502a_row135_col2, #T_0502a_row135_col3, #T_0502a_row135_col4, #T_0502a_row135_col5, #T_0502a_row135_col6, #T_0502a_row135_col7, #T_0502a_row135_col8, #T_0502a_row136_col0, #T_0502a_row136_col1, #T_0502a_row136_col2, #T_0502a_row136_col3, #T_0502a_row136_col4, #T_0502a_row136_col5, #T_0502a_row136_col6, #T_0502a_row136_col7, #T_0502a_row136_col8, #T_0502a_row137_col0, #T_0502a_row137_col1, #T_0502a_row137_col2, #T_0502a_row137_col3, #T_0502a_row137_col4, #T_0502a_row137_col5, #T_0502a_row137_col6, #T_0502a_row137_col7, #T_0502a_row137_col8, #T_0502a_row138_col0, #T_0502a_row138_col1, #T_0502a_row138_col2, #T_0502a_row138_col3, #T_0502a_row138_col4, #T_0502a_row138_col5, #T_0502a_row138_col6, #T_0502a_row138_col7, #T_0502a_row138_col8, #T_0502a_row139_col0, #T_0502a_row139_col1, #T_0502a_row139_col2, #T_0502a_row139_col3, #T_0502a_row139_col4, #T_0502a_row139_col5, #T_0502a_row139_col6, 
#T_0502a_row139_col7, #T_0502a_row139_col8, #T_0502a_row140_col0, #T_0502a_row140_col1, #T_0502a_row140_col2, #T_0502a_row140_col3, #T_0502a_row140_col4, #T_0502a_row140_col5, #T_0502a_row140_col6, #T_0502a_row140_col7, #T_0502a_row140_col8, #T_0502a_row141_col0, #T_0502a_row141_col1, #T_0502a_row141_col2, #T_0502a_row141_col3, #T_0502a_row141_col4, #T_0502a_row141_col5, #T_0502a_row141_col6, #T_0502a_row141_col7, #T_0502a_row141_col8, #T_0502a_row142_col0, #T_0502a_row142_col1, #T_0502a_row142_col2, #T_0502a_row142_col3, #T_0502a_row142_col4, #T_0502a_row142_col5, #T_0502a_row142_col6, #T_0502a_row142_col7, #T_0502a_row142_col8, #T_0502a_row143_col0, #T_0502a_row143_col1, #T_0502a_row143_col2, #T_0502a_row143_col3, #T_0502a_row143_col4, #T_0502a_row143_col5, #T_0502a_row143_col6, #T_0502a_row143_col7, #T_0502a_row143_col8, #T_0502a_row144_col0, #T_0502a_row144_col1, #T_0502a_row144_col2, #T_0502a_row144_col3, #T_0502a_row144_col4, #T_0502a_row144_col5, #T_0502a_row144_col6, #T_0502a_row144_col7, #T_0502a_row144_col8, #T_0502a_row145_col0, #T_0502a_row145_col1, #T_0502a_row145_col2, #T_0502a_row145_col3, #T_0502a_row145_col4, #T_0502a_row145_col5, #T_0502a_row145_col6, #T_0502a_row145_col7, #T_0502a_row145_col8, #T_0502a_row146_col0, #T_0502a_row146_col1, #T_0502a_row146_col2, #T_0502a_row146_col3, #T_0502a_row146_col4, #T_0502a_row146_col5, #T_0502a_row146_col6, #T_0502a_row146_col7, #T_0502a_row146_col8, #T_0502a_row147_col0, #T_0502a_row147_col1, #T_0502a_row147_col2, #T_0502a_row147_col3, #T_0502a_row147_col4, #T_0502a_row147_col5, #T_0502a_row147_col6, #T_0502a_row147_col7, #T_0502a_row147_col8, #T_0502a_row148_col0, #T_0502a_row148_col1, #T_0502a_row148_col2, #T_0502a_row148_col3, #T_0502a_row148_col4, #T_0502a_row148_col5, #T_0502a_row148_col6, #T_0502a_row148_col7, #T_0502a_row148_col8, #T_0502a_row149_col0, #T_0502a_row149_col1, #T_0502a_row149_col2, #T_0502a_row149_col3, #T_0502a_row149_col4, #T_0502a_row149_col5, #T_0502a_row149_col6, 
#T_0502a_row149_col7, #T_0502a_row149_col8, #T_0502a_row150_col0, #T_0502a_row150_col1, #T_0502a_row150_col2, #T_0502a_row150_col3, #T_0502a_row150_col4, #T_0502a_row150_col5, #T_0502a_row150_col6, #T_0502a_row150_col7, #T_0502a_row150_col8, #T_0502a_row151_col0, #T_0502a_row151_col1, #T_0502a_row151_col2, #T_0502a_row151_col3, #T_0502a_row151_col4, #T_0502a_row151_col5, #T_0502a_row151_col6, #T_0502a_row151_col7, #T_0502a_row151_col8, #T_0502a_row152_col0, #T_0502a_row152_col1, #T_0502a_row152_col2, #T_0502a_row152_col3, #T_0502a_row152_col4, #T_0502a_row152_col5, #T_0502a_row152_col6, #T_0502a_row152_col7, #T_0502a_row152_col8, #T_0502a_row153_col0, #T_0502a_row153_col1, #T_0502a_row153_col2, #T_0502a_row153_col3, #T_0502a_row153_col4, #T_0502a_row153_col5, #T_0502a_row153_col6, #T_0502a_row153_col7, #T_0502a_row153_col8, #T_0502a_row154_col0, #T_0502a_row154_col1, #T_0502a_row154_col2, #T_0502a_row154_col3, #T_0502a_row154_col4, #T_0502a_row154_col5, #T_0502a_row154_col6, #T_0502a_row154_col7, #T_0502a_row154_col8, #T_0502a_row155_col0, #T_0502a_row155_col1, #T_0502a_row155_col2, #T_0502a_row155_col3, #T_0502a_row155_col4, #T_0502a_row155_col5, #T_0502a_row155_col6, #T_0502a_row155_col7, #T_0502a_row155_col8, #T_0502a_row156_col0, #T_0502a_row156_col1, #T_0502a_row156_col2, #T_0502a_row156_col3, #T_0502a_row156_col4, #T_0502a_row156_col5, #T_0502a_row156_col6, #T_0502a_row156_col7, #T_0502a_row156_col8, #T_0502a_row157_col0, #T_0502a_row157_col1, #T_0502a_row157_col2, #T_0502a_row157_col3, #T_0502a_row157_col4, #T_0502a_row157_col5, #T_0502a_row157_col6, #T_0502a_row157_col7, #T_0502a_row157_col8, #T_0502a_row158_col0, #T_0502a_row158_col1, #T_0502a_row158_col2, #T_0502a_row158_col3, #T_0502a_row158_col4, #T_0502a_row158_col5, #T_0502a_row158_col6, #T_0502a_row158_col7, #T_0502a_row158_col8, #T_0502a_row159_col0, #T_0502a_row159_col1, #T_0502a_row159_col2, #T_0502a_row159_col3, #T_0502a_row159_col4, #T_0502a_row159_col5, #T_0502a_row159_col6, 
#T_0502a_row159_col7, #T_0502a_row159_col8, #T_0502a_row160_col0, #T_0502a_row160_col1, #T_0502a_row160_col2, #T_0502a_row160_col3, #T_0502a_row160_col4, #T_0502a_row160_col5, #T_0502a_row160_col6, #T_0502a_row160_col7, #T_0502a_row160_col8, #T_0502a_row161_col0, #T_0502a_row161_col1, #T_0502a_row161_col2, #T_0502a_row161_col3, #T_0502a_row161_col4, #T_0502a_row161_col5, #T_0502a_row161_col6, #T_0502a_row161_col7, #T_0502a_row161_col8, #T_0502a_row162_col0, #T_0502a_row162_col1, #T_0502a_row162_col2, #T_0502a_row162_col3, #T_0502a_row162_col4, #T_0502a_row162_col5, #T_0502a_row162_col6, #T_0502a_row162_col7, #T_0502a_row162_col8, #T_0502a_row163_col0, #T_0502a_row163_col1, #T_0502a_row163_col2, #T_0502a_row163_col3, #T_0502a_row163_col4, #T_0502a_row163_col5, #T_0502a_row163_col6, #T_0502a_row163_col7, #T_0502a_row163_col8, #T_0502a_row164_col0, #T_0502a_row164_col1, #T_0502a_row164_col2, #T_0502a_row164_col3, #T_0502a_row164_col4, #T_0502a_row164_col5, #T_0502a_row164_col6, #T_0502a_row164_col7, #T_0502a_row164_col8, #T_0502a_row165_col0, #T_0502a_row165_col1, #T_0502a_row165_col2, #T_0502a_row165_col3, #T_0502a_row165_col4, #T_0502a_row165_col5, #T_0502a_row165_col6, #T_0502a_row165_col7, #T_0502a_row165_col8, #T_0502a_row166_col0, #T_0502a_row166_col1, #T_0502a_row166_col2, #T_0502a_row166_col3, #T_0502a_row166_col4, #T_0502a_row166_col5, #T_0502a_row166_col6, #T_0502a_row166_col7, #T_0502a_row166_col8, #T_0502a_row167_col0, #T_0502a_row167_col1, #T_0502a_row167_col2, #T_0502a_row167_col3, #T_0502a_row167_col4, #T_0502a_row167_col5, #T_0502a_row167_col6, #T_0502a_row167_col7, #T_0502a_row167_col8, #T_0502a_row168_col0, #T_0502a_row168_col1, #T_0502a_row168_col2, #T_0502a_row168_col3, #T_0502a_row168_col4, #T_0502a_row168_col5, #T_0502a_row168_col6, #T_0502a_row168_col7, #T_0502a_row168_col8, #T_0502a_row169_col0, #T_0502a_row169_col1, #T_0502a_row169_col2, #T_0502a_row169_col3, #T_0502a_row169_col4, #T_0502a_row169_col5, #T_0502a_row169_col6, 
#T_0502a_row169_col7, #T_0502a_row169_col8, #T_0502a_row170_col0, #T_0502a_row170_col1, #T_0502a_row170_col2, #T_0502a_row170_col3, #T_0502a_row170_col4, #T_0502a_row170_col5, #T_0502a_row170_col6, #T_0502a_row170_col7, #T_0502a_row170_col8, #T_0502a_row171_col0, #T_0502a_row171_col1, #T_0502a_row171_col2, #T_0502a_row171_col3, #T_0502a_row171_col4, #T_0502a_row171_col5, #T_0502a_row171_col6, #T_0502a_row171_col7, #T_0502a_row171_col8, #T_0502a_row172_col0, #T_0502a_row172_col1, #T_0502a_row172_col2, #T_0502a_row172_col3, #T_0502a_row172_col4, #T_0502a_row172_col5, #T_0502a_row172_col6, #T_0502a_row172_col7, #T_0502a_row172_col8, #T_0502a_row173_col0, #T_0502a_row173_col1, #T_0502a_row173_col2, #T_0502a_row173_col3, #T_0502a_row173_col4, #T_0502a_row173_col5, #T_0502a_row173_col6, #T_0502a_row173_col7, #T_0502a_row173_col8, #T_0502a_row174_col0, #T_0502a_row174_col1, #T_0502a_row174_col2, #T_0502a_row174_col3, #T_0502a_row174_col4, #T_0502a_row174_col5, #T_0502a_row174_col6, #T_0502a_row174_col7, #T_0502a_row174_col8, #T_0502a_row175_col0, #T_0502a_row175_col1, #T_0502a_row175_col2, #T_0502a_row175_col3, #T_0502a_row175_col4, #T_0502a_row175_col5, #T_0502a_row175_col6, #T_0502a_row175_col7, #T_0502a_row175_col8, #T_0502a_row176_col0, #T_0502a_row176_col1, #T_0502a_row176_col2, #T_0502a_row176_col3, #T_0502a_row176_col4, #T_0502a_row176_col5, #T_0502a_row176_col6, #T_0502a_row176_col7, #T_0502a_row176_col8, #T_0502a_row177_col0, #T_0502a_row177_col1, #T_0502a_row177_col2, #T_0502a_row177_col3, #T_0502a_row177_col4, #T_0502a_row177_col5, #T_0502a_row177_col6, #T_0502a_row177_col7, #T_0502a_row177_col8, #T_0502a_row178_col0, #T_0502a_row178_col1, #T_0502a_row178_col2, #T_0502a_row178_col3, #T_0502a_row178_col4, #T_0502a_row178_col5, #T_0502a_row178_col6, #T_0502a_row178_col7, #T_0502a_row178_col8, #T_0502a_row179_col0, #T_0502a_row179_col1, #T_0502a_row179_col2, #T_0502a_row179_col3, #T_0502a_row179_col4, #T_0502a_row179_col5, #T_0502a_row179_col6, 
#T_0502a_row179_col7, #T_0502a_row179_col8, #T_0502a_row180_col0, #T_0502a_row180_col1, #T_0502a_row180_col2, #T_0502a_row180_col3, #T_0502a_row180_col4, #T_0502a_row180_col5, #T_0502a_row180_col6, #T_0502a_row180_col7, #T_0502a_row180_col8, #T_0502a_row181_col0, #T_0502a_row181_col1, #T_0502a_row181_col2, #T_0502a_row181_col3, #T_0502a_row181_col4, #T_0502a_row181_col5, #T_0502a_row181_col6, #T_0502a_row181_col7, #T_0502a_row181_col8, #T_0502a_row182_col0, #T_0502a_row182_col1, #T_0502a_row182_col2, #T_0502a_row182_col3, #T_0502a_row182_col4, #T_0502a_row182_col5, #T_0502a_row182_col6, #T_0502a_row182_col7, #T_0502a_row182_col8, #T_0502a_row183_col0, #T_0502a_row183_col1, #T_0502a_row183_col2, #T_0502a_row183_col3, #T_0502a_row183_col4, #T_0502a_row183_col5, #T_0502a_row183_col6, #T_0502a_row183_col7, #T_0502a_row183_col8, #T_0502a_row184_col0, #T_0502a_row184_col1, #T_0502a_row184_col2, #T_0502a_row184_col3, #T_0502a_row184_col4, #T_0502a_row184_col5, #T_0502a_row184_col6, #T_0502a_row184_col7, #T_0502a_row184_col8, #T_0502a_row185_col0, #T_0502a_row185_col1, #T_0502a_row185_col2, #T_0502a_row185_col3, #T_0502a_row185_col4, #T_0502a_row185_col5, #T_0502a_row185_col6, #T_0502a_row185_col7, #T_0502a_row185_col8, #T_0502a_row186_col0, #T_0502a_row186_col1, #T_0502a_row186_col2, #T_0502a_row186_col3, #T_0502a_row186_col4, #T_0502a_row186_col5, #T_0502a_row186_col6, #T_0502a_row186_col7, #T_0502a_row186_col8, #T_0502a_row187_col0, #T_0502a_row187_col1, #T_0502a_row187_col2, #T_0502a_row187_col3, #T_0502a_row187_col4, #T_0502a_row187_col5, #T_0502a_row187_col6, #T_0502a_row187_col7, #T_0502a_row187_col8, #T_0502a_row188_col0, #T_0502a_row188_col1, #T_0502a_row188_col2, #T_0502a_row188_col3, #T_0502a_row188_col4, #T_0502a_row188_col5, #T_0502a_row188_col6, #T_0502a_row188_col7, #T_0502a_row188_col8, #T_0502a_row189_col0, #T_0502a_row189_col1, #T_0502a_row189_col2, #T_0502a_row189_col3, #T_0502a_row189_col4, #T_0502a_row189_col5, #T_0502a_row189_col6, 
#T_0502a_row189_col7, #T_0502a_row189_col8, #T_0502a_row190_col0, #T_0502a_row190_col1, #T_0502a_row190_col2, #T_0502a_row190_col3, #T_0502a_row190_col4, #T_0502a_row190_col5, #T_0502a_row190_col6, #T_0502a_row190_col7, #T_0502a_row190_col8, #T_0502a_row191_col0, #T_0502a_row191_col1, #T_0502a_row191_col2, #T_0502a_row191_col3, #T_0502a_row191_col4, #T_0502a_row191_col5, #T_0502a_row191_col6, #T_0502a_row191_col7, #T_0502a_row191_col8, #T_0502a_row192_col0, #T_0502a_row192_col1, #T_0502a_row192_col2, #T_0502a_row192_col3, #T_0502a_row192_col4, #T_0502a_row192_col5, #T_0502a_row192_col6, #T_0502a_row192_col7, #T_0502a_row192_col8, #T_0502a_row193_col0, #T_0502a_row193_col1, #T_0502a_row193_col2, #T_0502a_row193_col3, #T_0502a_row193_col4, #T_0502a_row193_col5, #T_0502a_row193_col6, #T_0502a_row193_col7, #T_0502a_row193_col8, #T_0502a_row194_col0, #T_0502a_row194_col1, #T_0502a_row194_col2, #T_0502a_row194_col3, #T_0502a_row194_col4, #T_0502a_row194_col5, #T_0502a_row194_col6, #T_0502a_row194_col7, #T_0502a_row194_col8 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_0502a\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_0502a_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_0502a_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th id=\"T_0502a_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_0502a_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", + " <th id=\"T_0502a_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", + " <th id=\"T_0502a_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", + " <th id=\"T_0502a_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", + " <th id=\"T_0502a_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", + " <th id=\"T_0502a_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " 
<td id=\"T_0502a_row0_col0\" class=\"data row0 col0\" >validmind.data_validation.ACFandPACFPlot</td>\n", + " <td id=\"T_0502a_row0_col1\" class=\"data row0 col1\" >AC Fand PACF Plot</td>\n", + " <td id=\"T_0502a_row0_col2\" class=\"data row0 col2\" >Analyzes time series data using Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots to...</td>\n", + " <td id=\"T_0502a_row0_col3\" class=\"data row0 col3\" >True</td>\n", + " <td id=\"T_0502a_row0_col4\" class=\"data row0 col4\" >False</td>\n", + " <td id=\"T_0502a_row0_col5\" class=\"data row0 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row0_col6\" class=\"data row0 col6\" >{}</td>\n", + " <td id=\"T_0502a_row0_col7\" class=\"data row0 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'visualization']</td>\n", + " <td id=\"T_0502a_row0_col8\" class=\"data row0 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row1_col0\" class=\"data row1 col0\" >validmind.data_validation.ADF</td>\n", + " <td id=\"T_0502a_row1_col1\" class=\"data row1 col1\" >ADF</td>\n", + " <td id=\"T_0502a_row1_col2\" class=\"data row1 col2\" >Assesses the stationarity of a time series dataset using the Augmented Dickey-Fuller (ADF) test....</td>\n", + " <td id=\"T_0502a_row1_col3\" class=\"data row1 col3\" >False</td>\n", + " <td id=\"T_0502a_row1_col4\" class=\"data row1 col4\" >True</td>\n", + " <td id=\"T_0502a_row1_col5\" class=\"data row1 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row1_col6\" class=\"data row1 col6\" >{}</td>\n", + " <td id=\"T_0502a_row1_col7\" class=\"data row1 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test', 'stationarity']</td>\n", + " <td id=\"T_0502a_row1_col8\" class=\"data row1 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row2_col0\" class=\"data row2 col0\" >validmind.data_validation.AutoAR</td>\n", + " <td id=\"T_0502a_row2_col1\" class=\"data row2 col1\" >Auto 
AR</td>\n", + " <td id=\"T_0502a_row2_col2\" class=\"data row2 col2\" >Automatically identifies the optimal Autoregressive (AR) order for a time series using BIC and AIC criteria....</td>\n", + " <td id=\"T_0502a_row2_col3\" class=\"data row2 col3\" >False</td>\n", + " <td id=\"T_0502a_row2_col4\" class=\"data row2 col4\" >True</td>\n", + " <td id=\"T_0502a_row2_col5\" class=\"data row2 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row2_col6\" class=\"data row2 col6\" >{'max_ar_order': {'type': 'int', 'default': 3}}</td>\n", + " <td id=\"T_0502a_row2_col7\" class=\"data row2 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", + " <td id=\"T_0502a_row2_col8\" class=\"data row2 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row3_col0\" class=\"data row3 col0\" >validmind.data_validation.AutoMA</td>\n", + " <td id=\"T_0502a_row3_col1\" class=\"data row3 col1\" >Auto MA</td>\n", + " <td id=\"T_0502a_row3_col2\" class=\"data row3 col2\" >Automatically selects the optimal Moving Average (MA) order for each variable in a time series dataset based on...</td>\n", + " <td id=\"T_0502a_row3_col3\" class=\"data row3 col3\" >False</td>\n", + " <td id=\"T_0502a_row3_col4\" class=\"data row3 col4\" >True</td>\n", + " <td id=\"T_0502a_row3_col5\" class=\"data row3 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row3_col6\" class=\"data row3 col6\" >{'max_ma_order': {'type': 'int', 'default': 3}}</td>\n", + " <td id=\"T_0502a_row3_col7\" class=\"data row3 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", + " <td id=\"T_0502a_row3_col8\" class=\"data row3 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row4_col0\" class=\"data row4 col0\" >validmind.data_validation.AutoStationarity</td>\n", + " <td id=\"T_0502a_row4_col1\" class=\"data row4 col1\" >Auto Stationarity</td>\n", + " <td id=\"T_0502a_row4_col2\" class=\"data row4 col2\" 
>Automates Augmented Dickey-Fuller test to assess stationarity across multiple time series in a DataFrame....</td>\n", + " <td id=\"T_0502a_row4_col3\" class=\"data row4 col3\" >False</td>\n", + " <td id=\"T_0502a_row4_col4\" class=\"data row4 col4\" >True</td>\n", + " <td id=\"T_0502a_row4_col5\" class=\"data row4 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row4_col6\" class=\"data row4 col6\" >{'max_order': {'type': 'int', 'default': 5}, 'threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_0502a_row4_col7\" class=\"data row4 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", + " <td id=\"T_0502a_row4_col8\" class=\"data row4 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row5_col0\" class=\"data row5 col0\" >validmind.data_validation.BivariateScatterPlots</td>\n", + " <td id=\"T_0502a_row5_col1\" class=\"data row5 col1\" >Bivariate Scatter Plots</td>\n", + " <td id=\"T_0502a_row5_col2\" class=\"data row5 col2\" >Generates bivariate scatterplots to visually inspect relationships between pairs of numerical predictor variables...</td>\n", + " <td id=\"T_0502a_row5_col3\" class=\"data row5 col3\" >True</td>\n", + " <td id=\"T_0502a_row5_col4\" class=\"data row5 col4\" >False</td>\n", + " <td id=\"T_0502a_row5_col5\" class=\"data row5 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row5_col6\" class=\"data row5 col6\" >{}</td>\n", + " <td id=\"T_0502a_row5_col7\" class=\"data row5 col7\" >['tabular_data', 'numerical_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row5_col8\" class=\"data row5 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row6_col0\" class=\"data row6 col0\" >validmind.data_validation.BoxPierce</td>\n", + " <td id=\"T_0502a_row6_col1\" class=\"data row6 col1\" >Box Pierce</td>\n", + " <td id=\"T_0502a_row6_col2\" class=\"data row6 col2\" >Detects autocorrelation in time-series data through the Box-Pierce test to 
validate model performance....</td>\n", + " <td id=\"T_0502a_row6_col3\" class=\"data row6 col3\" >False</td>\n", + " <td id=\"T_0502a_row6_col4\" class=\"data row6 col4\" >True</td>\n", + " <td id=\"T_0502a_row6_col5\" class=\"data row6 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row6_col6\" class=\"data row6 col6\" >{}</td>\n", + " <td id=\"T_0502a_row6_col7\" class=\"data row6 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row6_col8\" class=\"data row6 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row7_col0\" class=\"data row7 col0\" >validmind.data_validation.ChiSquaredFeaturesTable</td>\n", + " <td id=\"T_0502a_row7_col1\" class=\"data row7 col1\" >Chi Squared Features Table</td>\n", + " <td id=\"T_0502a_row7_col2\" class=\"data row7 col2\" >Assesses the statistical association between categorical features and a target variable using the Chi-Squared test....</td>\n", + " <td id=\"T_0502a_row7_col3\" class=\"data row7 col3\" >False</td>\n", + " <td id=\"T_0502a_row7_col4\" class=\"data row7 col4\" >True</td>\n", + " <td id=\"T_0502a_row7_col5\" class=\"data row7 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row7_col6\" class=\"data row7 col6\" >{'p_threshold': {'type': '_empty', 'default': 0.05}}</td>\n", + " <td id=\"T_0502a_row7_col7\" class=\"data row7 col7\" >['tabular_data', 'categorical_data', 'statistical_test']</td>\n", + " <td id=\"T_0502a_row7_col8\" class=\"data row7 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row8_col0\" class=\"data row8 col0\" >validmind.data_validation.ClassImbalance</td>\n", + " <td id=\"T_0502a_row8_col1\" class=\"data row8 col1\" >Class Imbalance</td>\n", + " <td id=\"T_0502a_row8_col2\" class=\"data row8 col2\" >Evaluates and quantifies class distribution imbalance in a dataset used by a machine learning model....</td>\n", + " <td id=\"T_0502a_row8_col3\" class=\"data row8 col3\" 
>True</td>\n", + " <td id=\"T_0502a_row8_col4\" class=\"data row8 col4\" >True</td>\n", + " <td id=\"T_0502a_row8_col5\" class=\"data row8 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row8_col6\" class=\"data row8 col6\" >{'min_percent_threshold': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_0502a_row8_col7\" class=\"data row8 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification', 'data_quality']</td>\n", + " <td id=\"T_0502a_row8_col8\" class=\"data row8 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row9_col0\" class=\"data row9 col0\" >validmind.data_validation.DatasetDescription</td>\n", + " <td id=\"T_0502a_row9_col1\" class=\"data row9 col1\" >Dataset Description</td>\n", + " <td id=\"T_0502a_row9_col2\" class=\"data row9 col2\" >Provides comprehensive analysis and statistical summaries of each column in a machine learning model's dataset....</td>\n", + " <td id=\"T_0502a_row9_col3\" class=\"data row9 col3\" >False</td>\n", + " <td id=\"T_0502a_row9_col4\" class=\"data row9 col4\" >True</td>\n", + " <td id=\"T_0502a_row9_col5\" class=\"data row9 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row9_col6\" class=\"data row9 col6\" >{}</td>\n", + " <td id=\"T_0502a_row9_col7\" class=\"data row9 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", + " <td id=\"T_0502a_row9_col8\" class=\"data row9 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row10_col0\" class=\"data row10 col0\" >validmind.data_validation.DatasetSplit</td>\n", + " <td id=\"T_0502a_row10_col1\" class=\"data row10 col1\" >Dataset Split</td>\n", + " <td id=\"T_0502a_row10_col2\" class=\"data row10 col2\" >Evaluates and visualizes the distribution proportions among training, testing, and validation datasets of an ML...</td>\n", + " <td id=\"T_0502a_row10_col3\" class=\"data row10 col3\" >False</td>\n", + " <td 
id=\"T_0502a_row10_col4\" class=\"data row10 col4\" >True</td>\n", + " <td id=\"T_0502a_row10_col5\" class=\"data row10 col5\" >['datasets']</td>\n", + " <td id=\"T_0502a_row10_col6\" class=\"data row10 col6\" >{}</td>\n", + " <td id=\"T_0502a_row10_col7\" class=\"data row10 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", + " <td id=\"T_0502a_row10_col8\" class=\"data row10 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row11_col0\" class=\"data row11 col0\" >validmind.data_validation.DescriptiveStatistics</td>\n", + " <td id=\"T_0502a_row11_col1\" class=\"data row11 col1\" >Descriptive Statistics</td>\n", + " <td id=\"T_0502a_row11_col2\" class=\"data row11 col2\" >Performs a detailed descriptive statistical analysis of both numerical and categorical data within a model's...</td>\n", + " <td id=\"T_0502a_row11_col3\" class=\"data row11 col3\" >False</td>\n", + " <td id=\"T_0502a_row11_col4\" class=\"data row11 col4\" >True</td>\n", + " <td id=\"T_0502a_row11_col5\" class=\"data row11 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row11_col6\" class=\"data row11 col6\" >{}</td>\n", + " <td id=\"T_0502a_row11_col7\" class=\"data row11 col7\" >['tabular_data', 'time_series_data', 'data_quality']</td>\n", + " <td id=\"T_0502a_row11_col8\" class=\"data row11 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row12_col0\" class=\"data row12 col0\" >validmind.data_validation.DickeyFullerGLS</td>\n", + " <td id=\"T_0502a_row12_col1\" class=\"data row12 col1\" >Dickey Fuller GLS</td>\n", + " <td id=\"T_0502a_row12_col2\" class=\"data row12 col2\" >Assesses stationarity in time series data using the Dickey-Fuller GLS test to determine the order of integration....</td>\n", + " <td id=\"T_0502a_row12_col3\" class=\"data row12 col3\" >False</td>\n", + " <td id=\"T_0502a_row12_col4\" class=\"data row12 col4\" 
>True</td>\n", + " <td id=\"T_0502a_row12_col5\" class=\"data row12 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row12_col6\" class=\"data row12 col6\" >{}</td>\n", + " <td id=\"T_0502a_row12_col7\" class=\"data row12 col7\" >['time_series_data', 'forecasting', 'unit_root_test']</td>\n", + " <td id=\"T_0502a_row12_col8\" class=\"data row12 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row13_col0\" class=\"data row13 col0\" >validmind.data_validation.Duplicates</td>\n", + " <td id=\"T_0502a_row13_col1\" class=\"data row13 col1\" >Duplicates</td>\n", + " <td id=\"T_0502a_row13_col2\" class=\"data row13 col2\" >Tests dataset for duplicate entries, ensuring model reliability via data quality verification....</td>\n", + " <td id=\"T_0502a_row13_col3\" class=\"data row13 col3\" >False</td>\n", + " <td id=\"T_0502a_row13_col4\" class=\"data row13 col4\" >True</td>\n", + " <td id=\"T_0502a_row13_col5\" class=\"data row13 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row13_col6\" class=\"data row13 col6\" >{'min_threshold': {'type': '_empty', 'default': 1}}</td>\n", + " <td id=\"T_0502a_row13_col7\" class=\"data row13 col7\" >['tabular_data', 'data_quality', 'text_data']</td>\n", + " <td id=\"T_0502a_row13_col8\" class=\"data row13 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row14_col0\" class=\"data row14 col0\" >validmind.data_validation.EngleGrangerCoint</td>\n", + " <td id=\"T_0502a_row14_col1\" class=\"data row14 col1\" >Engle Granger Coint</td>\n", + " <td id=\"T_0502a_row14_col2\" class=\"data row14 col2\" >Assesses the degree of co-movement between pairs of time series data using the Engle-Granger cointegration test....</td>\n", + " <td id=\"T_0502a_row14_col3\" class=\"data row14 col3\" >False</td>\n", + " <td id=\"T_0502a_row14_col4\" class=\"data row14 col4\" >True</td>\n", + " <td id=\"T_0502a_row14_col5\" class=\"data row14 col5\" >['dataset']</td>\n", + " <td 
id=\"T_0502a_row14_col6\" class=\"data row14 col6\" >{'threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_0502a_row14_col7\" class=\"data row14 col7\" >['time_series_data', 'statistical_test', 'forecasting']</td>\n", + " <td id=\"T_0502a_row14_col8\" class=\"data row14 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row15_col0\" class=\"data row15 col0\" >validmind.data_validation.FeatureTargetCorrelationPlot</td>\n", + " <td id=\"T_0502a_row15_col1\" class=\"data row15 col1\" >Feature Target Correlation Plot</td>\n", + " <td id=\"T_0502a_row15_col2\" class=\"data row15 col2\" >Visualizes the correlation between input features and the model's target output in a color-coded horizontal bar...</td>\n", + " <td id=\"T_0502a_row15_col3\" class=\"data row15 col3\" >True</td>\n", + " <td id=\"T_0502a_row15_col4\" class=\"data row15 col4\" >False</td>\n", + " <td id=\"T_0502a_row15_col5\" class=\"data row15 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row15_col6\" class=\"data row15 col6\" >{'fig_height': {'type': '_empty', 'default': 600}}</td>\n", + " <td id=\"T_0502a_row15_col7\" class=\"data row15 col7\" >['tabular_data', 'visualization', 'correlation']</td>\n", + " <td id=\"T_0502a_row15_col8\" class=\"data row15 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row16_col0\" class=\"data row16 col0\" >validmind.data_validation.HighCardinality</td>\n", + " <td id=\"T_0502a_row16_col1\" class=\"data row16 col1\" >High Cardinality</td>\n", + " <td id=\"T_0502a_row16_col2\" class=\"data row16 col2\" >Assesses the number of unique values in categorical columns to detect high cardinality and potential overfitting....</td>\n", + " <td id=\"T_0502a_row16_col3\" class=\"data row16 col3\" >False</td>\n", + " <td id=\"T_0502a_row16_col4\" class=\"data row16 col4\" >True</td>\n", + " <td id=\"T_0502a_row16_col5\" class=\"data row16 col5\" >['dataset']</td>\n", + " <td 
id=\"T_0502a_row16_col6\" class=\"data row16 col6\" >{'num_threshold': {'type': 'int', 'default': 100}, 'percent_threshold': {'type': 'float', 'default': 0.1}, 'threshold_type': {'type': 'str', 'default': 'percent'}}</td>\n", + " <td id=\"T_0502a_row16_col7\" class=\"data row16 col7\" >['tabular_data', 'data_quality', 'categorical_data']</td>\n", + " <td id=\"T_0502a_row16_col8\" class=\"data row16 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row17_col0\" class=\"data row17 col0\" >validmind.data_validation.HighPearsonCorrelation</td>\n", + " <td id=\"T_0502a_row17_col1\" class=\"data row17 col1\" >High Pearson Correlation</td>\n", + " <td id=\"T_0502a_row17_col2\" class=\"data row17 col2\" >Identifies highly correlated feature pairs in a dataset suggesting feature redundancy or multicollinearity....</td>\n", + " <td id=\"T_0502a_row17_col3\" class=\"data row17 col3\" >False</td>\n", + " <td id=\"T_0502a_row17_col4\" class=\"data row17 col4\" >True</td>\n", + " <td id=\"T_0502a_row17_col5\" class=\"data row17 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row17_col6\" class=\"data row17 col6\" >{'max_threshold': {'type': 'float', 'default': 0.3}, 'top_n_correlations': {'type': 'int', 'default': 10}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_0502a_row17_col7\" class=\"data row17 col7\" >['tabular_data', 'data_quality', 'correlation']</td>\n", + " <td id=\"T_0502a_row17_col8\" class=\"data row17 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row18_col0\" class=\"data row18 col0\" >validmind.data_validation.IQROutliersBarPlot</td>\n", + " <td id=\"T_0502a_row18_col1\" class=\"data row18 col1\" >IQR Outliers Bar Plot</td>\n", + " <td id=\"T_0502a_row18_col2\" class=\"data row18 col2\" >Visualizes outlier distribution across percentiles in numerical data using the Interquartile Range (IQR) method....</td>\n", + " <td 
id=\"T_0502a_row18_col3\" class=\"data row18 col3\" >True</td>\n", + " <td id=\"T_0502a_row18_col4\" class=\"data row18 col4\" >False</td>\n", + " <td id=\"T_0502a_row18_col5\" class=\"data row18 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row18_col6\" class=\"data row18 col6\" >{'threshold': {'type': 'float', 'default': 1.5}, 'fig_width': {'type': 'int', 'default': 800}}</td>\n", + " <td id=\"T_0502a_row18_col7\" class=\"data row18 col7\" >['tabular_data', 'visualization', 'numerical_data']</td>\n", + " <td id=\"T_0502a_row18_col8\" class=\"data row18 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row19_col0\" class=\"data row19 col0\" >validmind.data_validation.IQROutliersTable</td>\n", + " <td id=\"T_0502a_row19_col1\" class=\"data row19 col1\" >IQR Outliers Table</td>\n", + " <td id=\"T_0502a_row19_col2\" class=\"data row19 col2\" >Determines and summarizes outliers in numerical features using the Interquartile Range method....</td>\n", + " <td id=\"T_0502a_row19_col3\" class=\"data row19 col3\" >False</td>\n", + " <td id=\"T_0502a_row19_col4\" class=\"data row19 col4\" >True</td>\n", + " <td id=\"T_0502a_row19_col5\" class=\"data row19 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row19_col6\" class=\"data row19 col6\" >{'threshold': {'type': 'float', 'default': 1.5}}</td>\n", + " <td id=\"T_0502a_row19_col7\" class=\"data row19 col7\" >['tabular_data', 'numerical_data']</td>\n", + " <td id=\"T_0502a_row19_col8\" class=\"data row19 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row20_col0\" class=\"data row20 col0\" >validmind.data_validation.IsolationForestOutliers</td>\n", + " <td id=\"T_0502a_row20_col1\" class=\"data row20 col1\" >Isolation Forest Outliers</td>\n", + " <td id=\"T_0502a_row20_col2\" class=\"data row20 col2\" >Detects outliers in a dataset using the Isolation Forest algorithm and visualizes results through scatter 
plots....</td>\n", + " <td id=\"T_0502a_row20_col3\" class=\"data row20 col3\" >True</td>\n", + " <td id=\"T_0502a_row20_col4\" class=\"data row20 col4\" >False</td>\n", + " <td id=\"T_0502a_row20_col5\" class=\"data row20 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row20_col6\" class=\"data row20 col6\" >{'random_state': {'type': 'int', 'default': 0}, 'contamination': {'type': 'float', 'default': 0.1}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_0502a_row20_col7\" class=\"data row20 col7\" >['tabular_data', 'anomaly_detection']</td>\n", + " <td id=\"T_0502a_row20_col8\" class=\"data row20 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row21_col0\" class=\"data row21 col0\" >validmind.data_validation.JarqueBera</td>\n", + " <td id=\"T_0502a_row21_col1\" class=\"data row21 col1\" >Jarque Bera</td>\n", + " <td id=\"T_0502a_row21_col2\" class=\"data row21 col2\" >Assesses normality of dataset features in an ML model using the Jarque-Bera test....</td>\n", + " <td id=\"T_0502a_row21_col3\" class=\"data row21 col3\" >False</td>\n", + " <td id=\"T_0502a_row21_col4\" class=\"data row21 col4\" >True</td>\n", + " <td id=\"T_0502a_row21_col5\" class=\"data row21 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row21_col6\" class=\"data row21 col6\" >{}</td>\n", + " <td id=\"T_0502a_row21_col7\" class=\"data row21 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row21_col8\" class=\"data row21 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row22_col0\" class=\"data row22 col0\" >validmind.data_validation.KPSS</td>\n", + " <td id=\"T_0502a_row22_col1\" class=\"data row22 col1\" >KPSS</td>\n", + " <td id=\"T_0502a_row22_col2\" class=\"data row22 col2\" >Assesses the stationarity of time-series data in a machine learning model using the KPSS unit root test....</td>\n", + " <td 
id=\"T_0502a_row22_col3\" class=\"data row22 col3\" >False</td>\n", + " <td id=\"T_0502a_row22_col4\" class=\"data row22 col4\" >True</td>\n", + " <td id=\"T_0502a_row22_col5\" class=\"data row22 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row22_col6\" class=\"data row22 col6\" >{}</td>\n", + " <td id=\"T_0502a_row22_col7\" class=\"data row22 col7\" >['time_series_data', 'stationarity', 'unit_root_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row22_col8\" class=\"data row22 col8\" >['data_validation']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row23_col0\" class=\"data row23 col0\" >validmind.data_validation.LJungBox</td>\n", + " <td id=\"T_0502a_row23_col1\" class=\"data row23 col1\" >L Jung Box</td>\n", + " <td id=\"T_0502a_row23_col2\" class=\"data row23 col2\" >Assesses autocorrelations in dataset features by performing a Ljung-Box test on each feature....</td>\n", + " <td id=\"T_0502a_row23_col3\" class=\"data row23 col3\" >False</td>\n", + " <td id=\"T_0502a_row23_col4\" class=\"data row23 col4\" >True</td>\n", + " <td id=\"T_0502a_row23_col5\" class=\"data row23 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row23_col6\" class=\"data row23 col6\" >{}</td>\n", + " <td id=\"T_0502a_row23_col7\" class=\"data row23 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row23_col8\" class=\"data row23 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row24_col0\" class=\"data row24 col0\" >validmind.data_validation.LaggedCorrelationHeatmap</td>\n", + " <td id=\"T_0502a_row24_col1\" class=\"data row24 col1\" >Lagged Correlation Heatmap</td>\n", + " <td id=\"T_0502a_row24_col2\" class=\"data row24 col2\" >Assesses and visualizes correlation between target variable and lagged independent variables in a time-series...</td>\n", + " <td id=\"T_0502a_row24_col3\" class=\"data row24 col3\" >True</td>\n", + " <td id=\"T_0502a_row24_col4\" class=\"data row24 
col4\" >False</td>\n", + " <td id=\"T_0502a_row24_col5\" class=\"data row24 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row24_col6\" class=\"data row24 col6\" >{'num_lags': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_0502a_row24_col7\" class=\"data row24 col7\" >['time_series_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row24_col8\" class=\"data row24 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row25_col0\" class=\"data row25 col0\" >validmind.data_validation.MissingValues</td>\n", + " <td id=\"T_0502a_row25_col1\" class=\"data row25 col1\" >Missing Values</td>\n", + " <td id=\"T_0502a_row25_col2\" class=\"data row25 col2\" >Evaluates dataset quality by ensuring missing value ratio across all features does not exceed a set threshold....</td>\n", + " <td id=\"T_0502a_row25_col3\" class=\"data row25 col3\" >False</td>\n", + " <td id=\"T_0502a_row25_col4\" class=\"data row25 col4\" >True</td>\n", + " <td id=\"T_0502a_row25_col5\" class=\"data row25 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row25_col6\" class=\"data row25 col6\" >{'min_threshold': {'type': 'int', 'default': 1}}</td>\n", + " <td id=\"T_0502a_row25_col7\" class=\"data row25 col7\" >['tabular_data', 'data_quality']</td>\n", + " <td id=\"T_0502a_row25_col8\" class=\"data row25 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row26_col0\" class=\"data row26 col0\" >validmind.data_validation.MissingValuesBarPlot</td>\n", + " <td id=\"T_0502a_row26_col1\" class=\"data row26 col1\" >Missing Values Bar Plot</td>\n", + " <td id=\"T_0502a_row26_col2\" class=\"data row26 col2\" >Assesses the percentage and distribution of missing values in the dataset via a bar plot, with emphasis on...</td>\n", + " <td id=\"T_0502a_row26_col3\" class=\"data row26 col3\" >True</td>\n", + " <td id=\"T_0502a_row26_col4\" class=\"data row26 col4\" >False</td>\n", + " <td id=\"T_0502a_row26_col5\" class=\"data row26 
col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row26_col6\" class=\"data row26 col6\" >{'threshold': {'type': 'int', 'default': 80}, 'fig_height': {'type': 'int', 'default': 600}}</td>\n", + " <td id=\"T_0502a_row26_col7\" class=\"data row26 col7\" >['tabular_data', 'data_quality', 'visualization']</td>\n", + " <td id=\"T_0502a_row26_col8\" class=\"data row26 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row27_col0\" class=\"data row27 col0\" >validmind.data_validation.MutualInformation</td>\n", + " <td id=\"T_0502a_row27_col1\" class=\"data row27 col1\" >Mutual Information</td>\n", + " <td id=\"T_0502a_row27_col2\" class=\"data row27 col2\" >Calculates mutual information scores between features and target variable to evaluate feature relevance....</td>\n", + " <td id=\"T_0502a_row27_col3\" class=\"data row27 col3\" >True</td>\n", + " <td id=\"T_0502a_row27_col4\" class=\"data row27 col4\" >False</td>\n", + " <td id=\"T_0502a_row27_col5\" class=\"data row27 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row27_col6\" class=\"data row27 col6\" >{'min_threshold': {'type': 'float', 'default': 0.01}, 'task': {'type': 'str', 'default': 'classification'}}</td>\n", + " <td id=\"T_0502a_row27_col7\" class=\"data row27 col7\" >['feature_selection', 'data_analysis']</td>\n", + " <td id=\"T_0502a_row27_col8\" class=\"data row27 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row28_col0\" class=\"data row28 col0\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", + " <td id=\"T_0502a_row28_col1\" class=\"data row28 col1\" >Pearson Correlation Matrix</td>\n", + " <td id=\"T_0502a_row28_col2\" class=\"data row28 col2\" >Evaluates linear dependency between numerical variables in a dataset via a Pearson Correlation coefficient heat map....</td>\n", + " <td id=\"T_0502a_row28_col3\" class=\"data row28 col3\" >True</td>\n", + " <td id=\"T_0502a_row28_col4\" 
class=\"data row28 col4\" >False</td>\n", + " <td id=\"T_0502a_row28_col5\" class=\"data row28 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row28_col6\" class=\"data row28 col6\" >{}</td>\n", + " <td id=\"T_0502a_row28_col7\" class=\"data row28 col7\" >['tabular_data', 'numerical_data', 'correlation']</td>\n", + " <td id=\"T_0502a_row28_col8\" class=\"data row28 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row29_col0\" class=\"data row29 col0\" >validmind.data_validation.PhillipsPerronArch</td>\n", + " <td id=\"T_0502a_row29_col1\" class=\"data row29 col1\" >Phillips Perron Arch</td>\n", + " <td id=\"T_0502a_row29_col2\" class=\"data row29 col2\" >Assesses the stationarity of time series data in each feature of the ML model using the Phillips-Perron test....</td>\n", + " <td id=\"T_0502a_row29_col3\" class=\"data row29 col3\" >False</td>\n", + " <td id=\"T_0502a_row29_col4\" class=\"data row29 col4\" >True</td>\n", + " <td id=\"T_0502a_row29_col5\" class=\"data row29 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row29_col6\" class=\"data row29 col6\" >{}</td>\n", + " <td id=\"T_0502a_row29_col7\" class=\"data row29 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'unit_root_test']</td>\n", + " <td id=\"T_0502a_row29_col8\" class=\"data row29 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row30_col0\" class=\"data row30 col0\" >validmind.data_validation.ProtectedClassesDescription</td>\n", + " <td id=\"T_0502a_row30_col1\" class=\"data row30 col1\" >Protected Classes Description</td>\n", + " <td id=\"T_0502a_row30_col2\" class=\"data row30 col2\" >Visualizes the distribution of protected classes in the dataset relative to the target variable...</td>\n", + " <td id=\"T_0502a_row30_col3\" class=\"data row30 col3\" >True</td>\n", + " <td id=\"T_0502a_row30_col4\" class=\"data row30 col4\" >True</td>\n", + " <td id=\"T_0502a_row30_col5\" class=\"data row30 
col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row30_col6\" class=\"data row30 col6\" >{'protected_classes': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row30_col7\" class=\"data row30 col7\" >['bias_and_fairness', 'descriptive_statistics']</td>\n", + " <td id=\"T_0502a_row30_col8\" class=\"data row30 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row31_col0\" class=\"data row31 col0\" >validmind.data_validation.RollingStatsPlot</td>\n", + " <td id=\"T_0502a_row31_col1\" class=\"data row31 col1\" >Rolling Stats Plot</td>\n", + " <td id=\"T_0502a_row31_col2\" class=\"data row31 col2\" >Evaluates the stationarity of time series data by plotting its rolling mean and standard deviation over a specified...</td>\n", + " <td id=\"T_0502a_row31_col3\" class=\"data row31 col3\" >True</td>\n", + " <td id=\"T_0502a_row31_col4\" class=\"data row31 col4\" >False</td>\n", + " <td id=\"T_0502a_row31_col5\" class=\"data row31 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row31_col6\" class=\"data row31 col6\" >{'window_size': {'type': 'int', 'default': 12}}</td>\n", + " <td id=\"T_0502a_row31_col7\" class=\"data row31 col7\" >['time_series_data', 'visualization', 'stationarity']</td>\n", + " <td id=\"T_0502a_row31_col8\" class=\"data row31 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row32_col0\" class=\"data row32 col0\" >validmind.data_validation.RunsTest</td>\n", + " <td id=\"T_0502a_row32_col1\" class=\"data row32 col1\" >Runs Test</td>\n", + " <td id=\"T_0502a_row32_col2\" class=\"data row32 col2\" >Executes Runs Test on ML model to detect non-random patterns in output data sequence....</td>\n", + " <td id=\"T_0502a_row32_col3\" class=\"data row32 col3\" >False</td>\n", + " <td id=\"T_0502a_row32_col4\" class=\"data row32 col4\" >True</td>\n", + " <td id=\"T_0502a_row32_col5\" class=\"data row32 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row32_col6\" 
class=\"data row32 col6\" >{}</td>\n", + " <td id=\"T_0502a_row32_col7\" class=\"data row32 col7\" >['tabular_data', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row32_col8\" class=\"data row32 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row33_col0\" class=\"data row33 col0\" >validmind.data_validation.ScatterPlot</td>\n", + " <td id=\"T_0502a_row33_col1\" class=\"data row33 col1\" >Scatter Plot</td>\n", + " <td id=\"T_0502a_row33_col2\" class=\"data row33 col2\" >Assesses visual relationships, patterns, and outliers among features in a dataset through scatter plot matrices....</td>\n", + " <td id=\"T_0502a_row33_col3\" class=\"data row33 col3\" >True</td>\n", + " <td id=\"T_0502a_row33_col4\" class=\"data row33 col4\" >False</td>\n", + " <td id=\"T_0502a_row33_col5\" class=\"data row33 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row33_col6\" class=\"data row33 col6\" >{}</td>\n", + " <td id=\"T_0502a_row33_col7\" class=\"data row33 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row33_col8\" class=\"data row33 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row34_col0\" class=\"data row34 col0\" >validmind.data_validation.ScoreBandDefaultRates</td>\n", + " <td id=\"T_0502a_row34_col1\" class=\"data row34 col1\" >Score Band Default Rates</td>\n", + " <td id=\"T_0502a_row34_col2\" class=\"data row34 col2\" >Analyzes default rates and population distribution across credit score bands....</td>\n", + " <td id=\"T_0502a_row34_col3\" class=\"data row34 col3\" >False</td>\n", + " <td id=\"T_0502a_row34_col4\" class=\"data row34 col4\" >True</td>\n", + " <td id=\"T_0502a_row34_col5\" class=\"data row34 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row34_col6\" class=\"data row34 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}}</td>\n", + " <td 
id=\"T_0502a_row34_col7\" class=\"data row34 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", + " <td id=\"T_0502a_row34_col8\" class=\"data row34 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row35_col0\" class=\"data row35 col0\" >validmind.data_validation.SeasonalDecompose</td>\n", + " <td id=\"T_0502a_row35_col1\" class=\"data row35 col1\" >Seasonal Decompose</td>\n", + " <td id=\"T_0502a_row35_col2\" class=\"data row35 col2\" >Assesses patterns and seasonality in a time series dataset by decomposing its features into foundational components....</td>\n", + " <td id=\"T_0502a_row35_col3\" class=\"data row35 col3\" >True</td>\n", + " <td id=\"T_0502a_row35_col4\" class=\"data row35 col4\" >False</td>\n", + " <td id=\"T_0502a_row35_col5\" class=\"data row35 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row35_col6\" class=\"data row35 col6\" >{'seasonal_model': {'type': 'str', 'default': 'additive'}}</td>\n", + " <td id=\"T_0502a_row35_col7\" class=\"data row35 col7\" >['time_series_data', 'seasonality', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row35_col8\" class=\"data row35 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row36_col0\" class=\"data row36 col0\" >validmind.data_validation.ShapiroWilk</td>\n", + " <td id=\"T_0502a_row36_col1\" class=\"data row36 col1\" >Shapiro Wilk</td>\n", + " <td id=\"T_0502a_row36_col2\" class=\"data row36 col2\" >Evaluates feature-wise normality of training data using the Shapiro-Wilk test....</td>\n", + " <td id=\"T_0502a_row36_col3\" class=\"data row36 col3\" >False</td>\n", + " <td id=\"T_0502a_row36_col4\" class=\"data row36 col4\" >True</td>\n", + " <td id=\"T_0502a_row36_col5\" class=\"data row36 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row36_col6\" class=\"data row36 col6\" >{}</td>\n", + " <td id=\"T_0502a_row36_col7\" class=\"data row36 col7\" >['tabular_data', 'data_distribution', 'statistical_test']</td>\n", + " <td 
id=\"T_0502a_row36_col8\" class=\"data row36 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row37_col0\" class=\"data row37 col0\" >validmind.data_validation.Skewness</td>\n", + " <td id=\"T_0502a_row37_col1\" class=\"data row37 col1\" >Skewness</td>\n", + " <td id=\"T_0502a_row37_col2\" class=\"data row37 col2\" >Evaluates the skewness of numerical data in a dataset to check against a defined threshold, aiming to ensure data...</td>\n", + " <td id=\"T_0502a_row37_col3\" class=\"data row37 col3\" >False</td>\n", + " <td id=\"T_0502a_row37_col4\" class=\"data row37 col4\" >True</td>\n", + " <td id=\"T_0502a_row37_col5\" class=\"data row37 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row37_col6\" class=\"data row37 col6\" >{'max_threshold': {'type': '_empty', 'default': 1}}</td>\n", + " <td id=\"T_0502a_row37_col7\" class=\"data row37 col7\" >['data_quality', 'tabular_data']</td>\n", + " <td id=\"T_0502a_row37_col8\" class=\"data row37 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row38_col0\" class=\"data row38 col0\" >validmind.data_validation.SpreadPlot</td>\n", + " <td id=\"T_0502a_row38_col1\" class=\"data row38 col1\" >Spread Plot</td>\n", + " <td id=\"T_0502a_row38_col2\" class=\"data row38 col2\" >Assesses potential correlations between pairs of time series variables through visualization to enhance...</td>\n", + " <td id=\"T_0502a_row38_col3\" class=\"data row38 col3\" >True</td>\n", + " <td id=\"T_0502a_row38_col4\" class=\"data row38 col4\" >False</td>\n", + " <td id=\"T_0502a_row38_col5\" class=\"data row38 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row38_col6\" class=\"data row38 col6\" >{}</td>\n", + " <td id=\"T_0502a_row38_col7\" class=\"data row38 col7\" >['time_series_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row38_col8\" class=\"data row38 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td 
id=\"T_0502a_row39_col0\" class=\"data row39 col0\" >validmind.data_validation.TabularCategoricalBarPlots</td>\n", + " <td id=\"T_0502a_row39_col1\" class=\"data row39 col1\" >Tabular Categorical Bar Plots</td>\n", + " <td id=\"T_0502a_row39_col2\" class=\"data row39 col2\" >Generates and visualizes bar plots for each category in categorical features to evaluate the dataset's composition....</td>\n", + " <td id=\"T_0502a_row39_col3\" class=\"data row39 col3\" >True</td>\n", + " <td id=\"T_0502a_row39_col4\" class=\"data row39 col4\" >False</td>\n", + " <td id=\"T_0502a_row39_col5\" class=\"data row39 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row39_col6\" class=\"data row39 col6\" >{}</td>\n", + " <td id=\"T_0502a_row39_col7\" class=\"data row39 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row39_col8\" class=\"data row39 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row40_col0\" class=\"data row40 col0\" >validmind.data_validation.TabularDateTimeHistograms</td>\n", + " <td id=\"T_0502a_row40_col1\" class=\"data row40 col1\" >Tabular Date Time Histograms</td>\n", + " <td id=\"T_0502a_row40_col2\" class=\"data row40 col2\" >Generates histograms to provide graphical insight into the distribution of time intervals in a model's datetime...</td>\n", + " <td id=\"T_0502a_row40_col3\" class=\"data row40 col3\" >True</td>\n", + " <td id=\"T_0502a_row40_col4\" class=\"data row40 col4\" >False</td>\n", + " <td id=\"T_0502a_row40_col5\" class=\"data row40 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row40_col6\" class=\"data row40 col6\" >{}</td>\n", + " <td id=\"T_0502a_row40_col7\" class=\"data row40 col7\" >['time_series_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row40_col8\" class=\"data row40 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row41_col0\" class=\"data row41 col0\" 
>validmind.data_validation.TabularDescriptionTables</td>\n", + " <td id=\"T_0502a_row41_col1\" class=\"data row41 col1\" >Tabular Description Tables</td>\n", + " <td id=\"T_0502a_row41_col2\" class=\"data row41 col2\" >Summarizes key descriptive statistics for numerical, categorical, and datetime variables in a dataset....</td>\n", + " <td id=\"T_0502a_row41_col3\" class=\"data row41 col3\" >False</td>\n", + " <td id=\"T_0502a_row41_col4\" class=\"data row41 col4\" >True</td>\n", + " <td id=\"T_0502a_row41_col5\" class=\"data row41 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row41_col6\" class=\"data row41 col6\" >{}</td>\n", + " <td id=\"T_0502a_row41_col7\" class=\"data row41 col7\" >['tabular_data']</td>\n", + " <td id=\"T_0502a_row41_col8\" class=\"data row41 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row42_col0\" class=\"data row42 col0\" >validmind.data_validation.TabularNumericalHistograms</td>\n", + " <td id=\"T_0502a_row42_col1\" class=\"data row42 col1\" >Tabular Numerical Histograms</td>\n", + " <td id=\"T_0502a_row42_col2\" class=\"data row42 col2\" >Generates histograms for each numerical feature in a dataset to provide visual insights into data distribution and...</td>\n", + " <td id=\"T_0502a_row42_col3\" class=\"data row42 col3\" >True</td>\n", + " <td id=\"T_0502a_row42_col4\" class=\"data row42 col4\" >False</td>\n", + " <td id=\"T_0502a_row42_col5\" class=\"data row42 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row42_col6\" class=\"data row42 col6\" >{}</td>\n", + " <td id=\"T_0502a_row42_col7\" class=\"data row42 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row42_col8\" class=\"data row42 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row43_col0\" class=\"data row43 col0\" >validmind.data_validation.TargetRateBarPlots</td>\n", + " <td id=\"T_0502a_row43_col1\" class=\"data row43 col1\" >Target Rate Bar 
Plots</td>\n", + " <td id=\"T_0502a_row43_col2\" class=\"data row43 col2\" >Generates bar plots visualizing the default rates of categorical features for a classification machine learning...</td>\n", + " <td id=\"T_0502a_row43_col3\" class=\"data row43 col3\" >True</td>\n", + " <td id=\"T_0502a_row43_col4\" class=\"data row43 col4\" >False</td>\n", + " <td id=\"T_0502a_row43_col5\" class=\"data row43 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row43_col6\" class=\"data row43 col6\" >{}</td>\n", + " <td id=\"T_0502a_row43_col7\" class=\"data row43 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", + " <td id=\"T_0502a_row43_col8\" class=\"data row43 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row44_col0\" class=\"data row44 col0\" >validmind.data_validation.TimeSeriesDescription</td>\n", + " <td id=\"T_0502a_row44_col1\" class=\"data row44 col1\" >Time Series Description</td>\n", + " <td id=\"T_0502a_row44_col2\" class=\"data row44 col2\" >Generates a detailed analysis for the provided time series dataset, summarizing key statistics to identify trends,...</td>\n", + " <td id=\"T_0502a_row44_col3\" class=\"data row44 col3\" >False</td>\n", + " <td id=\"T_0502a_row44_col4\" class=\"data row44 col4\" >True</td>\n", + " <td id=\"T_0502a_row44_col5\" class=\"data row44 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row44_col6\" class=\"data row44 col6\" >{}</td>\n", + " <td id=\"T_0502a_row44_col7\" class=\"data row44 col7\" >['time_series_data', 'analysis']</td>\n", + " <td id=\"T_0502a_row44_col8\" class=\"data row44 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row45_col0\" class=\"data row45 col0\" >validmind.data_validation.TimeSeriesDescriptiveStatistics</td>\n", + " <td id=\"T_0502a_row45_col1\" class=\"data row45 col1\" >Time Series Descriptive Statistics</td>\n", + " <td id=\"T_0502a_row45_col2\" class=\"data row45 col2\" >Evaluates the descriptive statistics 
of a time series dataset to identify trends, patterns, and data quality issues....</td>\n", + " <td id=\"T_0502a_row45_col3\" class=\"data row45 col3\" >False</td>\n", + " <td id=\"T_0502a_row45_col4\" class=\"data row45 col4\" >True</td>\n", + " <td id=\"T_0502a_row45_col5\" class=\"data row45 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row45_col6\" class=\"data row45 col6\" >{}</td>\n", + " <td id=\"T_0502a_row45_col7\" class=\"data row45 col7\" >['time_series_data', 'analysis']</td>\n", + " <td id=\"T_0502a_row45_col8\" class=\"data row45 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row46_col0\" class=\"data row46 col0\" >validmind.data_validation.TimeSeriesFrequency</td>\n", + " <td id=\"T_0502a_row46_col1\" class=\"data row46 col1\" >Time Series Frequency</td>\n", + " <td id=\"T_0502a_row46_col2\" class=\"data row46 col2\" >Evaluates consistency of time series data frequency and generates a frequency plot....</td>\n", + " <td id=\"T_0502a_row46_col3\" class=\"data row46 col3\" >True</td>\n", + " <td id=\"T_0502a_row46_col4\" class=\"data row46 col4\" >True</td>\n", + " <td id=\"T_0502a_row46_col5\" class=\"data row46 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row46_col6\" class=\"data row46 col6\" >{}</td>\n", + " <td id=\"T_0502a_row46_col7\" class=\"data row46 col7\" >['time_series_data']</td>\n", + " <td id=\"T_0502a_row46_col8\" class=\"data row46 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row47_col0\" class=\"data row47 col0\" >validmind.data_validation.TimeSeriesHistogram</td>\n", + " <td id=\"T_0502a_row47_col1\" class=\"data row47 col1\" >Time Series Histogram</td>\n", + " <td id=\"T_0502a_row47_col2\" class=\"data row47 col2\" >Visualizes distribution of time-series data using histograms and Kernel Density Estimation (KDE) lines....</td>\n", + " <td id=\"T_0502a_row47_col3\" class=\"data row47 col3\" >True</td>\n", + " <td id=\"T_0502a_row47_col4\" class=\"data row47 
col4\" >False</td>\n", + " <td id=\"T_0502a_row47_col5\" class=\"data row47 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row47_col6\" class=\"data row47 col6\" >{'nbins': {'type': '_empty', 'default': 30}}</td>\n", + " <td id=\"T_0502a_row47_col7\" class=\"data row47 col7\" >['data_validation', 'visualization', 'time_series_data']</td>\n", + " <td id=\"T_0502a_row47_col8\" class=\"data row47 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row48_col0\" class=\"data row48 col0\" >validmind.data_validation.TimeSeriesLinePlot</td>\n", + " <td id=\"T_0502a_row48_col1\" class=\"data row48 col1\" >Time Series Line Plot</td>\n", + " <td id=\"T_0502a_row48_col2\" class=\"data row48 col2\" >Generates and analyses time-series data through line plots revealing trends, patterns, anomalies over time....</td>\n", + " <td id=\"T_0502a_row48_col3\" class=\"data row48 col3\" >True</td>\n", + " <td id=\"T_0502a_row48_col4\" class=\"data row48 col4\" >False</td>\n", + " <td id=\"T_0502a_row48_col5\" class=\"data row48 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row48_col6\" class=\"data row48 col6\" >{}</td>\n", + " <td id=\"T_0502a_row48_col7\" class=\"data row48 col7\" >['time_series_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row48_col8\" class=\"data row48 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row49_col0\" class=\"data row49 col0\" >validmind.data_validation.TimeSeriesMissingValues</td>\n", + " <td id=\"T_0502a_row49_col1\" class=\"data row49 col1\" >Time Series Missing Values</td>\n", + " <td id=\"T_0502a_row49_col2\" class=\"data row49 col2\" >Validates time-series data quality by confirming the count of missing values is below a certain threshold....</td>\n", + " <td id=\"T_0502a_row49_col3\" class=\"data row49 col3\" >True</td>\n", + " <td id=\"T_0502a_row49_col4\" class=\"data row49 col4\" >True</td>\n", + " <td id=\"T_0502a_row49_col5\" class=\"data row49 
col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row49_col6\" class=\"data row49 col6\" >{'min_threshold': {'type': 'int', 'default': 1}}</td>\n", + " <td id=\"T_0502a_row49_col7\" class=\"data row49 col7\" >['time_series_data']</td>\n", + " <td id=\"T_0502a_row49_col8\" class=\"data row49 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row50_col0\" class=\"data row50 col0\" >validmind.data_validation.TimeSeriesOutliers</td>\n", + " <td id=\"T_0502a_row50_col1\" class=\"data row50 col1\" >Time Series Outliers</td>\n", + " <td id=\"T_0502a_row50_col2\" class=\"data row50 col2\" >Identifies and visualizes outliers in time-series data using the z-score method....</td>\n", + " <td id=\"T_0502a_row50_col3\" class=\"data row50 col3\" >False</td>\n", + " <td id=\"T_0502a_row50_col4\" class=\"data row50 col4\" >True</td>\n", + " <td id=\"T_0502a_row50_col5\" class=\"data row50 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row50_col6\" class=\"data row50 col6\" >{'zscore_threshold': {'type': 'int', 'default': 3}}</td>\n", + " <td id=\"T_0502a_row50_col7\" class=\"data row50 col7\" >['time_series_data']</td>\n", + " <td id=\"T_0502a_row50_col8\" class=\"data row50 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row51_col0\" class=\"data row51 col0\" >validmind.data_validation.TooManyZeroValues</td>\n", + " <td id=\"T_0502a_row51_col1\" class=\"data row51 col1\" >Too Many Zero Values</td>\n", + " <td id=\"T_0502a_row51_col2\" class=\"data row51 col2\" >Identifies numerical columns in a dataset that contain an excessive number of zero values, defined by a threshold...</td>\n", + " <td id=\"T_0502a_row51_col3\" class=\"data row51 col3\" >False</td>\n", + " <td id=\"T_0502a_row51_col4\" class=\"data row51 col4\" >True</td>\n", + " <td id=\"T_0502a_row51_col5\" class=\"data row51 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row51_col6\" class=\"data row51 col6\" >{'max_percent_threshold': {'type': 'float', 
'default': 0.03}}</td>\n", + " <td id=\"T_0502a_row51_col7\" class=\"data row51 col7\" >['tabular_data']</td>\n", + " <td id=\"T_0502a_row51_col8\" class=\"data row51 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row52_col0\" class=\"data row52 col0\" >validmind.data_validation.UniqueRows</td>\n", + " <td id=\"T_0502a_row52_col1\" class=\"data row52 col1\" >Unique Rows</td>\n", + " <td id=\"T_0502a_row52_col2\" class=\"data row52 col2\" >Verifies the diversity of the dataset by ensuring that the count of unique rows exceeds a prescribed threshold....</td>\n", + " <td id=\"T_0502a_row52_col3\" class=\"data row52 col3\" >False</td>\n", + " <td id=\"T_0502a_row52_col4\" class=\"data row52 col4\" >True</td>\n", + " <td id=\"T_0502a_row52_col5\" class=\"data row52 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row52_col6\" class=\"data row52 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 1}}</td>\n", + " <td id=\"T_0502a_row52_col7\" class=\"data row52 col7\" >['tabular_data']</td>\n", + " <td id=\"T_0502a_row52_col8\" class=\"data row52 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row53_col0\" class=\"data row53 col0\" >validmind.data_validation.WOEBinPlots</td>\n", + " <td id=\"T_0502a_row53_col1\" class=\"data row53 col1\" >WOE Bin Plots</td>\n", + " <td id=\"T_0502a_row53_col2\" class=\"data row53 col2\" >Generates visualizations of Weight of Evidence (WoE) and Information Value (IV) for understanding predictive power...</td>\n", + " <td id=\"T_0502a_row53_col3\" class=\"data row53 col3\" >True</td>\n", + " <td id=\"T_0502a_row53_col4\" class=\"data row53 col4\" >False</td>\n", + " <td id=\"T_0502a_row53_col5\" class=\"data row53 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row53_col6\" class=\"data row53 col6\" >{'breaks_adj': {'type': 'list', 'default': None}, 'fig_height': {'type': 'int', 'default': 600}, 'fig_width': {'type': 'int', 
'default': 500}}</td>\n", + " <td id=\"T_0502a_row53_col7\" class=\"data row53 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", + " <td id=\"T_0502a_row53_col8\" class=\"data row53 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row54_col0\" class=\"data row54 col0\" >validmind.data_validation.WOEBinTable</td>\n", + " <td id=\"T_0502a_row54_col1\" class=\"data row54 col1\" >WOE Bin Table</td>\n", + " <td id=\"T_0502a_row54_col2\" class=\"data row54 col2\" >Assesses the Weight of Evidence (WoE) and Information Value (IV) of each feature to evaluate its predictive power...</td>\n", + " <td id=\"T_0502a_row54_col3\" class=\"data row54 col3\" >False</td>\n", + " <td id=\"T_0502a_row54_col4\" class=\"data row54 col4\" >True</td>\n", + " <td id=\"T_0502a_row54_col5\" class=\"data row54 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row54_col6\" class=\"data row54 col6\" >{'breaks_adj': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_0502a_row54_col7\" class=\"data row54 col7\" >['tabular_data', 'categorical_data']</td>\n", + " <td id=\"T_0502a_row54_col8\" class=\"data row54 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row55_col0\" class=\"data row55 col0\" >validmind.data_validation.ZivotAndrewsArch</td>\n", + " <td id=\"T_0502a_row55_col1\" class=\"data row55 col1\" >Zivot Andrews Arch</td>\n", + " <td id=\"T_0502a_row55_col2\" class=\"data row55 col2\" >Evaluates the order of integration and stationarity of time series data using the Zivot-Andrews unit root test....</td>\n", + " <td id=\"T_0502a_row55_col3\" class=\"data row55 col3\" >False</td>\n", + " <td id=\"T_0502a_row55_col4\" class=\"data row55 col4\" >True</td>\n", + " <td id=\"T_0502a_row55_col5\" class=\"data row55 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row55_col6\" class=\"data row55 col6\" >{}</td>\n", + " <td id=\"T_0502a_row55_col7\" class=\"data row55 col7\" >['time_series_data', 
'stationarity', 'unit_root_test']</td>\n", + " <td id=\"T_0502a_row55_col8\" class=\"data row55 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row56_col0\" class=\"data row56 col0\" >validmind.data_validation.nlp.CommonWords</td>\n", + " <td id=\"T_0502a_row56_col1\" class=\"data row56 col1\" >Common Words</td>\n", + " <td id=\"T_0502a_row56_col2\" class=\"data row56 col2\" >Assesses the most frequent non-stopwords in a text column for identifying prevalent language patterns....</td>\n", + " <td id=\"T_0502a_row56_col3\" class=\"data row56 col3\" >True</td>\n", + " <td id=\"T_0502a_row56_col4\" class=\"data row56 col4\" >False</td>\n", + " <td id=\"T_0502a_row56_col5\" class=\"data row56 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row56_col6\" class=\"data row56 col6\" >{}</td>\n", + " <td id=\"T_0502a_row56_col7\" class=\"data row56 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", + " <td id=\"T_0502a_row56_col8\" class=\"data row56 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row57_col0\" class=\"data row57 col0\" >validmind.data_validation.nlp.Hashtags</td>\n", + " <td id=\"T_0502a_row57_col1\" class=\"data row57 col1\" >Hashtags</td>\n", + " <td id=\"T_0502a_row57_col2\" class=\"data row57 col2\" >Assesses hashtag frequency in a text column, highlighting usage trends and potential dataset bias or spam....</td>\n", + " <td id=\"T_0502a_row57_col3\" class=\"data row57 col3\" >True</td>\n", + " <td id=\"T_0502a_row57_col4\" class=\"data row57 col4\" >False</td>\n", + " <td id=\"T_0502a_row57_col5\" class=\"data row57 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row57_col6\" class=\"data row57 col6\" >{'top_hashtags': {'type': 'int', 'default': 25}}</td>\n", + " <td id=\"T_0502a_row57_col7\" class=\"data row57 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", + " <td id=\"T_0502a_row57_col8\" 
class=\"data row57 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row58_col0\" class=\"data row58 col0\" >validmind.data_validation.nlp.LanguageDetection</td>\n", + " <td id=\"T_0502a_row58_col1\" class=\"data row58 col1\" >Language Detection</td>\n", + " <td id=\"T_0502a_row58_col2\" class=\"data row58 col2\" >Assesses the diversity of languages in a textual dataset by detecting and visualizing the distribution of languages....</td>\n", + " <td id=\"T_0502a_row58_col3\" class=\"data row58 col3\" >True</td>\n", + " <td id=\"T_0502a_row58_col4\" class=\"data row58 col4\" >False</td>\n", + " <td id=\"T_0502a_row58_col5\" class=\"data row58 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row58_col6\" class=\"data row58 col6\" >{}</td>\n", + " <td id=\"T_0502a_row58_col7\" class=\"data row58 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row58_col8\" class=\"data row58 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row59_col0\" class=\"data row59 col0\" >validmind.data_validation.nlp.Mentions</td>\n", + " <td id=\"T_0502a_row59_col1\" class=\"data row59 col1\" >Mentions</td>\n", + " <td id=\"T_0502a_row59_col2\" class=\"data row59 col2\" >Calculates and visualizes frequencies of '@' prefixed mentions in a text-based dataset for NLP model analysis....</td>\n", + " <td id=\"T_0502a_row59_col3\" class=\"data row59 col3\" >True</td>\n", + " <td id=\"T_0502a_row59_col4\" class=\"data row59 col4\" >False</td>\n", + " <td id=\"T_0502a_row59_col5\" class=\"data row59 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row59_col6\" class=\"data row59 col6\" >{'top_mentions': {'type': 'int', 'default': 25}}</td>\n", + " <td id=\"T_0502a_row59_col7\" class=\"data row59 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", + " <td id=\"T_0502a_row59_col8\" class=\"data row59 col8\" 
>['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row60_col0\" class=\"data row60 col0\" >validmind.data_validation.nlp.PolarityAndSubjectivity</td>\n", + " <td id=\"T_0502a_row60_col1\" class=\"data row60 col1\" >Polarity And Subjectivity</td>\n", + " <td id=\"T_0502a_row60_col2\" class=\"data row60 col2\" >Analyzes the polarity and subjectivity of text data within a given dataset to visualize the sentiment distribution....</td>\n", + " <td id=\"T_0502a_row60_col3\" class=\"data row60 col3\" >True</td>\n", + " <td id=\"T_0502a_row60_col4\" class=\"data row60 col4\" >True</td>\n", + " <td id=\"T_0502a_row60_col5\" class=\"data row60 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row60_col6\" class=\"data row60 col6\" >{'threshold_subjectivity': {'type': '_empty', 'default': 0.5}, 'threshold_polarity': {'type': '_empty', 'default': 0}}</td>\n", + " <td id=\"T_0502a_row60_col7\" class=\"data row60 col7\" >['nlp', 'text_data', 'data_validation']</td>\n", + " <td id=\"T_0502a_row60_col8\" class=\"data row60 col8\" >['nlp']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row61_col0\" class=\"data row61 col0\" >validmind.data_validation.nlp.Punctuations</td>\n", + " <td id=\"T_0502a_row61_col1\" class=\"data row61 col1\" >Punctuations</td>\n", + " <td id=\"T_0502a_row61_col2\" class=\"data row61 col2\" >Analyzes and visualizes the frequency distribution of punctuation usage in a given text dataset....</td>\n", + " <td id=\"T_0502a_row61_col3\" class=\"data row61 col3\" >True</td>\n", + " <td id=\"T_0502a_row61_col4\" class=\"data row61 col4\" >False</td>\n", + " <td id=\"T_0502a_row61_col5\" class=\"data row61 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row61_col6\" class=\"data row61 col6\" >{'count_mode': {'type': '_empty', 'default': 'token'}}</td>\n", + " <td id=\"T_0502a_row61_col7\" class=\"data row61 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", + " <td 
id=\"T_0502a_row61_col8\" class=\"data row61 col8\" >['text_classification', 'text_summarization', 'nlp']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row62_col0\" class=\"data row62 col0\" >validmind.data_validation.nlp.Sentiment</td>\n", + " <td id=\"T_0502a_row62_col1\" class=\"data row62 col1\" >Sentiment</td>\n", + " <td id=\"T_0502a_row62_col2\" class=\"data row62 col2\" >Analyzes the sentiment of text data within a dataset using the VADER sentiment analysis tool....</td>\n", + " <td id=\"T_0502a_row62_col3\" class=\"data row62 col3\" >True</td>\n", + " <td id=\"T_0502a_row62_col4\" class=\"data row62 col4\" >False</td>\n", + " <td id=\"T_0502a_row62_col5\" class=\"data row62 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row62_col6\" class=\"data row62 col6\" >{}</td>\n", + " <td id=\"T_0502a_row62_col7\" class=\"data row62 col7\" >['nlp', 'text_data', 'data_validation']</td>\n", + " <td id=\"T_0502a_row62_col8\" class=\"data row62 col8\" >['nlp']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row63_col0\" class=\"data row63 col0\" >validmind.data_validation.nlp.StopWords</td>\n", + " <td id=\"T_0502a_row63_col1\" class=\"data row63 col1\" >Stop Words</td>\n", + " <td id=\"T_0502a_row63_col2\" class=\"data row63 col2\" >Evaluates and visualizes the frequency of English stop words in a text dataset against a defined threshold....</td>\n", + " <td id=\"T_0502a_row63_col3\" class=\"data row63 col3\" >True</td>\n", + " <td id=\"T_0502a_row63_col4\" class=\"data row63 col4\" >True</td>\n", + " <td id=\"T_0502a_row63_col5\" class=\"data row63 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row63_col6\" class=\"data row63 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 0.5}, 'num_words': {'type': 'int', 'default': 25}}</td>\n", + " <td id=\"T_0502a_row63_col7\" class=\"data row63 col7\" >['nlp', 'text_data', 'frequency_analysis', 'visualization']</td>\n", + " <td id=\"T_0502a_row63_col8\" class=\"data row63 col8\" 
>['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row64_col0\" class=\"data row64 col0\" >validmind.data_validation.nlp.TextDescription</td>\n", + " <td id=\"T_0502a_row64_col1\" class=\"data row64 col1\" >Text Description</td>\n", + " <td id=\"T_0502a_row64_col2\" class=\"data row64 col2\" >Conducts comprehensive textual analysis on a dataset using NLTK to evaluate various parameters and generate...</td>\n", + " <td id=\"T_0502a_row64_col3\" class=\"data row64 col3\" >True</td>\n", + " <td id=\"T_0502a_row64_col4\" class=\"data row64 col4\" >False</td>\n", + " <td id=\"T_0502a_row64_col5\" class=\"data row64 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row64_col6\" class=\"data row64 col6\" >{'unwanted_tokens': {'type': 'set', 'default': {'s', 'mrs', 'us', \"''\", ' ', 'ms', 'dr', 'dollar', '``', 'mr', \"'s\", \"s'\"}}, 'lang': {'type': 'str', 'default': 'english'}}</td>\n", + " <td id=\"T_0502a_row64_col7\" class=\"data row64 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row64_col8\" class=\"data row64 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row65_col0\" class=\"data row65 col0\" >validmind.data_validation.nlp.Toxicity</td>\n", + " <td id=\"T_0502a_row65_col1\" class=\"data row65 col1\" >Toxicity</td>\n", + " <td id=\"T_0502a_row65_col2\" class=\"data row65 col2\" >Assesses the toxicity of text data within a dataset to visualize the distribution of toxicity scores....</td>\n", + " <td id=\"T_0502a_row65_col3\" class=\"data row65 col3\" >True</td>\n", + " <td id=\"T_0502a_row65_col4\" class=\"data row65 col4\" >False</td>\n", + " <td id=\"T_0502a_row65_col5\" class=\"data row65 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row65_col6\" class=\"data row65 col6\" >{}</td>\n", + " <td id=\"T_0502a_row65_col7\" class=\"data row65 col7\" >['nlp', 'text_data', 'data_validation']</td>\n", + " <td 
id=\"T_0502a_row65_col8\" class=\"data row65 col8\" >['nlp']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row66_col0\" class=\"data row66 col0\" >validmind.model_validation.BertScore</td>\n", + " <td id=\"T_0502a_row66_col1\" class=\"data row66 col1\" >Bert Score</td>\n", + " <td id=\"T_0502a_row66_col2\" class=\"data row66 col2\" >Assesses the quality of machine-generated text using BERTScore metrics and visualizes results through histograms...</td>\n", + " <td id=\"T_0502a_row66_col3\" class=\"data row66 col3\" >True</td>\n", + " <td id=\"T_0502a_row66_col4\" class=\"data row66 col4\" >True</td>\n", + " <td id=\"T_0502a_row66_col5\" class=\"data row66 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row66_col6\" class=\"data row66 col6\" >{'evaluation_model': {'type': '_empty', 'default': 'distilbert-base-uncased'}}</td>\n", + " <td id=\"T_0502a_row66_col7\" class=\"data row66 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row66_col8\" class=\"data row66 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row67_col0\" class=\"data row67 col0\" >validmind.model_validation.BleuScore</td>\n", + " <td id=\"T_0502a_row67_col1\" class=\"data row67 col1\" >Bleu Score</td>\n", + " <td id=\"T_0502a_row67_col2\" class=\"data row67 col2\" >Evaluates the quality of machine-generated text using BLEU metrics and visualizes the results through histograms...</td>\n", + " <td id=\"T_0502a_row67_col3\" class=\"data row67 col3\" >True</td>\n", + " <td id=\"T_0502a_row67_col4\" class=\"data row67 col4\" >True</td>\n", + " <td id=\"T_0502a_row67_col5\" class=\"data row67 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row67_col6\" class=\"data row67 col6\" >{}</td>\n", + " <td id=\"T_0502a_row67_col7\" class=\"data row67 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row67_col8\" class=\"data row67 col8\" >['text_classification', 
'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row68_col0\" class=\"data row68 col0\" >validmind.model_validation.ClusterSizeDistribution</td>\n", + " <td id=\"T_0502a_row68_col1\" class=\"data row68 col1\" >Cluster Size Distribution</td>\n", + " <td id=\"T_0502a_row68_col2\" class=\"data row68 col2\" >Assesses the performance of clustering models by comparing the distribution of cluster sizes in model predictions...</td>\n", + " <td id=\"T_0502a_row68_col3\" class=\"data row68 col3\" >True</td>\n", + " <td id=\"T_0502a_row68_col4\" class=\"data row68 col4\" >False</td>\n", + " <td id=\"T_0502a_row68_col5\" class=\"data row68 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row68_col6\" class=\"data row68 col6\" >{}</td>\n", + " <td id=\"T_0502a_row68_col7\" class=\"data row68 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row68_col8\" class=\"data row68 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row69_col0\" class=\"data row69 col0\" >validmind.model_validation.ContextualRecall</td>\n", + " <td id=\"T_0502a_row69_col1\" class=\"data row69 col1\" >Contextual Recall</td>\n", + " <td id=\"T_0502a_row69_col2\" class=\"data row69 col2\" >Evaluates a Natural Language Generation model's ability to generate contextually relevant and factually correct...</td>\n", + " <td id=\"T_0502a_row69_col3\" class=\"data row69 col3\" >True</td>\n", + " <td id=\"T_0502a_row69_col4\" class=\"data row69 col4\" >True</td>\n", + " <td id=\"T_0502a_row69_col5\" class=\"data row69 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row69_col6\" class=\"data row69 col6\" >{}</td>\n", + " <td id=\"T_0502a_row69_col7\" class=\"data row69 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row69_col8\" class=\"data row69 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row70_col0\" class=\"data row70 
col0\" >validmind.model_validation.FeaturesAUC</td>\n", + " <td id=\"T_0502a_row70_col1\" class=\"data row70 col1\" >Features AUC</td>\n", + " <td id=\"T_0502a_row70_col2\" class=\"data row70 col2\" >Evaluates the discriminatory power of each individual feature within a binary classification model by calculating...</td>\n", + " <td id=\"T_0502a_row70_col3\" class=\"data row70 col3\" >True</td>\n", + " <td id=\"T_0502a_row70_col4\" class=\"data row70 col4\" >False</td>\n", + " <td id=\"T_0502a_row70_col5\" class=\"data row70 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row70_col6\" class=\"data row70 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", + " <td id=\"T_0502a_row70_col7\" class=\"data row70 col7\" >['feature_importance', 'AUC', 'visualization']</td>\n", + " <td id=\"T_0502a_row70_col8\" class=\"data row70 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row71_col0\" class=\"data row71 col0\" >validmind.model_validation.MeteorScore</td>\n", + " <td id=\"T_0502a_row71_col1\" class=\"data row71 col1\" >Meteor Score</td>\n", + " <td id=\"T_0502a_row71_col2\" class=\"data row71 col2\" >Assesses the quality of machine-generated translations by comparing them to human-produced references using the...</td>\n", + " <td id=\"T_0502a_row71_col3\" class=\"data row71 col3\" >True</td>\n", + " <td id=\"T_0502a_row71_col4\" class=\"data row71 col4\" >True</td>\n", + " <td id=\"T_0502a_row71_col5\" class=\"data row71 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row71_col6\" class=\"data row71 col6\" >{}</td>\n", + " <td id=\"T_0502a_row71_col7\" class=\"data row71 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row71_col8\" class=\"data row71 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row72_col0\" class=\"data row72 col0\" 
>validmind.model_validation.ModelMetadata</td>\n", + " <td id=\"T_0502a_row72_col1\" class=\"data row72 col1\" >Model Metadata</td>\n", + " <td id=\"T_0502a_row72_col2\" class=\"data row72 col2\" >Compare metadata of different models and generate a summary table with the results....</td>\n", + " <td id=\"T_0502a_row72_col3\" class=\"data row72 col3\" >False</td>\n", + " <td id=\"T_0502a_row72_col4\" class=\"data row72 col4\" >True</td>\n", + " <td id=\"T_0502a_row72_col5\" class=\"data row72 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row72_col6\" class=\"data row72 col6\" >{}</td>\n", + " <td id=\"T_0502a_row72_col7\" class=\"data row72 col7\" >['model_training', 'metadata']</td>\n", + " <td id=\"T_0502a_row72_col8\" class=\"data row72 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row73_col0\" class=\"data row73 col0\" >validmind.model_validation.ModelPredictionResiduals</td>\n", + " <td id=\"T_0502a_row73_col1\" class=\"data row73 col1\" >Model Prediction Residuals</td>\n", + " <td id=\"T_0502a_row73_col2\" class=\"data row73 col2\" >Assesses normality and behavior of residuals in regression models through visualization and statistical tests....</td>\n", + " <td id=\"T_0502a_row73_col3\" class=\"data row73 col3\" >True</td>\n", + " <td id=\"T_0502a_row73_col4\" class=\"data row73 col4\" >True</td>\n", + " <td id=\"T_0502a_row73_col5\" class=\"data row73 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row73_col6\" class=\"data row73 col6\" >{'nbins': {'type': 'int', 'default': 100}, 'p_value_threshold': {'type': 'float', 'default': 0.05}, 'start_date': {'type': None, 'default': None}, 'end_date': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row73_col7\" class=\"data row73 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row73_col8\" class=\"data row73 col8\" >['residual_analysis', 'visualization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row74_col0\" 
class=\"data row74 col0\" >validmind.model_validation.RegardScore</td>\n", + " <td id=\"T_0502a_row74_col1\" class=\"data row74 col1\" >Regard Score</td>\n", + " <td id=\"T_0502a_row74_col2\" class=\"data row74 col2\" >Assesses the sentiment and potential biases in text generated by NLP models by computing and visualizing regard...</td>\n", + " <td id=\"T_0502a_row74_col3\" class=\"data row74 col3\" >True</td>\n", + " <td id=\"T_0502a_row74_col4\" class=\"data row74 col4\" >True</td>\n", + " <td id=\"T_0502a_row74_col5\" class=\"data row74 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row74_col6\" class=\"data row74 col6\" >{}</td>\n", + " <td id=\"T_0502a_row74_col7\" class=\"data row74 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row74_col8\" class=\"data row74 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row75_col0\" class=\"data row75 col0\" >validmind.model_validation.RegressionResidualsPlot</td>\n", + " <td id=\"T_0502a_row75_col1\" class=\"data row75 col1\" >Regression Residuals Plot</td>\n", + " <td id=\"T_0502a_row75_col2\" class=\"data row75 col2\" >Evaluates regression model performance using residual distribution and actual vs. 
predicted plots....</td>\n", + " <td id=\"T_0502a_row75_col3\" class=\"data row75 col3\" >True</td>\n", + " <td id=\"T_0502a_row75_col4\" class=\"data row75 col4\" >False</td>\n", + " <td id=\"T_0502a_row75_col5\" class=\"data row75 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row75_col6\" class=\"data row75 col6\" >{'bin_size': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_0502a_row75_col7\" class=\"data row75 col7\" >['model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row75_col8\" class=\"data row75 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row76_col0\" class=\"data row76 col0\" >validmind.model_validation.RougeScore</td>\n", + " <td id=\"T_0502a_row76_col1\" class=\"data row76 col1\" >Rouge Score</td>\n", + " <td id=\"T_0502a_row76_col2\" class=\"data row76 col2\" >Assesses the quality of machine-generated text using ROUGE metrics and visualizes the results to provide...</td>\n", + " <td id=\"T_0502a_row76_col3\" class=\"data row76 col3\" >True</td>\n", + " <td id=\"T_0502a_row76_col4\" class=\"data row76 col4\" >True</td>\n", + " <td id=\"T_0502a_row76_col5\" class=\"data row76 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row76_col6\" class=\"data row76 col6\" >{'metric': {'type': 'str', 'default': 'rouge-1'}}</td>\n", + " <td id=\"T_0502a_row76_col7\" class=\"data row76 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row76_col8\" class=\"data row76 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row77_col0\" class=\"data row77 col0\" >validmind.model_validation.TimeSeriesPredictionWithCI</td>\n", + " <td id=\"T_0502a_row77_col1\" class=\"data row77 col1\" >Time Series Prediction With CI</td>\n", + " <td id=\"T_0502a_row77_col2\" class=\"data row77 col2\" >Assesses predictive accuracy and uncertainty in time series models, highlighting breaches beyond confidence...</td>\n", + " 
<td id=\"T_0502a_row77_col3\" class=\"data row77 col3\" >True</td>\n", + " <td id=\"T_0502a_row77_col4\" class=\"data row77 col4\" >True</td>\n", + " <td id=\"T_0502a_row77_col5\" class=\"data row77 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row77_col6\" class=\"data row77 col6\" >{'confidence': {'type': 'float', 'default': 0.95}}</td>\n", + " <td id=\"T_0502a_row77_col7\" class=\"data row77 col7\" >['model_predictions', 'visualization']</td>\n", + " <td id=\"T_0502a_row77_col8\" class=\"data row77 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row78_col0\" class=\"data row78 col0\" >validmind.model_validation.TimeSeriesPredictionsPlot</td>\n", + " <td id=\"T_0502a_row78_col1\" class=\"data row78 col1\" >Time Series Predictions Plot</td>\n", + " <td id=\"T_0502a_row78_col2\" class=\"data row78 col2\" >Plot actual vs predicted values for time series data and generate a visual comparison for the model....</td>\n", + " <td id=\"T_0502a_row78_col3\" class=\"data row78 col3\" >True</td>\n", + " <td id=\"T_0502a_row78_col4\" class=\"data row78 col4\" >False</td>\n", + " <td id=\"T_0502a_row78_col5\" class=\"data row78 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row78_col6\" class=\"data row78 col6\" >{}</td>\n", + " <td id=\"T_0502a_row78_col7\" class=\"data row78 col7\" >['model_predictions', 'visualization']</td>\n", + " <td id=\"T_0502a_row78_col8\" class=\"data row78 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row79_col0\" class=\"data row79 col0\" >validmind.model_validation.TimeSeriesR2SquareBySegments</td>\n", + " <td id=\"T_0502a_row79_col1\" class=\"data row79 col1\" >Time Series R2 Square By Segments</td>\n", + " <td id=\"T_0502a_row79_col2\" class=\"data row79 col2\" >Evaluates the R-Squared values of regression models over specified time segments in time series data to assess...</td>\n", + " <td 
id=\"T_0502a_row79_col3\" class=\"data row79 col3\" >True</td>\n", + " <td id=\"T_0502a_row79_col4\" class=\"data row79 col4\" >True</td>\n", + " <td id=\"T_0502a_row79_col5\" class=\"data row79 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row79_col6\" class=\"data row79 col6\" >{'segments': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row79_col7\" class=\"data row79 col7\" >['model_performance', 'sklearn']</td>\n", + " <td id=\"T_0502a_row79_col8\" class=\"data row79 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row80_col0\" class=\"data row80 col0\" >validmind.model_validation.TokenDisparity</td>\n", + " <td id=\"T_0502a_row80_col1\" class=\"data row80 col1\" >Token Disparity</td>\n", + " <td id=\"T_0502a_row80_col2\" class=\"data row80 col2\" >Evaluates the token disparity between reference and generated texts, visualizing the results through histograms and...</td>\n", + " <td id=\"T_0502a_row80_col3\" class=\"data row80 col3\" >True</td>\n", + " <td id=\"T_0502a_row80_col4\" class=\"data row80 col4\" >True</td>\n", + " <td id=\"T_0502a_row80_col5\" class=\"data row80 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row80_col6\" class=\"data row80 col6\" >{}</td>\n", + " <td id=\"T_0502a_row80_col7\" class=\"data row80 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row80_col8\" class=\"data row80 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row81_col0\" class=\"data row81 col0\" >validmind.model_validation.ToxicityScore</td>\n", + " <td id=\"T_0502a_row81_col1\" class=\"data row81 col1\" >Toxicity Score</td>\n", + " <td id=\"T_0502a_row81_col2\" class=\"data row81 col2\" >Assesses the toxicity levels of texts generated by NLP models to identify and mitigate harmful or offensive content....</td>\n", + " <td id=\"T_0502a_row81_col3\" class=\"data row81 col3\" 
>True</td>\n", + " <td id=\"T_0502a_row81_col4\" class=\"data row81 col4\" >True</td>\n", + " <td id=\"T_0502a_row81_col5\" class=\"data row81 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row81_col6\" class=\"data row81 col6\" >{}</td>\n", + " <td id=\"T_0502a_row81_col7\" class=\"data row81 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row81_col8\" class=\"data row81 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row82_col0\" class=\"data row82 col0\" >validmind.model_validation.embeddings.ClusterDistribution</td>\n", + " <td id=\"T_0502a_row82_col1\" class=\"data row82 col1\" >Cluster Distribution</td>\n", + " <td id=\"T_0502a_row82_col2\" class=\"data row82 col2\" >Assesses the distribution of text embeddings across clusters produced by a model using KMeans clustering....</td>\n", + " <td id=\"T_0502a_row82_col3\" class=\"data row82 col3\" >True</td>\n", + " <td id=\"T_0502a_row82_col4\" class=\"data row82 col4\" >False</td>\n", + " <td id=\"T_0502a_row82_col5\" class=\"data row82 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row82_col6\" class=\"data row82 col6\" >{'num_clusters': {'type': 'int', 'default': 5}}</td>\n", + " <td id=\"T_0502a_row82_col7\" class=\"data row82 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row82_col8\" class=\"data row82 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row83_col0\" class=\"data row83 col0\" >validmind.model_validation.embeddings.CosineSimilarityComparison</td>\n", + " <td id=\"T_0502a_row83_col1\" class=\"data row83 col1\" >Cosine Similarity Comparison</td>\n", + " <td id=\"T_0502a_row83_col2\" class=\"data row83 col2\" >Assesses the similarity between embeddings generated by different models using Cosine Similarity, providing both...</td>\n", + " <td id=\"T_0502a_row83_col3\" class=\"data row83 col3\" >True</td>\n", 
+ " <td id=\"T_0502a_row83_col4\" class=\"data row83 col4\" >True</td>\n", + " <td id=\"T_0502a_row83_col5\" class=\"data row83 col5\" >['dataset', 'models']</td>\n", + " <td id=\"T_0502a_row83_col6\" class=\"data row83 col6\" >{}</td>\n", + " <td id=\"T_0502a_row83_col7\" class=\"data row83 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row83_col8\" class=\"data row83 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row84_col0\" class=\"data row84 col0\" >validmind.model_validation.embeddings.CosineSimilarityDistribution</td>\n", + " <td id=\"T_0502a_row84_col1\" class=\"data row84 col1\" >Cosine Similarity Distribution</td>\n", + " <td id=\"T_0502a_row84_col2\" class=\"data row84 col2\" >Assesses the similarity between predicted text embeddings from a model using a Cosine Similarity distribution...</td>\n", + " <td id=\"T_0502a_row84_col3\" class=\"data row84 col3\" >True</td>\n", + " <td id=\"T_0502a_row84_col4\" class=\"data row84 col4\" >False</td>\n", + " <td id=\"T_0502a_row84_col5\" class=\"data row84 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row84_col6\" class=\"data row84 col6\" >{}</td>\n", + " <td id=\"T_0502a_row84_col7\" class=\"data row84 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row84_col8\" class=\"data row84 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row85_col0\" class=\"data row85 col0\" >validmind.model_validation.embeddings.CosineSimilarityHeatmap</td>\n", + " <td id=\"T_0502a_row85_col1\" class=\"data row85 col1\" >Cosine Similarity Heatmap</td>\n", + " <td id=\"T_0502a_row85_col2\" class=\"data row85 col2\" >Generates an interactive heatmap to visualize the cosine similarities among embeddings derived from a given model....</td>\n", + " <td id=\"T_0502a_row85_col3\" class=\"data row85 col3\" >True</td>\n", + " <td 
id=\"T_0502a_row85_col4\" class=\"data row85 col4\" >False</td>\n", + " <td id=\"T_0502a_row85_col5\" class=\"data row85 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row85_col6\" class=\"data row85 col6\" >{'title': {'type': '_empty', 'default': 'Cosine Similarity Matrix'}, 'color': {'type': '_empty', 'default': 'Cosine Similarity'}, 'xaxis_title': {'type': '_empty', 'default': 'Index'}, 'yaxis_title': {'type': '_empty', 'default': 'Index'}, 'color_scale': {'type': '_empty', 'default': 'Blues'}}</td>\n", + " <td id=\"T_0502a_row85_col7\" class=\"data row85 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row85_col8\" class=\"data row85 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row86_col0\" class=\"data row86 col0\" >validmind.model_validation.embeddings.DescriptiveAnalytics</td>\n", + " <td id=\"T_0502a_row86_col1\" class=\"data row86 col1\" >Descriptive Analytics</td>\n", + " <td id=\"T_0502a_row86_col2\" class=\"data row86 col2\" >Evaluates statistical properties of text embeddings in an ML model via mean, median, and standard deviation...</td>\n", + " <td id=\"T_0502a_row86_col3\" class=\"data row86 col3\" >True</td>\n", + " <td id=\"T_0502a_row86_col4\" class=\"data row86 col4\" >False</td>\n", + " <td id=\"T_0502a_row86_col5\" class=\"data row86 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row86_col6\" class=\"data row86 col6\" >{}</td>\n", + " <td id=\"T_0502a_row86_col7\" class=\"data row86 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row86_col8\" class=\"data row86 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row87_col0\" class=\"data row87 col0\" >validmind.model_validation.embeddings.EmbeddingsVisualization2D</td>\n", + " <td id=\"T_0502a_row87_col1\" class=\"data row87 col1\" >Embeddings Visualization2 D</td>\n", + " 
<td id=\"T_0502a_row87_col2\" class=\"data row87 col2\" >Visualizes 2D representation of text embeddings generated by a model using t-SNE technique....</td>\n", + " <td id=\"T_0502a_row87_col3\" class=\"data row87 col3\" >True</td>\n", + " <td id=\"T_0502a_row87_col4\" class=\"data row87 col4\" >False</td>\n", + " <td id=\"T_0502a_row87_col5\" class=\"data row87 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row87_col6\" class=\"data row87 col6\" >{'cluster_column': {'type': None, 'default': None}, 'perplexity': {'type': 'int', 'default': 30}}</td>\n", + " <td id=\"T_0502a_row87_col7\" class=\"data row87 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row87_col8\" class=\"data row87 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row88_col0\" class=\"data row88 col0\" >validmind.model_validation.embeddings.EuclideanDistanceComparison</td>\n", + " <td id=\"T_0502a_row88_col1\" class=\"data row88 col1\" >Euclidean Distance Comparison</td>\n", + " <td id=\"T_0502a_row88_col2\" class=\"data row88 col2\" >Assesses and visualizes the dissimilarity between model embeddings using Euclidean distance, providing insights...</td>\n", + " <td id=\"T_0502a_row88_col3\" class=\"data row88 col3\" >True</td>\n", + " <td id=\"T_0502a_row88_col4\" class=\"data row88 col4\" >True</td>\n", + " <td id=\"T_0502a_row88_col5\" class=\"data row88 col5\" >['dataset', 'models']</td>\n", + " <td id=\"T_0502a_row88_col6\" class=\"data row88 col6\" >{}</td>\n", + " <td id=\"T_0502a_row88_col7\" class=\"data row88 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row88_col8\" class=\"data row88 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row89_col0\" class=\"data row89 col0\" >validmind.model_validation.embeddings.EuclideanDistanceHeatmap</td>\n", + " <td id=\"T_0502a_row89_col1\" 
class=\"data row89 col1\" >Euclidean Distance Heatmap</td>\n", + " <td id=\"T_0502a_row89_col2\" class=\"data row89 col2\" >Generates an interactive heatmap to visualize the Euclidean distances among embeddings derived from a given model....</td>\n", + " <td id=\"T_0502a_row89_col3\" class=\"data row89 col3\" >True</td>\n", + " <td id=\"T_0502a_row89_col4\" class=\"data row89 col4\" >False</td>\n", + " <td id=\"T_0502a_row89_col5\" class=\"data row89 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row89_col6\" class=\"data row89 col6\" >{'title': {'type': '_empty', 'default': 'Euclidean Distance Matrix'}, 'color': {'type': '_empty', 'default': 'Euclidean Distance'}, 'xaxis_title': {'type': '_empty', 'default': 'Index'}, 'yaxis_title': {'type': '_empty', 'default': 'Index'}, 'color_scale': {'type': '_empty', 'default': 'Blues'}}</td>\n", + " <td id=\"T_0502a_row89_col7\" class=\"data row89 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row89_col8\" class=\"data row89 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row90_col0\" class=\"data row90 col0\" >validmind.model_validation.embeddings.PCAComponentsPairwisePlots</td>\n", + " <td id=\"T_0502a_row90_col1\" class=\"data row90 col1\" >PCA Components Pairwise Plots</td>\n", + " <td id=\"T_0502a_row90_col2\" class=\"data row90 col2\" >Generates scatter plots for pairwise combinations of principal component analysis (PCA) components of model...</td>\n", + " <td id=\"T_0502a_row90_col3\" class=\"data row90 col3\" >True</td>\n", + " <td id=\"T_0502a_row90_col4\" class=\"data row90 col4\" >False</td>\n", + " <td id=\"T_0502a_row90_col5\" class=\"data row90 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row90_col6\" class=\"data row90 col6\" >{'n_components': {'type': 'int', 'default': 3}}</td>\n", + " <td id=\"T_0502a_row90_col7\" class=\"data row90 col7\" >['visualization', 
'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row90_col8\" class=\"data row90 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row91_col0\" class=\"data row91 col0\" >validmind.model_validation.embeddings.StabilityAnalysisKeyword</td>\n", + " <td id=\"T_0502a_row91_col1\" class=\"data row91 col1\" >Stability Analysis Keyword</td>\n", + " <td id=\"T_0502a_row91_col2\" class=\"data row91 col2\" >Evaluates robustness of embedding models to keyword swaps in the test dataset....</td>\n", + " <td id=\"T_0502a_row91_col3\" class=\"data row91 col3\" >True</td>\n", + " <td id=\"T_0502a_row91_col4\" class=\"data row91 col4\" >True</td>\n", + " <td id=\"T_0502a_row91_col5\" class=\"data row91 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row91_col6\" class=\"data row91 col6\" >{'keyword_dict': {'type': None, 'default': None}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_0502a_row91_col7\" class=\"data row91 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row91_col8\" class=\"data row91 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row92_col0\" class=\"data row92 col0\" >validmind.model_validation.embeddings.StabilityAnalysisRandomNoise</td>\n", + " <td id=\"T_0502a_row92_col1\" class=\"data row92 col1\" >Stability Analysis Random Noise</td>\n", + " <td id=\"T_0502a_row92_col2\" class=\"data row92 col2\" >Assesses the robustness of text embeddings models to random noise introduced via text perturbations....</td>\n", + " <td id=\"T_0502a_row92_col3\" class=\"data row92 col3\" >True</td>\n", + " <td id=\"T_0502a_row92_col4\" class=\"data row92 col4\" >True</td>\n", + " <td id=\"T_0502a_row92_col5\" class=\"data row92 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row92_col6\" class=\"data row92 col6\" >{'probability': {'type': 'float', 
'default': 0.02}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_0502a_row92_col7\" class=\"data row92 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row92_col8\" class=\"data row92 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row93_col0\" class=\"data row93 col0\" >validmind.model_validation.embeddings.StabilityAnalysisSynonyms</td>\n", + " <td id=\"T_0502a_row93_col1\" class=\"data row93 col1\" >Stability Analysis Synonyms</td>\n", + " <td id=\"T_0502a_row93_col2\" class=\"data row93 col2\" >Evaluates the stability of text embeddings models when words in test data are replaced by their synonyms randomly....</td>\n", + " <td id=\"T_0502a_row93_col3\" class=\"data row93 col3\" >True</td>\n", + " <td id=\"T_0502a_row93_col4\" class=\"data row93 col4\" >True</td>\n", + " <td id=\"T_0502a_row93_col5\" class=\"data row93 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row93_col6\" class=\"data row93 col6\" >{'probability': {'type': 'float', 'default': 0.02}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_0502a_row93_col7\" class=\"data row93 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row93_col8\" class=\"data row93 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row94_col0\" class=\"data row94 col0\" >validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", + " <td id=\"T_0502a_row94_col1\" class=\"data row94 col1\" >Stability Analysis Translation</td>\n", + " <td id=\"T_0502a_row94_col2\" class=\"data row94 col2\" >Evaluates robustness of text embeddings models to noise introduced by translating the original text to another...</td>\n", + " <td id=\"T_0502a_row94_col3\" class=\"data row94 col3\" >True</td>\n", + " <td id=\"T_0502a_row94_col4\" class=\"data row94 col4\" >True</td>\n", + " 
<td id=\"T_0502a_row94_col5\" class=\"data row94 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row94_col6\" class=\"data row94 col6\" >{'source_lang': {'type': 'str', 'default': 'en'}, 'target_lang': {'type': 'str', 'default': 'fr'}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_0502a_row94_col7\" class=\"data row94 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row94_col8\" class=\"data row94 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row95_col0\" class=\"data row95 col0\" >validmind.model_validation.embeddings.TSNEComponentsPairwisePlots</td>\n", + " <td id=\"T_0502a_row95_col1\" class=\"data row95 col1\" >TSNE Components Pairwise Plots</td>\n", + " <td id=\"T_0502a_row95_col2\" class=\"data row95 col2\" >Creates scatter plots for pairwise combinations of t-SNE components to visualize embeddings and highlight potential...</td>\n", + " <td id=\"T_0502a_row95_col3\" class=\"data row95 col3\" >True</td>\n", + " <td id=\"T_0502a_row95_col4\" class=\"data row95 col4\" >False</td>\n", + " <td id=\"T_0502a_row95_col5\" class=\"data row95 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row95_col6\" class=\"data row95 col6\" >{'n_components': {'type': 'int', 'default': 2}, 'perplexity': {'type': 'int', 'default': 30}, 'title': {'type': 'str', 'default': 't-SNE'}}</td>\n", + " <td id=\"T_0502a_row95_col7\" class=\"data row95 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row95_col8\" class=\"data row95 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row96_col0\" class=\"data row96 col0\" >validmind.model_validation.ragas.AnswerCorrectness</td>\n", + " <td id=\"T_0502a_row96_col1\" class=\"data row96 col1\" >Answer Correctness</td>\n", + " <td id=\"T_0502a_row96_col2\" class=\"data row96 col2\" 
>Evaluates the correctness of answers in a dataset with respect to the provided ground...</td>\n", + " <td id=\"T_0502a_row96_col3\" class=\"data row96 col3\" >True</td>\n", + " <td id=\"T_0502a_row96_col4\" class=\"data row96 col4\" >True</td>\n", + " <td id=\"T_0502a_row96_col5\" class=\"data row96 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row96_col6\" class=\"data row96 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row96_col7\" class=\"data row96 col7\" >['ragas', 'llm']</td>\n", + " <td id=\"T_0502a_row96_col8\" class=\"data row96 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row97_col0\" class=\"data row97 col0\" >validmind.model_validation.ragas.AspectCritic</td>\n", + " <td id=\"T_0502a_row97_col1\" class=\"data row97 col1\" >Aspect Critic</td>\n", + " <td id=\"T_0502a_row97_col2\" class=\"data row97 col2\" >Evaluates generations against the following aspects: harmfulness, maliciousness,...</td>\n", + " <td id=\"T_0502a_row97_col3\" class=\"data row97 col3\" >True</td>\n", + " <td id=\"T_0502a_row97_col4\" class=\"data row97 col4\" >True</td>\n", + " <td id=\"T_0502a_row97_col5\" class=\"data row97 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row97_col6\" class=\"data row97 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': None, 'default': None}, 'aspects': {'type': None, 'default': ['coherence', 'conciseness', 'correctness', 'harmfulness', 'maliciousness']}, 'additional_aspects': {'type': None, 'default': None}, 'judge_llm': {'type': '_empty', 'default': None}, 
'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row97_col7\" class=\"data row97 col7\" >['ragas', 'llm', 'qualitative']</td>\n", + " <td id=\"T_0502a_row97_col8\" class=\"data row97 col8\" >['text_summarization', 'text_generation', 'text_qa']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row98_col0\" class=\"data row98 col0\" >validmind.model_validation.ragas.ContextEntityRecall</td>\n", + " <td id=\"T_0502a_row98_col1\" class=\"data row98 col1\" >Context Entity Recall</td>\n", + " <td id=\"T_0502a_row98_col2\" class=\"data row98 col2\" >Evaluates the context entity recall for dataset entries and visualizes the results....</td>\n", + " <td id=\"T_0502a_row98_col3\" class=\"data row98 col3\" >True</td>\n", + " <td id=\"T_0502a_row98_col4\" class=\"data row98 col4\" >True</td>\n", + " <td id=\"T_0502a_row98_col5\" class=\"data row98 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row98_col6\" class=\"data row98 col6\" >{'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row98_col7\" class=\"data row98 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", + " <td id=\"T_0502a_row98_col8\" class=\"data row98 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row99_col0\" class=\"data row99 col0\" >validmind.model_validation.ragas.ContextPrecision</td>\n", + " <td id=\"T_0502a_row99_col1\" class=\"data row99 col1\" >Context Precision</td>\n", + " <td id=\"T_0502a_row99_col2\" class=\"data row99 col2\" >Context Precision is a metric that evaluates whether all of the ground-truth...</td>\n", + " <td id=\"T_0502a_row99_col3\" class=\"data row99 col3\" >True</td>\n", + " <td id=\"T_0502a_row99_col4\" class=\"data row99 col4\" 
>True</td>\n", + " <td id=\"T_0502a_row99_col5\" class=\"data row99 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row99_col6\" class=\"data row99 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row99_col7\" class=\"data row99 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", + " <td id=\"T_0502a_row99_col8\" class=\"data row99 col8\" >['text_qa', 'text_generation', 'text_summarization', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row100_col0\" class=\"data row100 col0\" >validmind.model_validation.ragas.ContextPrecisionWithoutReference</td>\n", + " <td id=\"T_0502a_row100_col1\" class=\"data row100 col1\" >Context Precision Without Reference</td>\n", + " <td id=\"T_0502a_row100_col2\" class=\"data row100 col2\" >Context Precision Without Reference is a metric used to evaluate the relevance of...</td>\n", + " <td id=\"T_0502a_row100_col3\" class=\"data row100 col3\" >True</td>\n", + " <td id=\"T_0502a_row100_col4\" class=\"data row100 col4\" >True</td>\n", + " <td id=\"T_0502a_row100_col5\" class=\"data row100 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row100_col6\" class=\"data row100 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'response_column': {'type': 'str', 'default': 'response'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row100_col7\" class=\"data row100 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", + " <td id=\"T_0502a_row100_col8\" class=\"data row100 col8\" >['text_qa', 'text_generation', 
'text_summarization', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row101_col0\" class=\"data row101 col0\" >validmind.model_validation.ragas.ContextRecall</td>\n", + " <td id=\"T_0502a_row101_col1\" class=\"data row101 col1\" >Context Recall</td>\n", + " <td id=\"T_0502a_row101_col2\" class=\"data row101 col2\" >Context recall measures the extent to which the retrieved context aligns with the...</td>\n", + " <td id=\"T_0502a_row101_col3\" class=\"data row101 col3\" >True</td>\n", + " <td id=\"T_0502a_row101_col4\" class=\"data row101 col4\" >True</td>\n", + " <td id=\"T_0502a_row101_col5\" class=\"data row101 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row101_col6\" class=\"data row101 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row101_col7\" class=\"data row101 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", + " <td id=\"T_0502a_row101_col8\" class=\"data row101 col8\" >['text_qa', 'text_generation', 'text_summarization', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row102_col0\" class=\"data row102 col0\" >validmind.model_validation.ragas.Faithfulness</td>\n", + " <td id=\"T_0502a_row102_col1\" class=\"data row102 col1\" >Faithfulness</td>\n", + " <td id=\"T_0502a_row102_col2\" class=\"data row102 col2\" >Evaluates the faithfulness of the generated answers with respect to retrieved contexts....</td>\n", + " <td id=\"T_0502a_row102_col3\" class=\"data row102 col3\" >True</td>\n", + " <td id=\"T_0502a_row102_col4\" class=\"data row102 col4\" >True</td>\n", + " <td id=\"T_0502a_row102_col5\" class=\"data row102 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row102_col6\" 
class=\"data row102 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row102_col7\" class=\"data row102 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", + " <td id=\"T_0502a_row102_col8\" class=\"data row102 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row103_col0\" class=\"data row103 col0\" >validmind.model_validation.ragas.NoiseSensitivity</td>\n", + " <td id=\"T_0502a_row103_col1\" class=\"data row103 col1\" >Noise Sensitivity</td>\n", + " <td id=\"T_0502a_row103_col2\" class=\"data row103 col2\" >Assesses the sensitivity of a Large Language Model (LLM) to noise in retrieved context by measuring how often it...</td>\n", + " <td id=\"T_0502a_row103_col3\" class=\"data row103 col3\" >True</td>\n", + " <td id=\"T_0502a_row103_col4\" class=\"data row103 col4\" >True</td>\n", + " <td id=\"T_0502a_row103_col5\" class=\"data row103 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row103_col6\" class=\"data row103 col6\" >{'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'focus': {'type': 'str', 'default': 'relevant'}, 'user_input_column': {'type': 'str', 'default': 'user_input'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row103_col7\" class=\"data row103 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", + " <td id=\"T_0502a_row103_col8\" class=\"data row103 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " 
<td id=\"T_0502a_row104_col0\" class=\"data row104 col0\" >validmind.model_validation.ragas.ResponseRelevancy</td>\n", + " <td id=\"T_0502a_row104_col1\" class=\"data row104 col1\" >Response Relevancy</td>\n", + " <td id=\"T_0502a_row104_col2\" class=\"data row104 col2\" >Assesses how pertinent the generated answer is to the given prompt....</td>\n", + " <td id=\"T_0502a_row104_col3\" class=\"data row104 col3\" >True</td>\n", + " <td id=\"T_0502a_row104_col4\" class=\"data row104 col4\" >True</td>\n", + " <td id=\"T_0502a_row104_col5\" class=\"data row104 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row104_col6\" class=\"data row104 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': None}, 'response_column': {'type': 'str', 'default': 'response'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row104_col7\" class=\"data row104 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", + " <td id=\"T_0502a_row104_col8\" class=\"data row104 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row105_col0\" class=\"data row105 col0\" >validmind.model_validation.ragas.SemanticSimilarity</td>\n", + " <td id=\"T_0502a_row105_col1\" class=\"data row105 col1\" >Semantic Similarity</td>\n", + " <td id=\"T_0502a_row105_col2\" class=\"data row105 col2\" >Calculates the semantic similarity between generated responses and ground truths...</td>\n", + " <td id=\"T_0502a_row105_col3\" class=\"data row105 col3\" >True</td>\n", + " <td id=\"T_0502a_row105_col4\" class=\"data row105 col4\" >True</td>\n", + " <td id=\"T_0502a_row105_col5\" class=\"data row105 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row105_col6\" class=\"data row105 col6\" >{'response_column': {'type': 'str', 'default': 'response'}, 'reference_column': {'type': 'str', 'default': 
'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row105_col7\" class=\"data row105 col7\" >['ragas', 'llm']</td>\n", + " <td id=\"T_0502a_row105_col8\" class=\"data row105 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row106_col0\" class=\"data row106 col0\" >validmind.model_validation.sklearn.AdjustedMutualInformation</td>\n", + " <td id=\"T_0502a_row106_col1\" class=\"data row106 col1\" >Adjusted Mutual Information</td>\n", + " <td id=\"T_0502a_row106_col2\" class=\"data row106 col2\" >Evaluates clustering model performance by measuring mutual information between true and predicted labels, adjusting...</td>\n", + " <td id=\"T_0502a_row106_col3\" class=\"data row106 col3\" >False</td>\n", + " <td id=\"T_0502a_row106_col4\" class=\"data row106 col4\" >True</td>\n", + " <td id=\"T_0502a_row106_col5\" class=\"data row106 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row106_col6\" class=\"data row106 col6\" >{}</td>\n", + " <td id=\"T_0502a_row106_col7\" class=\"data row106 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_0502a_row106_col8\" class=\"data row106 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row107_col0\" class=\"data row107 col0\" >validmind.model_validation.sklearn.AdjustedRandIndex</td>\n", + " <td id=\"T_0502a_row107_col1\" class=\"data row107 col1\" >Adjusted Rand Index</td>\n", + " <td id=\"T_0502a_row107_col2\" class=\"data row107 col2\" >Measures the similarity between two data clusters using the Adjusted Rand Index (ARI) metric in clustering machine...</td>\n", + " <td id=\"T_0502a_row107_col3\" class=\"data row107 col3\" >False</td>\n", + " <td id=\"T_0502a_row107_col4\" class=\"data row107 col4\" >True</td>\n", + " <td id=\"T_0502a_row107_col5\" class=\"data row107 col5\" >['model', 
'dataset']</td>\n", + " <td id=\"T_0502a_row107_col6\" class=\"data row107 col6\" >{}</td>\n", + " <td id=\"T_0502a_row107_col7\" class=\"data row107 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_0502a_row107_col8\" class=\"data row107 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row108_col0\" class=\"data row108 col0\" >validmind.model_validation.sklearn.CalibrationCurve</td>\n", + " <td id=\"T_0502a_row108_col1\" class=\"data row108 col1\" >Calibration Curve</td>\n", + " <td id=\"T_0502a_row108_col2\" class=\"data row108 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", + " <td id=\"T_0502a_row108_col3\" class=\"data row108 col3\" >True</td>\n", + " <td id=\"T_0502a_row108_col4\" class=\"data row108 col4\" >False</td>\n", + " <td id=\"T_0502a_row108_col5\" class=\"data row108 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row108_col6\" class=\"data row108 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_0502a_row108_col7\" class=\"data row108 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", + " <td id=\"T_0502a_row108_col8\" class=\"data row108 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row109_col0\" class=\"data row109 col0\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", + " <td id=\"T_0502a_row109_col1\" class=\"data row109 col1\" >Classifier Performance</td>\n", + " <td id=\"T_0502a_row109_col2\" class=\"data row109 col2\" >Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", + " <td id=\"T_0502a_row109_col3\" class=\"data row109 col3\" >False</td>\n", + " <td id=\"T_0502a_row109_col4\" class=\"data row109 col4\" >True</td>\n", + " <td id=\"T_0502a_row109_col5\" class=\"data row109 col5\" >['dataset', 'model']</td>\n", + " <td 
id=\"T_0502a_row109_col6\" class=\"data row109 col6\" >{'average': {'type': 'str', 'default': 'macro'}}</td>\n", + " <td id=\"T_0502a_row109_col7\" class=\"data row109 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row109_col8\" class=\"data row109 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row110_col0\" class=\"data row110 col0\" >validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", + " <td id=\"T_0502a_row110_col1\" class=\"data row110 col1\" >Classifier Threshold Optimization</td>\n", + " <td id=\"T_0502a_row110_col2\" class=\"data row110 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification models....</td>\n", + " <td id=\"T_0502a_row110_col3\" class=\"data row110 col3\" >False</td>\n", + " <td id=\"T_0502a_row110_col4\" class=\"data row110 col4\" >True</td>\n", + " <td id=\"T_0502a_row110_col5\" class=\"data row110 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row110_col6\" class=\"data row110 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row110_col7\" class=\"data row110 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", + " <td id=\"T_0502a_row110_col8\" class=\"data row110 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row111_col0\" class=\"data row111 col0\" >validmind.model_validation.sklearn.ClusterCosineSimilarity</td>\n", + " <td id=\"T_0502a_row111_col1\" class=\"data row111 col1\" >Cluster Cosine Similarity</td>\n", + " <td id=\"T_0502a_row111_col2\" class=\"data row111 col2\" >Measures the intra-cluster similarity of a clustering model using cosine similarity....</td>\n", + " <td id=\"T_0502a_row111_col3\" class=\"data row111 col3\" >False</td>\n", + " <td 
id=\"T_0502a_row111_col4\" class=\"data row111 col4\" >True</td>\n", + " <td id=\"T_0502a_row111_col5\" class=\"data row111 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row111_col6\" class=\"data row111 col6\" >{}</td>\n", + " <td id=\"T_0502a_row111_col7\" class=\"data row111 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_0502a_row111_col8\" class=\"data row111 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row112_col0\" class=\"data row112 col0\" >validmind.model_validation.sklearn.ClusterPerformanceMetrics</td>\n", + " <td id=\"T_0502a_row112_col1\" class=\"data row112 col1\" >Cluster Performance Metrics</td>\n", + " <td id=\"T_0502a_row112_col2\" class=\"data row112 col2\" >Evaluates the performance of clustering machine learning models using multiple established metrics....</td>\n", + " <td id=\"T_0502a_row112_col3\" class=\"data row112 col3\" >False</td>\n", + " <td id=\"T_0502a_row112_col4\" class=\"data row112 col4\" >True</td>\n", + " <td id=\"T_0502a_row112_col5\" class=\"data row112 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row112_col6\" class=\"data row112 col6\" >{}</td>\n", + " <td id=\"T_0502a_row112_col7\" class=\"data row112 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_0502a_row112_col8\" class=\"data row112 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row113_col0\" class=\"data row113 col0\" >validmind.model_validation.sklearn.CompletenessScore</td>\n", + " <td id=\"T_0502a_row113_col1\" class=\"data row113 col1\" >Completeness Score</td>\n", + " <td id=\"T_0502a_row113_col2\" class=\"data row113 col2\" >Evaluates a clustering model's capacity to categorize instances from a single class into the same cluster....</td>\n", + " <td id=\"T_0502a_row113_col3\" class=\"data row113 col3\" >False</td>\n", + " <td id=\"T_0502a_row113_col4\" class=\"data row113 col4\" >True</td>\n", + " <td 
id=\"T_0502a_row113_col5\" class=\"data row113 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row113_col6\" class=\"data row113 col6\" >{}</td>\n", + " <td id=\"T_0502a_row113_col7\" class=\"data row113 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_0502a_row113_col8\" class=\"data row113 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row114_col0\" class=\"data row114 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_0502a_row114_col1\" class=\"data row114 col1\" >Confusion Matrix</td>\n", + " <td id=\"T_0502a_row114_col2\" class=\"data row114 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", + " <td id=\"T_0502a_row114_col3\" class=\"data row114 col3\" >True</td>\n", + " <td id=\"T_0502a_row114_col4\" class=\"data row114 col4\" >False</td>\n", + " <td id=\"T_0502a_row114_col5\" class=\"data row114 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row114_col6\" class=\"data row114 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_0502a_row114_col7\" class=\"data row114 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row114_col8\" class=\"data row114 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row115_col0\" class=\"data row115 col0\" >validmind.model_validation.sklearn.FeatureImportance</td>\n", + " <td id=\"T_0502a_row115_col1\" class=\"data row115 col1\" >Feature Importance</td>\n", + " <td id=\"T_0502a_row115_col2\" class=\"data row115 col2\" >Compute feature importance scores for a given model and generate a summary table...</td>\n", + " <td id=\"T_0502a_row115_col3\" class=\"data row115 col3\" >False</td>\n", + " <td id=\"T_0502a_row115_col4\" class=\"data row115 col4\" 
>True</td>\n", + " <td id=\"T_0502a_row115_col5\" class=\"data row115 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row115_col6\" class=\"data row115 col6\" >{'num_features': {'type': 'int', 'default': 3}}</td>\n", + " <td id=\"T_0502a_row115_col7\" class=\"data row115 col7\" >['model_explainability', 'sklearn']</td>\n", + " <td id=\"T_0502a_row115_col8\" class=\"data row115 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row116_col0\" class=\"data row116 col0\" >validmind.model_validation.sklearn.FowlkesMallowsScore</td>\n", + " <td id=\"T_0502a_row116_col1\" class=\"data row116 col1\" >Fowlkes Mallows Score</td>\n", + " <td id=\"T_0502a_row116_col2\" class=\"data row116 col2\" >Evaluates the similarity between predicted and actual cluster assignments in a model using the Fowlkes-Mallows...</td>\n", + " <td id=\"T_0502a_row116_col3\" class=\"data row116 col3\" >False</td>\n", + " <td id=\"T_0502a_row116_col4\" class=\"data row116 col4\" >True</td>\n", + " <td id=\"T_0502a_row116_col5\" class=\"data row116 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row116_col6\" class=\"data row116 col6\" >{}</td>\n", + " <td id=\"T_0502a_row116_col7\" class=\"data row116 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row116_col8\" class=\"data row116 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row117_col0\" class=\"data row117 col0\" >validmind.model_validation.sklearn.HomogeneityScore</td>\n", + " <td id=\"T_0502a_row117_col1\" class=\"data row117 col1\" >Homogeneity Score</td>\n", + " <td id=\"T_0502a_row117_col2\" class=\"data row117 col2\" >Assesses clustering homogeneity by comparing true and predicted labels, scoring from 0 (heterogeneous) to 1...</td>\n", + " <td id=\"T_0502a_row117_col3\" class=\"data row117 col3\" >False</td>\n", + " <td id=\"T_0502a_row117_col4\" class=\"data row117 col4\" >True</td>\n", + " <td 
id=\"T_0502a_row117_col5\" class=\"data row117 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row117_col6\" class=\"data row117 col6\" >{}</td>\n", + " <td id=\"T_0502a_row117_col7\" class=\"data row117 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row117_col8\" class=\"data row117 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row118_col0\" class=\"data row118 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", + " <td id=\"T_0502a_row118_col1\" class=\"data row118 col1\" >Hyper Parameters Tuning</td>\n", + " <td id=\"T_0502a_row118_col2\" class=\"data row118 col2\" >Performs exhaustive grid search over specified parameter ranges to find optimal model configurations...</td>\n", + " <td id=\"T_0502a_row118_col3\" class=\"data row118 col3\" >False</td>\n", + " <td id=\"T_0502a_row118_col4\" class=\"data row118 col4\" >True</td>\n", + " <td id=\"T_0502a_row118_col5\" class=\"data row118 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row118_col6\" class=\"data row118 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", + " <td id=\"T_0502a_row118_col7\" class=\"data row118 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row118_col8\" class=\"data row118 col8\" >['clustering', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row119_col0\" class=\"data row119 col0\" >validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", + " <td id=\"T_0502a_row119_col1\" class=\"data row119 col1\" >K Means Clusters Optimization</td>\n", + " <td id=\"T_0502a_row119_col2\" class=\"data row119 col2\" >Optimizes the number of clusters in K-means models using Elbow and Silhouette methods....</td>\n", + " <td id=\"T_0502a_row119_col3\" class=\"data row119 col3\" 
>True</td>\n", + " <td id=\"T_0502a_row119_col4\" class=\"data row119 col4\" >False</td>\n", + " <td id=\"T_0502a_row119_col5\" class=\"data row119 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row119_col6\" class=\"data row119 col6\" >{'n_clusters': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row119_col7\" class=\"data row119 col7\" >['sklearn', 'model_performance', 'kmeans']</td>\n", + " <td id=\"T_0502a_row119_col8\" class=\"data row119 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row120_col0\" class=\"data row120 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", + " <td id=\"T_0502a_row120_col1\" class=\"data row120 col1\" >Minimum Accuracy</td>\n", + " <td id=\"T_0502a_row120_col2\" class=\"data row120 col2\" >Checks if the model's prediction accuracy meets or surpasses a specified threshold....</td>\n", + " <td id=\"T_0502a_row120_col3\" class=\"data row120 col3\" >False</td>\n", + " <td id=\"T_0502a_row120_col4\" class=\"data row120 col4\" >True</td>\n", + " <td id=\"T_0502a_row120_col5\" class=\"data row120 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row120_col6\" class=\"data row120 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_0502a_row120_col7\" class=\"data row120 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row120_col8\" class=\"data row120 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row121_col0\" class=\"data row121 col0\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", + " <td id=\"T_0502a_row121_col1\" class=\"data row121 col1\" >Minimum F1 Score</td>\n", + " <td id=\"T_0502a_row121_col2\" class=\"data row121 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", + " <td 
id=\"T_0502a_row121_col3\" class=\"data row121 col3\" >False</td>\n", + " <td id=\"T_0502a_row121_col4\" class=\"data row121 col4\" >True</td>\n", + " <td id=\"T_0502a_row121_col5\" class=\"data row121 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row121_col6\" class=\"data row121 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_0502a_row121_col7\" class=\"data row121 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row121_col8\" class=\"data row121 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row122_col0\" class=\"data row122 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", + " <td id=\"T_0502a_row122_col1\" class=\"data row122 col1\" >Minimum ROCAUC Score</td>\n", + " <td id=\"T_0502a_row122_col2\" class=\"data row122 col2\" >Validates model by checking if the ROC AUC score meets or surpasses a specified threshold....</td>\n", + " <td id=\"T_0502a_row122_col3\" class=\"data row122 col3\" >False</td>\n", + " <td id=\"T_0502a_row122_col4\" class=\"data row122 col4\" >True</td>\n", + " <td id=\"T_0502a_row122_col5\" class=\"data row122 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row122_col6\" class=\"data row122 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_0502a_row122_col7\" class=\"data row122 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row122_col8\" class=\"data row122 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row123_col0\" class=\"data row123 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", + " <td id=\"T_0502a_row123_col1\" class=\"data row123 col1\" >Model Parameters</td>\n", + " <td id=\"T_0502a_row123_col2\" class=\"data row123 col2\" 
>Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", + " <td id=\"T_0502a_row123_col3\" class=\"data row123 col3\" >False</td>\n", + " <td id=\"T_0502a_row123_col4\" class=\"data row123 col4\" >True</td>\n", + " <td id=\"T_0502a_row123_col5\" class=\"data row123 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row123_col6\" class=\"data row123 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row123_col7\" class=\"data row123 col7\" >['model_training', 'metadata']</td>\n", + " <td id=\"T_0502a_row123_col8\" class=\"data row123 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row124_col0\" class=\"data row124 col0\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", + " <td id=\"T_0502a_row124_col1\" class=\"data row124 col1\" >Models Performance Comparison</td>\n", + " <td id=\"T_0502a_row124_col2\" class=\"data row124 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", + " <td id=\"T_0502a_row124_col3\" class=\"data row124 col3\" >False</td>\n", + " <td id=\"T_0502a_row124_col4\" class=\"data row124 col4\" >True</td>\n", + " <td id=\"T_0502a_row124_col5\" class=\"data row124 col5\" >['dataset', 'models']</td>\n", + " <td id=\"T_0502a_row124_col6\" class=\"data row124 col6\" >{}</td>\n", + " <td id=\"T_0502a_row124_col7\" class=\"data row124 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", + " <td id=\"T_0502a_row124_col8\" class=\"data row124 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row125_col0\" class=\"data row125 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", + " <td id=\"T_0502a_row125_col1\" class=\"data row125 col1\" >Overfit Diagnosis</td>\n", + " <td 
id=\"T_0502a_row125_col2\" class=\"data row125 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", + " <td id=\"T_0502a_row125_col3\" class=\"data row125 col3\" >True</td>\n", + " <td id=\"T_0502a_row125_col4\" class=\"data row125 col4\" >True</td>\n", + " <td id=\"T_0502a_row125_col5\" class=\"data row125 col5\" >['model', 'datasets']</td>\n", + " <td id=\"T_0502a_row125_col6\" class=\"data row125 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': {'type': 'float', 'default': 0.04}}</td>\n", + " <td id=\"T_0502a_row125_col7\" class=\"data row125 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", + " <td id=\"T_0502a_row125_col8\" class=\"data row125 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row126_col0\" class=\"data row126 col0\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", + " <td id=\"T_0502a_row126_col1\" class=\"data row126 col1\" >Permutation Feature Importance</td>\n", + " <td id=\"T_0502a_row126_col2\" class=\"data row126 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", + " <td id=\"T_0502a_row126_col3\" class=\"data row126 col3\" >True</td>\n", + " <td id=\"T_0502a_row126_col4\" class=\"data row126 col4\" >False</td>\n", + " <td id=\"T_0502a_row126_col5\" class=\"data row126 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row126_col6\" class=\"data row126 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row126_col7\" class=\"data row126 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_0502a_row126_col8\" class=\"data row126 col8\" 
>['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row127_col0\" class=\"data row127 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", + " <td id=\"T_0502a_row127_col1\" class=\"data row127 col1\" >Population Stability Index</td>\n", + " <td id=\"T_0502a_row127_col2\" class=\"data row127 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's predictions across...</td>\n", + " <td id=\"T_0502a_row127_col3\" class=\"data row127 col3\" >True</td>\n", + " <td id=\"T_0502a_row127_col4\" class=\"data row127 col4\" >True</td>\n", + " <td id=\"T_0502a_row127_col5\" class=\"data row127 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row127_col6\" class=\"data row127 col6\" >{'num_bins': {'type': 'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", + " <td id=\"T_0502a_row127_col7\" class=\"data row127 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row127_col8\" class=\"data row127 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row128_col0\" class=\"data row128 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_0502a_row128_col1\" class=\"data row128 col1\" >Precision Recall Curve</td>\n", + " <td id=\"T_0502a_row128_col2\" class=\"data row128 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", + " <td id=\"T_0502a_row128_col3\" class=\"data row128 col3\" >True</td>\n", + " <td id=\"T_0502a_row128_col4\" class=\"data row128 col4\" >False</td>\n", + " <td id=\"T_0502a_row128_col5\" class=\"data row128 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row128_col6\" class=\"data row128 col6\" >{}</td>\n", + " <td id=\"T_0502a_row128_col7\" class=\"data row128 
col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row128_col8\" class=\"data row128 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row129_col0\" class=\"data row129 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", + " <td id=\"T_0502a_row129_col1\" class=\"data row129 col1\" >ROC Curve</td>\n", + " <td id=\"T_0502a_row129_col2\" class=\"data row129 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", + " <td id=\"T_0502a_row129_col3\" class=\"data row129 col3\" >True</td>\n", + " <td id=\"T_0502a_row129_col4\" class=\"data row129 col4\" >False</td>\n", + " <td id=\"T_0502a_row129_col5\" class=\"data row129 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row129_col6\" class=\"data row129 col6\" >{}</td>\n", + " <td id=\"T_0502a_row129_col7\" class=\"data row129 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row129_col8\" class=\"data row129 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row130_col0\" class=\"data row130 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", + " <td id=\"T_0502a_row130_col1\" class=\"data row130 col1\" >Regression Errors</td>\n", + " <td id=\"T_0502a_row130_col2\" class=\"data row130 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", + " <td id=\"T_0502a_row130_col3\" class=\"data row130 col3\" >False</td>\n", + " <td id=\"T_0502a_row130_col4\" class=\"data row130 col4\" >True</td>\n", + " <td id=\"T_0502a_row130_col5\" class=\"data row130 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row130_col6\" class=\"data row130 col6\" >{}</td>\n", + " <td 
id=\"T_0502a_row130_col7\" class=\"data row130 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row130_col8\" class=\"data row130 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row131_col0\" class=\"data row131 col0\" >validmind.model_validation.sklearn.RegressionErrorsComparison</td>\n", + " <td id=\"T_0502a_row131_col1\" class=\"data row131 col1\" >Regression Errors Comparison</td>\n", + " <td id=\"T_0502a_row131_col2\" class=\"data row131 col2\" >Assesses multiple regression error metrics to compare model performance across different datasets, emphasizing...</td>\n", + " <td id=\"T_0502a_row131_col3\" class=\"data row131 col3\" >False</td>\n", + " <td id=\"T_0502a_row131_col4\" class=\"data row131 col4\" >True</td>\n", + " <td id=\"T_0502a_row131_col5\" class=\"data row131 col5\" >['datasets', 'models']</td>\n", + " <td id=\"T_0502a_row131_col6\" class=\"data row131 col6\" >{}</td>\n", + " <td id=\"T_0502a_row131_col7\" class=\"data row131 col7\" >['model_performance', 'sklearn']</td>\n", + " <td id=\"T_0502a_row131_col8\" class=\"data row131 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row132_col0\" class=\"data row132 col0\" >validmind.model_validation.sklearn.RegressionPerformance</td>\n", + " <td id=\"T_0502a_row132_col1\" class=\"data row132 col1\" >Regression Performance</td>\n", + " <td id=\"T_0502a_row132_col2\" class=\"data row132 col2\" >Evaluates the performance of a regression model using five different metrics: MAE, MSE, RMSE, MAPE, and MBD....</td>\n", + " <td id=\"T_0502a_row132_col3\" class=\"data row132 col3\" >False</td>\n", + " <td id=\"T_0502a_row132_col4\" class=\"data row132 col4\" >True</td>\n", + " <td id=\"T_0502a_row132_col5\" class=\"data row132 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row132_col6\" class=\"data row132 col6\" >{}</td>\n", + " <td id=\"T_0502a_row132_col7\" class=\"data 
row132 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row132_col8\" class=\"data row132 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row133_col0\" class=\"data row133 col0\" >validmind.model_validation.sklearn.RegressionR2Square</td>\n", + " <td id=\"T_0502a_row133_col1\" class=\"data row133 col1\" >Regression R2 Square</td>\n", + " <td id=\"T_0502a_row133_col2\" class=\"data row133 col2\" >Assesses the overall goodness-of-fit of a regression model by evaluating R-squared (R2) and Adjusted R-squared (Adj...</td>\n", + " <td id=\"T_0502a_row133_col3\" class=\"data row133 col3\" >False</td>\n", + " <td id=\"T_0502a_row133_col4\" class=\"data row133 col4\" >True</td>\n", + " <td id=\"T_0502a_row133_col5\" class=\"data row133 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row133_col6\" class=\"data row133 col6\" >{}</td>\n", + " <td id=\"T_0502a_row133_col7\" class=\"data row133 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row133_col8\" class=\"data row133 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row134_col0\" class=\"data row134 col0\" >validmind.model_validation.sklearn.RegressionR2SquareComparison</td>\n", + " <td id=\"T_0502a_row134_col1\" class=\"data row134 col1\" >Regression R2 Square Comparison</td>\n", + " <td id=\"T_0502a_row134_col2\" class=\"data row134 col2\" >Compares R-Squared and Adjusted R-Squared values for different regression models across multiple datasets to assess...</td>\n", + " <td id=\"T_0502a_row134_col3\" class=\"data row134 col3\" >False</td>\n", + " <td id=\"T_0502a_row134_col4\" class=\"data row134 col4\" >True</td>\n", + " <td id=\"T_0502a_row134_col5\" class=\"data row134 col5\" >['datasets', 'models']</td>\n", + " <td id=\"T_0502a_row134_col6\" class=\"data row134 col6\" >{}</td>\n", + " <td id=\"T_0502a_row134_col7\" class=\"data row134 col7\" >['model_performance', 'sklearn']</td>\n", + " <td 
id=\"T_0502a_row134_col8\" class=\"data row134 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row135_col0\" class=\"data row135 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " <td id=\"T_0502a_row135_col1\" class=\"data row135 col1\" >Robustness Diagnosis</td>\n", + " <td id=\"T_0502a_row135_col2\" class=\"data row135 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", + " <td id=\"T_0502a_row135_col3\" class=\"data row135 col3\" >True</td>\n", + " <td id=\"T_0502a_row135_col4\" class=\"data row135 col4\" >True</td>\n", + " <td id=\"T_0502a_row135_col5\" class=\"data row135 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row135_col6\" class=\"data row135 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_0502a_row135_col7\" class=\"data row135 col7\" >['sklearn', 'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_0502a_row135_col8\" class=\"data row135 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row136_col0\" class=\"data row136 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", + " <td id=\"T_0502a_row136_col1\" class=\"data row136 col1\" >SHAP Global Importance</td>\n", + " <td id=\"T_0502a_row136_col2\" class=\"data row136 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", + " <td id=\"T_0502a_row136_col3\" class=\"data row136 col3\" >False</td>\n", + " <td id=\"T_0502a_row136_col4\" class=\"data row136 col4\" >True</td>\n", + " <td id=\"T_0502a_row136_col5\" class=\"data row136 col5\" >['model', 'dataset']</td>\n", + " <td 
id=\"T_0502a_row136_col6\" class=\"data row136 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row136_col7\" class=\"data row136 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_0502a_row136_col8\" class=\"data row136 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row137_col0\" class=\"data row137 col0\" >validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", + " <td id=\"T_0502a_row137_col1\" class=\"data row137 col1\" >Score Probability Alignment</td>\n", + " <td id=\"T_0502a_row137_col2\" class=\"data row137 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", + " <td id=\"T_0502a_row137_col3\" class=\"data row137 col3\" >True</td>\n", + " <td id=\"T_0502a_row137_col4\" class=\"data row137 col4\" >True</td>\n", + " <td id=\"T_0502a_row137_col5\" class=\"data row137 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row137_col6\" class=\"data row137 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_0502a_row137_col7\" class=\"data row137 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", + " <td id=\"T_0502a_row137_col8\" class=\"data row137 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row138_col0\" class=\"data row138 col0\" >validmind.model_validation.sklearn.SilhouettePlot</td>\n", + " <td id=\"T_0502a_row138_col1\" class=\"data row138 col1\" >Silhouette Plot</td>\n", + " <td id=\"T_0502a_row138_col2\" class=\"data row138 col2\" >Calculates and visualizes Silhouette Score, assessing the degree of data point suitability to its cluster in ML...</td>\n", + 
" <td id=\"T_0502a_row138_col3\" class=\"data row138 col3\" >True</td>\n", + " <td id=\"T_0502a_row138_col4\" class=\"data row138 col4\" >True</td>\n", + " <td id=\"T_0502a_row138_col5\" class=\"data row138 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row138_col6\" class=\"data row138 col6\" >{}</td>\n", + " <td id=\"T_0502a_row138_col7\" class=\"data row138 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row138_col8\" class=\"data row138 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row139_col0\" class=\"data row139 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_0502a_row139_col1\" class=\"data row139 col1\" >Training Test Degradation</td>\n", + " <td id=\"T_0502a_row139_col2\" class=\"data row139 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", + " <td id=\"T_0502a_row139_col3\" class=\"data row139 col3\" >False</td>\n", + " <td id=\"T_0502a_row139_col4\" class=\"data row139 col4\" >True</td>\n", + " <td id=\"T_0502a_row139_col5\" class=\"data row139 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row139_col6\" class=\"data row139 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_0502a_row139_col7\" class=\"data row139 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row139_col8\" class=\"data row139 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row140_col0\" class=\"data row140 col0\" >validmind.model_validation.sklearn.VMeasure</td>\n", + " <td id=\"T_0502a_row140_col1\" class=\"data row140 col1\" >V Measure</td>\n", + " <td id=\"T_0502a_row140_col2\" class=\"data row140 col2\" >Evaluates homogeneity and completeness of a clustering model using the V Measure 
Score....</td>\n", + " <td id=\"T_0502a_row140_col3\" class=\"data row140 col3\" >False</td>\n", + " <td id=\"T_0502a_row140_col4\" class=\"data row140 col4\" >True</td>\n", + " <td id=\"T_0502a_row140_col5\" class=\"data row140 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row140_col6\" class=\"data row140 col6\" >{}</td>\n", + " <td id=\"T_0502a_row140_col7\" class=\"data row140 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row140_col8\" class=\"data row140 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row141_col0\" class=\"data row141 col0\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", + " <td id=\"T_0502a_row141_col1\" class=\"data row141 col1\" >Weakspots Diagnosis</td>\n", + " <td id=\"T_0502a_row141_col2\" class=\"data row141 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", + " <td id=\"T_0502a_row141_col3\" class=\"data row141 col3\" >True</td>\n", + " <td id=\"T_0502a_row141_col4\" class=\"data row141 col4\" >True</td>\n", + " <td id=\"T_0502a_row141_col5\" class=\"data row141 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row141_col6\" class=\"data row141 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row141_col7\" class=\"data row141 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_0502a_row141_col8\" class=\"data row141 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row142_col0\" class=\"data row142 col0\" >validmind.model_validation.statsmodels.AutoARIMA</td>\n", + " <td id=\"T_0502a_row142_col1\" class=\"data row142 col1\" >Auto ARIMA</td>\n", + " <td id=\"T_0502a_row142_col2\" class=\"data 
row142 col2\" >Evaluates ARIMA models for time-series forecasting, ranking them using Bayesian and Akaike Information Criteria....</td>\n", + " <td id=\"T_0502a_row142_col3\" class=\"data row142 col3\" >False</td>\n", + " <td id=\"T_0502a_row142_col4\" class=\"data row142 col4\" >True</td>\n", + " <td id=\"T_0502a_row142_col5\" class=\"data row142 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row142_col6\" class=\"data row142 col6\" >{}</td>\n", + " <td id=\"T_0502a_row142_col7\" class=\"data row142 col7\" >['time_series_data', 'forecasting', 'model_selection', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row142_col8\" class=\"data row142 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row143_col0\" class=\"data row143 col0\" >validmind.model_validation.statsmodels.CumulativePredictionProbabilities</td>\n", + " <td id=\"T_0502a_row143_col1\" class=\"data row143 col1\" >Cumulative Prediction Probabilities</td>\n", + " <td id=\"T_0502a_row143_col2\" class=\"data row143 col2\" >Visualizes cumulative probabilities of positive and negative classes for both training and testing in classification models....</td>\n", + " <td id=\"T_0502a_row143_col3\" class=\"data row143 col3\" >True</td>\n", + " <td id=\"T_0502a_row143_col4\" class=\"data row143 col4\" >False</td>\n", + " <td id=\"T_0502a_row143_col5\" class=\"data row143 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row143_col6\" class=\"data row143 col6\" >{'title': {'type': 'str', 'default': 'Cumulative Probabilities'}}</td>\n", + " <td id=\"T_0502a_row143_col7\" class=\"data row143 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_0502a_row143_col8\" class=\"data row143 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row144_col0\" class=\"data row144 col0\" >validmind.model_validation.statsmodels.DurbinWatsonTest</td>\n", + " <td id=\"T_0502a_row144_col1\" class=\"data row144 col1\" >Durbin Watson Test</td>\n", 
+ " <td id=\"T_0502a_row144_col2\" class=\"data row144 col2\" >Assesses autocorrelation in time series data features using the Durbin-Watson statistic....</td>\n", + " <td id=\"T_0502a_row144_col3\" class=\"data row144 col3\" >False</td>\n", + " <td id=\"T_0502a_row144_col4\" class=\"data row144 col4\" >True</td>\n", + " <td id=\"T_0502a_row144_col5\" class=\"data row144 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row144_col6\" class=\"data row144 col6\" >{'threshold': {'type': None, 'default': [1.5, 2.5]}}</td>\n", + " <td id=\"T_0502a_row144_col7\" class=\"data row144 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row144_col8\" class=\"data row144 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row145_col0\" class=\"data row145 col0\" >validmind.model_validation.statsmodels.GINITable</td>\n", + " <td id=\"T_0502a_row145_col1\" class=\"data row145 col1\" >GINI Table</td>\n", + " <td id=\"T_0502a_row145_col2\" class=\"data row145 col2\" >Evaluates classification model performance using AUC, GINI, and KS metrics for training and test datasets....</td>\n", + " <td id=\"T_0502a_row145_col3\" class=\"data row145 col3\" >False</td>\n", + " <td id=\"T_0502a_row145_col4\" class=\"data row145 col4\" >True</td>\n", + " <td id=\"T_0502a_row145_col5\" class=\"data row145 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row145_col6\" class=\"data row145 col6\" >{}</td>\n", + " <td id=\"T_0502a_row145_col7\" class=\"data row145 col7\" >['model_performance']</td>\n", + " <td id=\"T_0502a_row145_col8\" class=\"data row145 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row146_col0\" class=\"data row146 col0\" >validmind.model_validation.statsmodels.KolmogorovSmirnov</td>\n", + " <td id=\"T_0502a_row146_col1\" class=\"data row146 col1\" >Kolmogorov Smirnov</td>\n", + " <td id=\"T_0502a_row146_col2\" class=\"data row146 col2\" 
>Assesses whether each feature in the dataset aligns with a normal distribution using the Kolmogorov-Smirnov test....</td>\n", + " <td id=\"T_0502a_row146_col3\" class=\"data row146 col3\" >False</td>\n", + " <td id=\"T_0502a_row146_col4\" class=\"data row146 col4\" >True</td>\n", + " <td id=\"T_0502a_row146_col5\" class=\"data row146 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row146_col6\" class=\"data row146 col6\" >{'dist': {'type': 'str', 'default': 'norm'}}</td>\n", + " <td id=\"T_0502a_row146_col7\" class=\"data row146 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row146_col8\" class=\"data row146 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row147_col0\" class=\"data row147 col0\" >validmind.model_validation.statsmodels.Lilliefors</td>\n", + " <td id=\"T_0502a_row147_col1\" class=\"data row147 col1\" >Lilliefors</td>\n", + " <td id=\"T_0502a_row147_col2\" class=\"data row147 col2\" >Assesses the normality of feature distributions in an ML model's training dataset using the Lilliefors test....</td>\n", + " <td id=\"T_0502a_row147_col3\" class=\"data row147 col3\" >False</td>\n", + " <td id=\"T_0502a_row147_col4\" class=\"data row147 col4\" >True</td>\n", + " <td id=\"T_0502a_row147_col5\" class=\"data row147 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row147_col6\" class=\"data row147 col6\" >{}</td>\n", + " <td id=\"T_0502a_row147_col7\" class=\"data row147 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row147_col8\" class=\"data row147 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row148_col0\" class=\"data row148 col0\" >validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram</td>\n", + " <td id=\"T_0502a_row148_col1\" class=\"data row148 col1\" >Prediction Probabilities Histogram</td>\n", 
+ " <td id=\"T_0502a_row148_col2\" class=\"data row148 col2\" >Assesses the predictive probability distribution for binary classification to evaluate model performance and...</td>\n", + " <td id=\"T_0502a_row148_col3\" class=\"data row148 col3\" >True</td>\n", + " <td id=\"T_0502a_row148_col4\" class=\"data row148 col4\" >False</td>\n", + " <td id=\"T_0502a_row148_col5\" class=\"data row148 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row148_col6\" class=\"data row148 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Predictive Probabilities'}}</td>\n", + " <td id=\"T_0502a_row148_col7\" class=\"data row148 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_0502a_row148_col8\" class=\"data row148 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row149_col0\" class=\"data row149 col0\" >validmind.model_validation.statsmodels.RegressionCoeffs</td>\n", + " <td id=\"T_0502a_row149_col1\" class=\"data row149 col1\" >Regression Coeffs</td>\n", + " <td id=\"T_0502a_row149_col2\" class=\"data row149 col2\" >Assesses the significance and uncertainty of predictor variables in a regression model through visualization of...</td>\n", + " <td id=\"T_0502a_row149_col3\" class=\"data row149 col3\" >True</td>\n", + " <td id=\"T_0502a_row149_col4\" class=\"data row149 col4\" >True</td>\n", + " <td id=\"T_0502a_row149_col5\" class=\"data row149 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row149_col6\" class=\"data row149 col6\" >{}</td>\n", + " <td id=\"T_0502a_row149_col7\" class=\"data row149 col7\" >['tabular_data', 'visualization', 'model_training']</td>\n", + " <td id=\"T_0502a_row149_col8\" class=\"data row149 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row150_col0\" class=\"data row150 col0\" >validmind.model_validation.statsmodels.RegressionFeatureSignificance</td>\n", + " <td id=\"T_0502a_row150_col1\" class=\"data row150 col1\" >Regression Feature 
Significance</td>\n", + " <td id=\"T_0502a_row150_col2\" class=\"data row150 col2\" >Assesses and visualizes the statistical significance of features in a regression model....</td>\n", + " <td id=\"T_0502a_row150_col3\" class=\"data row150 col3\" >True</td>\n", + " <td id=\"T_0502a_row150_col4\" class=\"data row150 col4\" >False</td>\n", + " <td id=\"T_0502a_row150_col5\" class=\"data row150 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row150_col6\" class=\"data row150 col6\" >{'fontsize': {'type': 'int', 'default': 10}, 'p_threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_0502a_row150_col7\" class=\"data row150 col7\" >['statistical_test', 'model_interpretation', 'visualization', 'feature_importance']</td>\n", + " <td id=\"T_0502a_row150_col8\" class=\"data row150 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row151_col0\" class=\"data row151 col0\" >validmind.model_validation.statsmodels.RegressionModelForecastPlot</td>\n", + " <td id=\"T_0502a_row151_col1\" class=\"data row151 col1\" >Regression Model Forecast Plot</td>\n", + " <td id=\"T_0502a_row151_col2\" class=\"data row151 col2\" >Generates plots to visually compare the forecasted outcomes of a regression model against actual observed values over...</td>\n", + " <td id=\"T_0502a_row151_col3\" class=\"data row151 col3\" >True</td>\n", + " <td id=\"T_0502a_row151_col4\" class=\"data row151 col4\" >False</td>\n", + " <td id=\"T_0502a_row151_col5\" class=\"data row151 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row151_col6\" class=\"data row151 col6\" >{'start_date': {'type': None, 'default': None}, 'end_date': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row151_col7\" class=\"data row151 col7\" >['time_series_data', 'forecasting', 'visualization']</td>\n", + " <td id=\"T_0502a_row151_col8\" class=\"data row151 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row152_col0\" class=\"data 
row152 col0\" >validmind.model_validation.statsmodels.RegressionModelForecastPlotLevels</td>\n", + " <td id=\"T_0502a_row152_col1\" class=\"data row152 col1\" >Regression Model Forecast Plot Levels</td>\n", + " <td id=\"T_0502a_row152_col2\" class=\"data row152 col2\" >Assesses the alignment between forecasted and observed values in regression models through visual plots...</td>\n", + " <td id=\"T_0502a_row152_col3\" class=\"data row152 col3\" >True</td>\n", + " <td id=\"T_0502a_row152_col4\" class=\"data row152 col4\" >False</td>\n", + " <td id=\"T_0502a_row152_col5\" class=\"data row152 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row152_col6\" class=\"data row152 col6\" >{}</td>\n", + " <td id=\"T_0502a_row152_col7\" class=\"data row152 col7\" >['time_series_data', 'forecasting', 'visualization']</td>\n", + " <td id=\"T_0502a_row152_col8\" class=\"data row152 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row153_col0\" class=\"data row153 col0\" >validmind.model_validation.statsmodels.RegressionModelSensitivityPlot</td>\n", + " <td id=\"T_0502a_row153_col1\" class=\"data row153 col1\" >Regression Model Sensitivity Plot</td>\n", + " <td id=\"T_0502a_row153_col2\" class=\"data row153 col2\" >Assesses the sensitivity of a regression model to changes in independent variables by applying shocks and...</td>\n", + " <td id=\"T_0502a_row153_col3\" class=\"data row153 col3\" >True</td>\n", + " <td id=\"T_0502a_row153_col4\" class=\"data row153 col4\" >False</td>\n", + " <td id=\"T_0502a_row153_col5\" class=\"data row153 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row153_col6\" class=\"data row153 col6\" >{'shocks': {'type': None, 'default': [0.1]}, 'transformation': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row153_col7\" class=\"data row153 col7\" >['senstivity_analysis', 'visualization']</td>\n", + " <td id=\"T_0502a_row153_col8\" class=\"data row153 col8\" >['regression']</td>\n", + " 
</tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row154_col0\" class=\"data row154 col0\" >validmind.model_validation.statsmodels.RegressionModelSummary</td>\n", + " <td id=\"T_0502a_row154_col1\" class=\"data row154 col1\" >Regression Model Summary</td>\n", + " <td id=\"T_0502a_row154_col2\" class=\"data row154 col2\" >Evaluates regression model performance using metrics including R-Squared, Adjusted R-Squared, MSE, and RMSE....</td>\n", + " <td id=\"T_0502a_row154_col3\" class=\"data row154 col3\" >False</td>\n", + " <td id=\"T_0502a_row154_col4\" class=\"data row154 col4\" >True</td>\n", + " <td id=\"T_0502a_row154_col5\" class=\"data row154 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row154_col6\" class=\"data row154 col6\" >{}</td>\n", + " <td id=\"T_0502a_row154_col7\" class=\"data row154 col7\" >['model_performance', 'regression']</td>\n", + " <td id=\"T_0502a_row154_col8\" class=\"data row154 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row155_col0\" class=\"data row155 col0\" >validmind.model_validation.statsmodels.RegressionPermutationFeatureImportance</td>\n", + " <td id=\"T_0502a_row155_col1\" class=\"data row155 col1\" >Regression Permutation Feature Importance</td>\n", + " <td id=\"T_0502a_row155_col2\" class=\"data row155 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", + " <td id=\"T_0502a_row155_col3\" class=\"data row155 col3\" >True</td>\n", + " <td id=\"T_0502a_row155_col4\" class=\"data row155 col4\" >False</td>\n", + " <td id=\"T_0502a_row155_col5\" class=\"data row155 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row155_col6\" class=\"data row155 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", + " <td id=\"T_0502a_row155_col7\" class=\"data row155 col7\" >['statsmodels', 'feature_importance', 'visualization']</td>\n", + " <td 
id=\"T_0502a_row155_col8\" class=\"data row155 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row156_col0\" class=\"data row156 col0\" >validmind.model_validation.statsmodels.ScorecardHistogram</td>\n", + " <td id=\"T_0502a_row156_col1\" class=\"data row156 col1\" >Scorecard Histogram</td>\n", + " <td id=\"T_0502a_row156_col2\" class=\"data row156 col2\" >The Scorecard Histogram test evaluates the distribution of credit scores between default and non-default instances,...</td>\n", + " <td id=\"T_0502a_row156_col3\" class=\"data row156 col3\" >True</td>\n", + " <td id=\"T_0502a_row156_col4\" class=\"data row156 col4\" >False</td>\n", + " <td id=\"T_0502a_row156_col5\" class=\"data row156 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row156_col6\" class=\"data row156 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Scores'}, 'score_column': {'type': 'str', 'default': 'score'}}</td>\n", + " <td id=\"T_0502a_row156_col7\" class=\"data row156 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", + " <td id=\"T_0502a_row156_col8\" class=\"data row156 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row157_col0\" class=\"data row157 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", + " <td id=\"T_0502a_row157_col1\" class=\"data row157 col1\" >Calibration Curve Drift</td>\n", + " <td id=\"T_0502a_row157_col2\" class=\"data row157 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row157_col3\" class=\"data row157 col3\" >True</td>\n", + " <td id=\"T_0502a_row157_col4\" class=\"data row157 col4\" >True</td>\n", + " <td id=\"T_0502a_row157_col5\" class=\"data row157 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row157_col6\" class=\"data row157 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td 
id=\"T_0502a_row157_col7\" class=\"data row157 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row157_col8\" class=\"data row157 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row158_col0\" class=\"data row158 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", + " <td id=\"T_0502a_row158_col1\" class=\"data row158 col1\" >Class Discrimination Drift</td>\n", + " <td id=\"T_0502a_row158_col2\" class=\"data row158 col2\" >Compares classification discrimination metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row158_col3\" class=\"data row158 col3\" >False</td>\n", + " <td id=\"T_0502a_row158_col4\" class=\"data row158 col4\" >True</td>\n", + " <td id=\"T_0502a_row158_col5\" class=\"data row158 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row158_col6\" class=\"data row158 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_0502a_row158_col7\" class=\"data row158 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row158_col8\" class=\"data row158 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row159_col0\" class=\"data row159 col0\" >validmind.ongoing_monitoring.ClassImbalanceDrift</td>\n", + " <td id=\"T_0502a_row159_col1\" class=\"data row159 col1\" >Class Imbalance Drift</td>\n", + " <td id=\"T_0502a_row159_col2\" class=\"data row159 col2\" >Evaluates drift in class distribution between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row159_col3\" class=\"data row159 col3\" >True</td>\n", + " <td id=\"T_0502a_row159_col4\" class=\"data row159 col4\" >True</td>\n", + " <td id=\"T_0502a_row159_col5\" class=\"data row159 col5\" >['datasets']</td>\n", + " <td id=\"T_0502a_row159_col6\" 
class=\"data row159 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 5.0}, 'title': {'type': 'str', 'default': 'Class Distribution Drift'}}</td>\n", + " <td id=\"T_0502a_row159_col7\" class=\"data row159 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification']</td>\n", + " <td id=\"T_0502a_row159_col8\" class=\"data row159 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row160_col0\" class=\"data row160 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", + " <td id=\"T_0502a_row160_col1\" class=\"data row160 col1\" >Classification Accuracy Drift</td>\n", + " <td id=\"T_0502a_row160_col2\" class=\"data row160 col2\" >Compares classification accuracy metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row160_col3\" class=\"data row160 col3\" >False</td>\n", + " <td id=\"T_0502a_row160_col4\" class=\"data row160 col4\" >True</td>\n", + " <td id=\"T_0502a_row160_col5\" class=\"data row160 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row160_col6\" class=\"data row160 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_0502a_row160_col7\" class=\"data row160 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row160_col8\" class=\"data row160 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row161_col0\" class=\"data row161 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", + " <td id=\"T_0502a_row161_col1\" class=\"data row161 col1\" >Confusion Matrix Drift</td>\n", + " <td id=\"T_0502a_row161_col2\" class=\"data row161 col2\" >Compares confusion matrix metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row161_col3\" class=\"data row161 col3\" >False</td>\n", + " <td id=\"T_0502a_row161_col4\" class=\"data row161 
col4\" >True</td>\n", + " <td id=\"T_0502a_row161_col5\" class=\"data row161 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row161_col6\" class=\"data row161 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_0502a_row161_col7\" class=\"data row161 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row161_col8\" class=\"data row161 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row162_col0\" class=\"data row162 col0\" >validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift</td>\n", + " <td id=\"T_0502a_row162_col1\" class=\"data row162 col1\" >Cumulative Prediction Probabilities Drift</td>\n", + " <td id=\"T_0502a_row162_col2\" class=\"data row162 col2\" >Compares cumulative prediction probability distributions between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row162_col3\" class=\"data row162 col3\" >True</td>\n", + " <td id=\"T_0502a_row162_col4\" class=\"data row162 col4\" >False</td>\n", + " <td id=\"T_0502a_row162_col5\" class=\"data row162 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row162_col6\" class=\"data row162 col6\" >{}</td>\n", + " <td id=\"T_0502a_row162_col7\" class=\"data row162 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_0502a_row162_col8\" class=\"data row162 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row163_col0\" class=\"data row163 col0\" >validmind.ongoing_monitoring.FeatureDrift</td>\n", + " <td id=\"T_0502a_row163_col1\" class=\"data row163 col1\" >Feature Drift</td>\n", + " <td id=\"T_0502a_row163_col2\" class=\"data row163 col2\" >Evaluates changes in feature distribution over time to identify potential model drift....</td>\n", + " <td id=\"T_0502a_row163_col3\" class=\"data row163 col3\" >True</td>\n", + " <td id=\"T_0502a_row163_col4\" 
class=\"data row163 col4\" >True</td>\n", + " <td id=\"T_0502a_row163_col5\" class=\"data row163 col5\" >['datasets']</td>\n", + " <td id=\"T_0502a_row163_col6\" class=\"data row163 col6\" >{'bins': {'type': '_empty', 'default': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]}, 'feature_columns': {'type': '_empty', 'default': None}, 'psi_threshold': {'type': '_empty', 'default': 0.2}}</td>\n", + " <td id=\"T_0502a_row163_col7\" class=\"data row163 col7\" >['visualization']</td>\n", + " <td id=\"T_0502a_row163_col8\" class=\"data row163 col8\" >['monitoring']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row164_col0\" class=\"data row164 col0\" >validmind.ongoing_monitoring.PredictionAcrossEachFeature</td>\n", + " <td id=\"T_0502a_row164_col1\" class=\"data row164 col1\" >Prediction Across Each Feature</td>\n", + " <td id=\"T_0502a_row164_col2\" class=\"data row164 col2\" >Assesses differences in model predictions across individual features between reference and monitoring datasets...</td>\n", + " <td id=\"T_0502a_row164_col3\" class=\"data row164 col3\" >True</td>\n", + " <td id=\"T_0502a_row164_col4\" class=\"data row164 col4\" >False</td>\n", + " <td id=\"T_0502a_row164_col5\" class=\"data row164 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row164_col6\" class=\"data row164 col6\" >{}</td>\n", + " <td id=\"T_0502a_row164_col7\" class=\"data row164 col7\" >['visualization']</td>\n", + " <td id=\"T_0502a_row164_col8\" class=\"data row164 col8\" >['monitoring']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row165_col0\" class=\"data row165 col0\" >validmind.ongoing_monitoring.PredictionCorrelation</td>\n", + " <td id=\"T_0502a_row165_col1\" class=\"data row165 col1\" >Prediction Correlation</td>\n", + " <td id=\"T_0502a_row165_col2\" class=\"data row165 col2\" >Assesses correlation changes between model predictions from reference and monitoring datasets to detect potential...</td>\n", + " <td id=\"T_0502a_row165_col3\" class=\"data 
row165 col3\" >True</td>\n", + " <td id=\"T_0502a_row165_col4\" class=\"data row165 col4\" >True</td>\n", + " <td id=\"T_0502a_row165_col5\" class=\"data row165 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row165_col6\" class=\"data row165 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_0502a_row165_col7\" class=\"data row165 col7\" >['visualization']</td>\n", + " <td id=\"T_0502a_row165_col8\" class=\"data row165 col8\" >['monitoring']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row166_col0\" class=\"data row166 col0\" >validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift</td>\n", + " <td id=\"T_0502a_row166_col1\" class=\"data row166 col1\" >Prediction Probabilities Histogram Drift</td>\n", + " <td id=\"T_0502a_row166_col2\" class=\"data row166 col2\" >Compares prediction probability distributions between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row166_col3\" class=\"data row166 col3\" >True</td>\n", + " <td id=\"T_0502a_row166_col4\" class=\"data row166 col4\" >True</td>\n", + " <td id=\"T_0502a_row166_col5\" class=\"data row166 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row166_col6\" class=\"data row166 col6\" >{'title': {'type': '_empty', 'default': 'Prediction Probabilities Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_0502a_row166_col7\" class=\"data row166 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_0502a_row166_col8\" class=\"data row166 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row167_col0\" class=\"data row167 col0\" >validmind.ongoing_monitoring.PredictionQuantilesAcrossFeatures</td>\n", + " <td id=\"T_0502a_row167_col1\" class=\"data row167 col1\" >Prediction Quantiles Across Features</td>\n", + " <td id=\"T_0502a_row167_col2\" class=\"data row167 col2\" >Assesses differences in model prediction distributions 
across individual features between reference...</td>\n", + " <td id=\"T_0502a_row167_col3\" class=\"data row167 col3\" >True</td>\n", + " <td id=\"T_0502a_row167_col4\" class=\"data row167 col4\" >False</td>\n", + " <td id=\"T_0502a_row167_col5\" class=\"data row167 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row167_col6\" class=\"data row167 col6\" >{}</td>\n", + " <td id=\"T_0502a_row167_col7\" class=\"data row167 col7\" >['visualization']</td>\n", + " <td id=\"T_0502a_row167_col8\" class=\"data row167 col8\" >['monitoring']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row168_col0\" class=\"data row168 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", + " <td id=\"T_0502a_row168_col1\" class=\"data row168 col1\" >ROC Curve Drift</td>\n", + " <td id=\"T_0502a_row168_col2\" class=\"data row168 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row168_col3\" class=\"data row168 col3\" >True</td>\n", + " <td id=\"T_0502a_row168_col4\" class=\"data row168 col4\" >False</td>\n", + " <td id=\"T_0502a_row168_col5\" class=\"data row168 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row168_col6\" class=\"data row168 col6\" >{}</td>\n", + " <td id=\"T_0502a_row168_col7\" class=\"data row168 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row168_col8\" class=\"data row168 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row169_col0\" class=\"data row169 col0\" >validmind.ongoing_monitoring.ScoreBandsDrift</td>\n", + " <td id=\"T_0502a_row169_col1\" class=\"data row169 col1\" >Score Bands Drift</td>\n", + " <td id=\"T_0502a_row169_col2\" class=\"data row169 col2\" >Analyzes drift in population distribution and default rates across score bands....</td>\n", + " <td id=\"T_0502a_row169_col3\" class=\"data row169 col3\" >False</td>\n", + " <td 
id=\"T_0502a_row169_col4\" class=\"data row169 col4\" >True</td>\n", + " <td id=\"T_0502a_row169_col5\" class=\"data row169 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row169_col6\" class=\"data row169 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}, 'drift_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_0502a_row169_col7\" class=\"data row169 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", + " <td id=\"T_0502a_row169_col8\" class=\"data row169 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row170_col0\" class=\"data row170 col0\" >validmind.ongoing_monitoring.ScorecardHistogramDrift</td>\n", + " <td id=\"T_0502a_row170_col1\" class=\"data row170 col1\" >Scorecard Histogram Drift</td>\n", + " <td id=\"T_0502a_row170_col2\" class=\"data row170 col2\" >Compares score distributions between reference and monitoring datasets for each class....</td>\n", + " <td id=\"T_0502a_row170_col3\" class=\"data row170 col3\" >True</td>\n", + " <td id=\"T_0502a_row170_col4\" class=\"data row170 col4\" >True</td>\n", + " <td id=\"T_0502a_row170_col5\" class=\"data row170 col5\" >['datasets']</td>\n", + " <td id=\"T_0502a_row170_col6\" class=\"data row170 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'title': {'type': 'str', 'default': 'Scorecard Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_0502a_row170_col7\" class=\"data row170 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", + " <td id=\"T_0502a_row170_col8\" class=\"data row170 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row171_col0\" class=\"data row171 col0\" >validmind.ongoing_monitoring.TargetPredictionDistributionPlot</td>\n", + " <td id=\"T_0502a_row171_col1\" class=\"data row171 col1\" >Target Prediction Distribution Plot</td>\n", + " 
<td id=\"T_0502a_row171_col2\" class=\"data row171 col2\" >Assesses differences in prediction distributions between a reference dataset and a monitoring dataset to identify...</td>\n", + " <td id=\"T_0502a_row171_col3\" class=\"data row171 col3\" >True</td>\n", + " <td id=\"T_0502a_row171_col4\" class=\"data row171 col4\" >True</td>\n", + " <td id=\"T_0502a_row171_col5\" class=\"data row171 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row171_col6\" class=\"data row171 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_0502a_row171_col7\" class=\"data row171 col7\" >['visualization']</td>\n", + " <td id=\"T_0502a_row171_col8\" class=\"data row171 col8\" >['monitoring']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row172_col0\" class=\"data row172 col0\" >validmind.prompt_validation.Bias</td>\n", + " <td id=\"T_0502a_row172_col1\" class=\"data row172 col1\" >Bias</td>\n", + " <td id=\"T_0502a_row172_col2\" class=\"data row172 col2\" >Assesses potential bias in a Large Language Model by analyzing the distribution and order of exemplars in the...</td>\n", + " <td id=\"T_0502a_row172_col3\" class=\"data row172 col3\" >False</td>\n", + " <td id=\"T_0502a_row172_col4\" class=\"data row172 col4\" >True</td>\n", + " <td id=\"T_0502a_row172_col5\" class=\"data row172 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row172_col6\" class=\"data row172 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row172_col7\" class=\"data row172 col7\" >['llm', 'few_shot']</td>\n", + " <td id=\"T_0502a_row172_col8\" class=\"data row172 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row173_col0\" class=\"data row173 col0\" >validmind.prompt_validation.Clarity</td>\n", + " <td id=\"T_0502a_row173_col1\" class=\"data row173 col1\" >Clarity</td>\n", + " <td 
id=\"T_0502a_row173_col2\" class=\"data row173 col2\" >Evaluates and scores the clarity of prompts in a Large Language Model based on specified guidelines....</td>\n", + " <td id=\"T_0502a_row173_col3\" class=\"data row173 col3\" >False</td>\n", + " <td id=\"T_0502a_row173_col4\" class=\"data row173 col4\" >True</td>\n", + " <td id=\"T_0502a_row173_col5\" class=\"data row173 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row173_col6\" class=\"data row173 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row173_col7\" class=\"data row173 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row173_col8\" class=\"data row173 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row174_col0\" class=\"data row174 col0\" >validmind.prompt_validation.Conciseness</td>\n", + " <td id=\"T_0502a_row174_col1\" class=\"data row174 col1\" >Conciseness</td>\n", + " <td id=\"T_0502a_row174_col2\" class=\"data row174 col2\" >Analyzes and grades the conciseness of prompts provided to a Large Language Model....</td>\n", + " <td id=\"T_0502a_row174_col3\" class=\"data row174 col3\" >False</td>\n", + " <td id=\"T_0502a_row174_col4\" class=\"data row174 col4\" >True</td>\n", + " <td id=\"T_0502a_row174_col5\" class=\"data row174 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row174_col6\" class=\"data row174 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row174_col7\" class=\"data row174 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row174_col8\" class=\"data row174 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row175_col0\" class=\"data row175 col0\" >validmind.prompt_validation.Delimitation</td>\n", + " <td id=\"T_0502a_row175_col1\" 
class=\"data row175 col1\" >Delimitation</td>\n", + " <td id=\"T_0502a_row175_col2\" class=\"data row175 col2\" >Evaluates the proper use of delimiters in prompts provided to Large Language Models....</td>\n", + " <td id=\"T_0502a_row175_col3\" class=\"data row175 col3\" >False</td>\n", + " <td id=\"T_0502a_row175_col4\" class=\"data row175 col4\" >True</td>\n", + " <td id=\"T_0502a_row175_col5\" class=\"data row175 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row175_col6\" class=\"data row175 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row175_col7\" class=\"data row175 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row175_col8\" class=\"data row175 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row176_col0\" class=\"data row176 col0\" >validmind.prompt_validation.NegativeInstruction</td>\n", + " <td id=\"T_0502a_row176_col1\" class=\"data row176 col1\" >Negative Instruction</td>\n", + " <td id=\"T_0502a_row176_col2\" class=\"data row176 col2\" >Evaluates and grades the use of affirmative, proactive language over negative instructions in LLM prompts....</td>\n", + " <td id=\"T_0502a_row176_col3\" class=\"data row176 col3\" >False</td>\n", + " <td id=\"T_0502a_row176_col4\" class=\"data row176 col4\" >True</td>\n", + " <td id=\"T_0502a_row176_col5\" class=\"data row176 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row176_col6\" class=\"data row176 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row176_col7\" class=\"data row176 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row176_col8\" class=\"data row176 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row177_col0\" class=\"data row177 col0\" 
>validmind.prompt_validation.Robustness</td>\n", + " <td id=\"T_0502a_row177_col1\" class=\"data row177 col1\" >Robustness</td>\n", + " <td id=\"T_0502a_row177_col2\" class=\"data row177 col2\" >Assesses the robustness of prompts provided to a Large Language Model under varying conditions and contexts. This test...</td>\n", + " <td id=\"T_0502a_row177_col3\" class=\"data row177 col3\" >False</td>\n", + " <td id=\"T_0502a_row177_col4\" class=\"data row177 col4\" >True</td>\n", + " <td id=\"T_0502a_row177_col5\" class=\"data row177 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row177_col6\" class=\"data row177 col6\" >{'num_tests': {'type': '_empty', 'default': 10}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row177_col7\" class=\"data row177 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row177_col8\" class=\"data row177 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row178_col0\" class=\"data row178 col0\" >validmind.prompt_validation.Specificity</td>\n", + " <td id=\"T_0502a_row178_col1\" class=\"data row178 col1\" >Specificity</td>\n", + " <td id=\"T_0502a_row178_col2\" class=\"data row178 col2\" >Evaluates and scores the specificity of prompts provided to a Large Language Model (LLM), based on clarity, detail,...</td>\n", + " <td id=\"T_0502a_row178_col3\" class=\"data row178 col3\" >False</td>\n", + " <td id=\"T_0502a_row178_col4\" class=\"data row178 col4\" >True</td>\n", + " <td id=\"T_0502a_row178_col5\" class=\"data row178 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row178_col6\" class=\"data row178 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row178_col7\" class=\"data row178 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row178_col8\" class=\"data row178 col8\" >['text_classification', 
'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row179_col0\" class=\"data row179 col0\" >validmind.unit_metrics.classification.Accuracy</td>\n", + " <td id=\"T_0502a_row179_col1\" class=\"data row179 col1\" >Accuracy</td>\n", + " <td id=\"T_0502a_row179_col2\" class=\"data row179 col2\" >Calculates the accuracy of a model</td>\n", + " <td id=\"T_0502a_row179_col3\" class=\"data row179 col3\" >False</td>\n", + " <td id=\"T_0502a_row179_col4\" class=\"data row179 col4\" >False</td>\n", + " <td id=\"T_0502a_row179_col5\" class=\"data row179 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row179_col6\" class=\"data row179 col6\" >{}</td>\n", + " <td id=\"T_0502a_row179_col7\" class=\"data row179 col7\" >['classification']</td>\n", + " <td id=\"T_0502a_row179_col8\" class=\"data row179 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row180_col0\" class=\"data row180 col0\" >validmind.unit_metrics.classification.F1</td>\n", + " <td id=\"T_0502a_row180_col1\" class=\"data row180 col1\" >F1</td>\n", + " <td id=\"T_0502a_row180_col2\" class=\"data row180 col2\" >Calculates the F1 score for a classification model.</td>\n", + " <td id=\"T_0502a_row180_col3\" class=\"data row180 col3\" >False</td>\n", + " <td id=\"T_0502a_row180_col4\" class=\"data row180 col4\" >False</td>\n", + " <td id=\"T_0502a_row180_col5\" class=\"data row180 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row180_col6\" class=\"data row180 col6\" >{}</td>\n", + " <td id=\"T_0502a_row180_col7\" class=\"data row180 col7\" >['classification']</td>\n", + " <td id=\"T_0502a_row180_col8\" class=\"data row180 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row181_col0\" class=\"data row181 col0\" >validmind.unit_metrics.classification.Precision</td>\n", + " <td id=\"T_0502a_row181_col1\" class=\"data row181 col1\" >Precision</td>\n", + " <td id=\"T_0502a_row181_col2\" class=\"data row181 
col2\" >Calculates the precision for a classification model.</td>\n", + " <td id=\"T_0502a_row181_col3\" class=\"data row181 col3\" >False</td>\n", + " <td id=\"T_0502a_row181_col4\" class=\"data row181 col4\" >False</td>\n", + " <td id=\"T_0502a_row181_col5\" class=\"data row181 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row181_col6\" class=\"data row181 col6\" >{}</td>\n", + " <td id=\"T_0502a_row181_col7\" class=\"data row181 col7\" >['classification']</td>\n", + " <td id=\"T_0502a_row181_col8\" class=\"data row181 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row182_col0\" class=\"data row182 col0\" >validmind.unit_metrics.classification.ROC_AUC</td>\n", + " <td id=\"T_0502a_row182_col1\" class=\"data row182 col1\" >ROC AUC</td>\n", + " <td id=\"T_0502a_row182_col2\" class=\"data row182 col2\" >Calculates the ROC AUC for a classification model.</td>\n", + " <td id=\"T_0502a_row182_col3\" class=\"data row182 col3\" >False</td>\n", + " <td id=\"T_0502a_row182_col4\" class=\"data row182 col4\" >False</td>\n", + " <td id=\"T_0502a_row182_col5\" class=\"data row182 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row182_col6\" class=\"data row182 col6\" >{}</td>\n", + " <td id=\"T_0502a_row182_col7\" class=\"data row182 col7\" >['classification']</td>\n", + " <td id=\"T_0502a_row182_col8\" class=\"data row182 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row183_col0\" class=\"data row183 col0\" >validmind.unit_metrics.classification.Recall</td>\n", + " <td id=\"T_0502a_row183_col1\" class=\"data row183 col1\" >Recall</td>\n", + " <td id=\"T_0502a_row183_col2\" class=\"data row183 col2\" >Calculates the recall for a classification model.</td>\n", + " <td id=\"T_0502a_row183_col3\" class=\"data row183 col3\" >False</td>\n", + " <td id=\"T_0502a_row183_col4\" class=\"data row183 col4\" >False</td>\n", + " <td id=\"T_0502a_row183_col5\" class=\"data row183 col5\" 
>['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row183_col6\" class=\"data row183 col6\" >{}</td>\n", + " <td id=\"T_0502a_row183_col7\" class=\"data row183 col7\" >['classification']</td>\n", + " <td id=\"T_0502a_row183_col8\" class=\"data row183 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row184_col0\" class=\"data row184 col0\" >validmind.unit_metrics.regression.AdjustedRSquaredScore</td>\n", + " <td id=\"T_0502a_row184_col1\" class=\"data row184 col1\" >Adjusted R Squared Score</td>\n", + " <td id=\"T_0502a_row184_col2\" class=\"data row184 col2\" >Calculates the adjusted R-squared score for a regression model.</td>\n", + " <td id=\"T_0502a_row184_col3\" class=\"data row184 col3\" >False</td>\n", + " <td id=\"T_0502a_row184_col4\" class=\"data row184 col4\" >False</td>\n", + " <td id=\"T_0502a_row184_col5\" class=\"data row184 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row184_col6\" class=\"data row184 col6\" >{}</td>\n", + " <td id=\"T_0502a_row184_col7\" class=\"data row184 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row184_col8\" class=\"data row184 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row185_col0\" class=\"data row185 col0\" >validmind.unit_metrics.regression.GiniCoefficient</td>\n", + " <td id=\"T_0502a_row185_col1\" class=\"data row185 col1\" >Gini Coefficient</td>\n", + " <td id=\"T_0502a_row185_col2\" class=\"data row185 col2\" >Calculates the Gini coefficient for a regression model.</td>\n", + " <td id=\"T_0502a_row185_col3\" class=\"data row185 col3\" >False</td>\n", + " <td id=\"T_0502a_row185_col4\" class=\"data row185 col4\" >False</td>\n", + " <td id=\"T_0502a_row185_col5\" class=\"data row185 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row185_col6\" class=\"data row185 col6\" >{}</td>\n", + " <td id=\"T_0502a_row185_col7\" class=\"data row185 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row185_col8\" class=\"data 
row185 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row186_col0\" class=\"data row186 col0\" >validmind.unit_metrics.regression.HuberLoss</td>\n", + " <td id=\"T_0502a_row186_col1\" class=\"data row186 col1\" >Huber Loss</td>\n", + " <td id=\"T_0502a_row186_col2\" class=\"data row186 col2\" >Calculates the Huber loss for a regression model.</td>\n", + " <td id=\"T_0502a_row186_col3\" class=\"data row186 col3\" >False</td>\n", + " <td id=\"T_0502a_row186_col4\" class=\"data row186 col4\" >False</td>\n", + " <td id=\"T_0502a_row186_col5\" class=\"data row186 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row186_col6\" class=\"data row186 col6\" >{}</td>\n", + " <td id=\"T_0502a_row186_col7\" class=\"data row186 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row186_col8\" class=\"data row186 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row187_col0\" class=\"data row187 col0\" >validmind.unit_metrics.regression.KolmogorovSmirnovStatistic</td>\n", + " <td id=\"T_0502a_row187_col1\" class=\"data row187 col1\" >Kolmogorov Smirnov Statistic</td>\n", + " <td id=\"T_0502a_row187_col2\" class=\"data row187 col2\" >Calculates the Kolmogorov-Smirnov statistic for a regression model.</td>\n", + " <td id=\"T_0502a_row187_col3\" class=\"data row187 col3\" >False</td>\n", + " <td id=\"T_0502a_row187_col4\" class=\"data row187 col4\" >False</td>\n", + " <td id=\"T_0502a_row187_col5\" class=\"data row187 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row187_col6\" class=\"data row187 col6\" >{}</td>\n", + " <td id=\"T_0502a_row187_col7\" class=\"data row187 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row187_col8\" class=\"data row187 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row188_col0\" class=\"data row188 col0\" >validmind.unit_metrics.regression.MeanAbsoluteError</td>\n", + " <td id=\"T_0502a_row188_col1\" class=\"data row188 col1\" 
>Mean Absolute Error</td>\n", + " <td id=\"T_0502a_row188_col2\" class=\"data row188 col2\" >Calculates the mean absolute error for a regression model.</td>\n", + " <td id=\"T_0502a_row188_col3\" class=\"data row188 col3\" >False</td>\n", + " <td id=\"T_0502a_row188_col4\" class=\"data row188 col4\" >False</td>\n", + " <td id=\"T_0502a_row188_col5\" class=\"data row188 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row188_col6\" class=\"data row188 col6\" >{}</td>\n", + " <td id=\"T_0502a_row188_col7\" class=\"data row188 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row188_col8\" class=\"data row188 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row189_col0\" class=\"data row189 col0\" >validmind.unit_metrics.regression.MeanAbsolutePercentageError</td>\n", + " <td id=\"T_0502a_row189_col1\" class=\"data row189 col1\" >Mean Absolute Percentage Error</td>\n", + " <td id=\"T_0502a_row189_col2\" class=\"data row189 col2\" >Calculates the mean absolute percentage error for a regression model.</td>\n", + " <td id=\"T_0502a_row189_col3\" class=\"data row189 col3\" >False</td>\n", + " <td id=\"T_0502a_row189_col4\" class=\"data row189 col4\" >False</td>\n", + " <td id=\"T_0502a_row189_col5\" class=\"data row189 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row189_col6\" class=\"data row189 col6\" >{}</td>\n", + " <td id=\"T_0502a_row189_col7\" class=\"data row189 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row189_col8\" class=\"data row189 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row190_col0\" class=\"data row190 col0\" >validmind.unit_metrics.regression.MeanBiasDeviation</td>\n", + " <td id=\"T_0502a_row190_col1\" class=\"data row190 col1\" >Mean Bias Deviation</td>\n", + " <td id=\"T_0502a_row190_col2\" class=\"data row190 col2\" >Calculates the mean bias deviation for a regression model.</td>\n", + " <td id=\"T_0502a_row190_col3\" class=\"data row190 col3\" 
>False</td>\n", + " <td id=\"T_0502a_row190_col4\" class=\"data row190 col4\" >False</td>\n", + " <td id=\"T_0502a_row190_col5\" class=\"data row190 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row190_col6\" class=\"data row190 col6\" >{}</td>\n", + " <td id=\"T_0502a_row190_col7\" class=\"data row190 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row190_col8\" class=\"data row190 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row191_col0\" class=\"data row191 col0\" >validmind.unit_metrics.regression.MeanSquaredError</td>\n", + " <td id=\"T_0502a_row191_col1\" class=\"data row191 col1\" >Mean Squared Error</td>\n", + " <td id=\"T_0502a_row191_col2\" class=\"data row191 col2\" >Calculates the mean squared error for a regression model.</td>\n", + " <td id=\"T_0502a_row191_col3\" class=\"data row191 col3\" >False</td>\n", + " <td id=\"T_0502a_row191_col4\" class=\"data row191 col4\" >False</td>\n", + " <td id=\"T_0502a_row191_col5\" class=\"data row191 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row191_col6\" class=\"data row191 col6\" >{}</td>\n", + " <td id=\"T_0502a_row191_col7\" class=\"data row191 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row191_col8\" class=\"data row191 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row192_col0\" class=\"data row192 col0\" >validmind.unit_metrics.regression.QuantileLoss</td>\n", + " <td id=\"T_0502a_row192_col1\" class=\"data row192 col1\" >Quantile Loss</td>\n", + " <td id=\"T_0502a_row192_col2\" class=\"data row192 col2\" >Calculates the quantile loss for a regression model.</td>\n", + " <td id=\"T_0502a_row192_col3\" class=\"data row192 col3\" >False</td>\n", + " <td id=\"T_0502a_row192_col4\" class=\"data row192 col4\" >False</td>\n", + " <td id=\"T_0502a_row192_col5\" class=\"data row192 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row192_col6\" class=\"data row192 col6\" >{'quantile': {'type': 
'_empty', 'default': 0.5}}</td>\n", + " <td id=\"T_0502a_row192_col7\" class=\"data row192 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row192_col8\" class=\"data row192 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row193_col0\" class=\"data row193 col0\" >validmind.unit_metrics.regression.RSquaredScore</td>\n", + " <td id=\"T_0502a_row193_col1\" class=\"data row193 col1\" >R Squared Score</td>\n", + " <td id=\"T_0502a_row193_col2\" class=\"data row193 col2\" >Calculates the R-squared score for a regression model.</td>\n", + " <td id=\"T_0502a_row193_col3\" class=\"data row193 col3\" >False</td>\n", + " <td id=\"T_0502a_row193_col4\" class=\"data row193 col4\" >False</td>\n", + " <td id=\"T_0502a_row193_col5\" class=\"data row193 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row193_col6\" class=\"data row193 col6\" >{}</td>\n", + " <td id=\"T_0502a_row193_col7\" class=\"data row193 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row193_col8\" class=\"data row193 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row194_col0\" class=\"data row194 col0\" >validmind.unit_metrics.regression.RootMeanSquaredError</td>\n", + " <td id=\"T_0502a_row194_col1\" class=\"data row194 col1\" >Root Mean Squared Error</td>\n", + " <td id=\"T_0502a_row194_col2\" class=\"data row194 col2\" >Calculates the root mean squared error for a regression model.</td>\n", + " <td id=\"T_0502a_row194_col3\" class=\"data row194 col3\" >False</td>\n", + " <td id=\"T_0502a_row194_col4\" class=\"data row194 col4\" >False</td>\n", + " <td id=\"T_0502a_row194_col5\" class=\"data row194 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row194_col6\" class=\"data row194 col6\" >{}</td>\n", + " <td id=\"T_0502a_row194_col7\" class=\"data row194 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row194_col8\" class=\"data row194 col8\" >['regression']</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" 
+ ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x38000a670>" + ] + } + } ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tasks_and_tags()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Filter tests by tags and task types\n", - "\n", - "While listing all tests is useful, you’ll often want to narrow your search. The [list_tests()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) function supports `filter`, `task`, and `tags` parameters to assist in refining your results.\n", - "\n", - "Use the `filter` parameter to find tests that match a specific keyword, such as `sklearn`:" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Understand tags and task types\n", + "\n", + "Use [list_tasks()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks) to view all unique task types used to classify tests in the ValidMind Library.\n", + "\n", + "Understanding `task` types helps you filter tests that match your record's (such as a model) objective. For example:\n", + "\n", + "- **classification:** Works with Classification Models and Datasets.\n", + "- **regression:** Works with Regression Models and Datasets.\n", + "- **text classification:** Works with Text Classification Models and Datasets.\n", + "- **text summarization:** Works with Text Summarization Models and Datasets." 
+ ] + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_326c3 th {\n", - " text-align: left;\n", - "}\n", - "#T_326c3_row0_col0, #T_326c3_row0_col1, #T_326c3_row0_col2, #T_326c3_row0_col3, #T_326c3_row0_col4, #T_326c3_row0_col5, #T_326c3_row0_col6, #T_326c3_row0_col7, #T_326c3_row0_col8, #T_326c3_row1_col0, #T_326c3_row1_col1, #T_326c3_row1_col2, #T_326c3_row1_col3, #T_326c3_row1_col4, #T_326c3_row1_col5, #T_326c3_row1_col6, #T_326c3_row1_col7, #T_326c3_row1_col8, #T_326c3_row2_col0, #T_326c3_row2_col1, #T_326c3_row2_col2, #T_326c3_row2_col3, #T_326c3_row2_col4, #T_326c3_row2_col5, #T_326c3_row2_col6, #T_326c3_row2_col7, #T_326c3_row2_col8, #T_326c3_row3_col0, #T_326c3_row3_col1, #T_326c3_row3_col2, #T_326c3_row3_col3, #T_326c3_row3_col4, #T_326c3_row3_col5, #T_326c3_row3_col6, #T_326c3_row3_col7, #T_326c3_row3_col8, #T_326c3_row4_col0, #T_326c3_row4_col1, #T_326c3_row4_col2, #T_326c3_row4_col3, #T_326c3_row4_col4, #T_326c3_row4_col5, #T_326c3_row4_col6, #T_326c3_row4_col7, #T_326c3_row4_col8, #T_326c3_row5_col0, #T_326c3_row5_col1, #T_326c3_row5_col2, #T_326c3_row5_col3, #T_326c3_row5_col4, #T_326c3_row5_col5, #T_326c3_row5_col6, #T_326c3_row5_col7, #T_326c3_row5_col8, #T_326c3_row6_col0, #T_326c3_row6_col1, #T_326c3_row6_col2, #T_326c3_row6_col3, #T_326c3_row6_col4, #T_326c3_row6_col5, #T_326c3_row6_col6, #T_326c3_row6_col7, #T_326c3_row6_col8, #T_326c3_row7_col0, #T_326c3_row7_col1, #T_326c3_row7_col2, #T_326c3_row7_col3, #T_326c3_row7_col4, #T_326c3_row7_col5, #T_326c3_row7_col6, #T_326c3_row7_col7, #T_326c3_row7_col8, #T_326c3_row8_col0, #T_326c3_row8_col1, #T_326c3_row8_col2, #T_326c3_row8_col3, #T_326c3_row8_col4, #T_326c3_row8_col5, #T_326c3_row8_col6, #T_326c3_row8_col7, #T_326c3_row8_col8, #T_326c3_row9_col0, #T_326c3_row9_col1, #T_326c3_row9_col2, #T_326c3_row9_col3, #T_326c3_row9_col4, #T_326c3_row9_col5, #T_326c3_row9_col6, #T_326c3_row9_col7, #T_326c3_row9_col8, #T_326c3_row10_col0, #T_326c3_row10_col1, #T_326c3_row10_col2, 
#T_326c3_row10_col3, #T_326c3_row10_col4, #T_326c3_row10_col5, #T_326c3_row10_col6, #T_326c3_row10_col7, #T_326c3_row10_col8, #T_326c3_row11_col0, #T_326c3_row11_col1, #T_326c3_row11_col2, #T_326c3_row11_col3, #T_326c3_row11_col4, #T_326c3_row11_col5, #T_326c3_row11_col6, #T_326c3_row11_col7, #T_326c3_row11_col8, #T_326c3_row12_col0, #T_326c3_row12_col1, #T_326c3_row12_col2, #T_326c3_row12_col3, #T_326c3_row12_col4, #T_326c3_row12_col5, #T_326c3_row12_col6, #T_326c3_row12_col7, #T_326c3_row12_col8, #T_326c3_row13_col0, #T_326c3_row13_col1, #T_326c3_row13_col2, #T_326c3_row13_col3, #T_326c3_row13_col4, #T_326c3_row13_col5, #T_326c3_row13_col6, #T_326c3_row13_col7, #T_326c3_row13_col8, #T_326c3_row14_col0, #T_326c3_row14_col1, #T_326c3_row14_col2, #T_326c3_row14_col3, #T_326c3_row14_col4, #T_326c3_row14_col5, #T_326c3_row14_col6, #T_326c3_row14_col7, #T_326c3_row14_col8, #T_326c3_row15_col0, #T_326c3_row15_col1, #T_326c3_row15_col2, #T_326c3_row15_col3, #T_326c3_row15_col4, #T_326c3_row15_col5, #T_326c3_row15_col6, #T_326c3_row15_col7, #T_326c3_row15_col8, #T_326c3_row16_col0, #T_326c3_row16_col1, #T_326c3_row16_col2, #T_326c3_row16_col3, #T_326c3_row16_col4, #T_326c3_row16_col5, #T_326c3_row16_col6, #T_326c3_row16_col7, #T_326c3_row16_col8, #T_326c3_row17_col0, #T_326c3_row17_col1, #T_326c3_row17_col2, #T_326c3_row17_col3, #T_326c3_row17_col4, #T_326c3_row17_col5, #T_326c3_row17_col6, #T_326c3_row17_col7, #T_326c3_row17_col8, #T_326c3_row18_col0, #T_326c3_row18_col1, #T_326c3_row18_col2, #T_326c3_row18_col3, #T_326c3_row18_col4, #T_326c3_row18_col5, #T_326c3_row18_col6, #T_326c3_row18_col7, #T_326c3_row18_col8, #T_326c3_row19_col0, #T_326c3_row19_col1, #T_326c3_row19_col2, #T_326c3_row19_col3, #T_326c3_row19_col4, #T_326c3_row19_col5, #T_326c3_row19_col6, #T_326c3_row19_col7, #T_326c3_row19_col8, #T_326c3_row20_col0, #T_326c3_row20_col1, #T_326c3_row20_col2, #T_326c3_row20_col3, #T_326c3_row20_col4, #T_326c3_row20_col5, #T_326c3_row20_col6, #T_326c3_row20_col7, 
#T_326c3_row20_col8, #T_326c3_row21_col0, #T_326c3_row21_col1, #T_326c3_row21_col2, #T_326c3_row21_col3, #T_326c3_row21_col4, #T_326c3_row21_col5, #T_326c3_row21_col6, #T_326c3_row21_col7, #T_326c3_row21_col8, #T_326c3_row22_col0, #T_326c3_row22_col1, #T_326c3_row22_col2, #T_326c3_row22_col3, #T_326c3_row22_col4, #T_326c3_row22_col5, #T_326c3_row22_col6, #T_326c3_row22_col7, #T_326c3_row22_col8, #T_326c3_row23_col0, #T_326c3_row23_col1, #T_326c3_row23_col2, #T_326c3_row23_col3, #T_326c3_row23_col4, #T_326c3_row23_col5, #T_326c3_row23_col6, #T_326c3_row23_col7, #T_326c3_row23_col8, #T_326c3_row24_col0, #T_326c3_row24_col1, #T_326c3_row24_col2, #T_326c3_row24_col3, #T_326c3_row24_col4, #T_326c3_row24_col5, #T_326c3_row24_col6, #T_326c3_row24_col7, #T_326c3_row24_col8, #T_326c3_row25_col0, #T_326c3_row25_col1, #T_326c3_row25_col2, #T_326c3_row25_col3, #T_326c3_row25_col4, #T_326c3_row25_col5, #T_326c3_row25_col6, #T_326c3_row25_col7, #T_326c3_row25_col8, #T_326c3_row26_col0, #T_326c3_row26_col1, #T_326c3_row26_col2, #T_326c3_row26_col3, #T_326c3_row26_col4, #T_326c3_row26_col5, #T_326c3_row26_col6, #T_326c3_row26_col7, #T_326c3_row26_col8, #T_326c3_row27_col0, #T_326c3_row27_col1, #T_326c3_row27_col2, #T_326c3_row27_col3, #T_326c3_row27_col4, #T_326c3_row27_col5, #T_326c3_row27_col6, #T_326c3_row27_col7, #T_326c3_row27_col8, #T_326c3_row28_col0, #T_326c3_row28_col1, #T_326c3_row28_col2, #T_326c3_row28_col3, #T_326c3_row28_col4, #T_326c3_row28_col5, #T_326c3_row28_col6, #T_326c3_row28_col7, #T_326c3_row28_col8, #T_326c3_row29_col0, #T_326c3_row29_col1, #T_326c3_row29_col2, #T_326c3_row29_col3, #T_326c3_row29_col4, #T_326c3_row29_col5, #T_326c3_row29_col6, #T_326c3_row29_col7, #T_326c3_row29_col8, #T_326c3_row30_col0, #T_326c3_row30_col1, #T_326c3_row30_col2, #T_326c3_row30_col3, #T_326c3_row30_col4, #T_326c3_row30_col5, #T_326c3_row30_col6, #T_326c3_row30_col7, #T_326c3_row30_col8, #T_326c3_row31_col0, #T_326c3_row31_col1, #T_326c3_row31_col2, #T_326c3_row31_col3, 
#T_326c3_row31_col4, #T_326c3_row31_col5, #T_326c3_row31_col6, #T_326c3_row31_col7, #T_326c3_row31_col8, #T_326c3_row32_col0, #T_326c3_row32_col1, #T_326c3_row32_col2, #T_326c3_row32_col3, #T_326c3_row32_col4, #T_326c3_row32_col5, #T_326c3_row32_col6, #T_326c3_row32_col7, #T_326c3_row32_col8, #T_326c3_row33_col0, #T_326c3_row33_col1, #T_326c3_row33_col2, #T_326c3_row33_col3, #T_326c3_row33_col4, #T_326c3_row33_col5, #T_326c3_row33_col6, #T_326c3_row33_col7, #T_326c3_row33_col8, #T_326c3_row34_col0, #T_326c3_row34_col1, #T_326c3_row34_col2, #T_326c3_row34_col3, #T_326c3_row34_col4, #T_326c3_row34_col5, #T_326c3_row34_col6, #T_326c3_row34_col7, #T_326c3_row34_col8, #T_326c3_row35_col0, #T_326c3_row35_col1, #T_326c3_row35_col2, #T_326c3_row35_col3, #T_326c3_row35_col4, #T_326c3_row35_col5, #T_326c3_row35_col6, #T_326c3_row35_col7, #T_326c3_row35_col8, #T_326c3_row36_col0, #T_326c3_row36_col1, #T_326c3_row36_col2, #T_326c3_row36_col3, #T_326c3_row36_col4, #T_326c3_row36_col5, #T_326c3_row36_col6, #T_326c3_row36_col7, #T_326c3_row36_col8, #T_326c3_row37_col0, #T_326c3_row37_col1, #T_326c3_row37_col2, #T_326c3_row37_col3, #T_326c3_row37_col4, #T_326c3_row37_col5, #T_326c3_row37_col6, #T_326c3_row37_col7, #T_326c3_row37_col8, #T_326c3_row38_col0, #T_326c3_row38_col1, #T_326c3_row38_col2, #T_326c3_row38_col3, #T_326c3_row38_col4, #T_326c3_row38_col5, #T_326c3_row38_col6, #T_326c3_row38_col7, #T_326c3_row38_col8, #T_326c3_row39_col0, #T_326c3_row39_col1, #T_326c3_row39_col2, #T_326c3_row39_col3, #T_326c3_row39_col4, #T_326c3_row39_col5, #T_326c3_row39_col6, #T_326c3_row39_col7, #T_326c3_row39_col8, #T_326c3_row40_col0, #T_326c3_row40_col1, #T_326c3_row40_col2, #T_326c3_row40_col3, #T_326c3_row40_col4, #T_326c3_row40_col5, #T_326c3_row40_col6, #T_326c3_row40_col7, #T_326c3_row40_col8, #T_326c3_row41_col0, #T_326c3_row41_col1, #T_326c3_row41_col2, #T_326c3_row41_col3, #T_326c3_row41_col4, #T_326c3_row41_col5, #T_326c3_row41_col6, #T_326c3_row41_col7, #T_326c3_row41_col8, 
#T_326c3_row42_col0, #T_326c3_row42_col1, #T_326c3_row42_col2, #T_326c3_row42_col3, #T_326c3_row42_col4, #T_326c3_row42_col5, #T_326c3_row42_col6, #T_326c3_row42_col7, #T_326c3_row42_col8 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_326c3\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_326c3_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_326c3_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", - " <th id=\"T_326c3_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_326c3_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", - " <th id=\"T_326c3_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", - " <th id=\"T_326c3_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", - " <th id=\"T_326c3_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", - " <th id=\"T_326c3_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", - " <th id=\"T_326c3_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_326c3_row0_col0\" class=\"data row0 col0\" >validmind.model_validation.ClusterSizeDistribution</td>\n", - " <td id=\"T_326c3_row0_col1\" class=\"data row0 col1\" >Cluster Size Distribution</td>\n", - " <td id=\"T_326c3_row0_col2\" class=\"data row0 col2\" >Assesses the performance of clustering models by comparing the distribution of cluster sizes in model predictions...</td>\n", - " <td id=\"T_326c3_row0_col3\" class=\"data row0 col3\" >True</td>\n", - " <td id=\"T_326c3_row0_col4\" class=\"data row0 col4\" >False</td>\n", - " <td id=\"T_326c3_row0_col5\" class=\"data row0 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row0_col6\" class=\"data row0 col6\" >{}</td>\n", - " <td id=\"T_326c3_row0_col7\" class=\"data row0 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row0_col8\" 
class=\"data row0 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.TimeSeriesR2SquareBySegments</td>\n", - " <td id=\"T_326c3_row1_col1\" class=\"data row1 col1\" >Time Series R2 Square By Segments</td>\n", - " <td id=\"T_326c3_row1_col2\" class=\"data row1 col2\" >Evaluates the R-Squared values of regression models over specified time segments in time series data to assess...</td>\n", - " <td id=\"T_326c3_row1_col3\" class=\"data row1 col3\" >True</td>\n", - " <td id=\"T_326c3_row1_col4\" class=\"data row1 col4\" >True</td>\n", - " <td id=\"T_326c3_row1_col5\" class=\"data row1 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row1_col6\" class=\"data row1 col6\" >{'segments': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row1_col7\" class=\"data row1 col7\" >['model_performance', 'sklearn']</td>\n", - " <td id=\"T_326c3_row1_col8\" class=\"data row1 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row2_col0\" class=\"data row2 col0\" >validmind.model_validation.sklearn.AdjustedMutualInformation</td>\n", - " <td id=\"T_326c3_row2_col1\" class=\"data row2 col1\" >Adjusted Mutual Information</td>\n", - " <td id=\"T_326c3_row2_col2\" class=\"data row2 col2\" >Evaluates clustering model performance by measuring mutual information between true and predicted labels, adjusting...</td>\n", - " <td id=\"T_326c3_row2_col3\" class=\"data row2 col3\" >False</td>\n", - " <td id=\"T_326c3_row2_col4\" class=\"data row2 col4\" >True</td>\n", - " <td id=\"T_326c3_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row2_col6\" class=\"data row2 col6\" >{}</td>\n", - " <td id=\"T_326c3_row2_col7\" class=\"data row2 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_326c3_row2_col8\" class=\"data row2 col8\" >['clustering']</td>\n", - " 
</tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.AdjustedRandIndex</td>\n", - " <td id=\"T_326c3_row3_col1\" class=\"data row3 col1\" >Adjusted Rand Index</td>\n", - " <td id=\"T_326c3_row3_col2\" class=\"data row3 col2\" >Measures the similarity between two data clusters using the Adjusted Rand Index (ARI) metric in clustering machine...</td>\n", - " <td id=\"T_326c3_row3_col3\" class=\"data row3 col3\" >False</td>\n", - " <td id=\"T_326c3_row3_col4\" class=\"data row3 col4\" >True</td>\n", - " <td id=\"T_326c3_row3_col5\" class=\"data row3 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row3_col6\" class=\"data row3 col6\" >{}</td>\n", - " <td id=\"T_326c3_row3_col7\" class=\"data row3 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_326c3_row3_col8\" class=\"data row3 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row4_col0\" class=\"data row4 col0\" >validmind.model_validation.sklearn.CalibrationCurve</td>\n", - " <td id=\"T_326c3_row4_col1\" class=\"data row4 col1\" >Calibration Curve</td>\n", - " <td id=\"T_326c3_row4_col2\" class=\"data row4 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", - " <td id=\"T_326c3_row4_col3\" class=\"data row4 col3\" >True</td>\n", - " <td id=\"T_326c3_row4_col4\" class=\"data row4 col4\" >False</td>\n", - " <td id=\"T_326c3_row4_col5\" class=\"data row4 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row4_col6\" class=\"data row4 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_326c3_row4_col7\" class=\"data row4 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", - " <td id=\"T_326c3_row4_col8\" class=\"data row4 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row5_col0\" class=\"data row5 col0\" 
>validmind.model_validation.sklearn.ClassifierPerformance</td>\n", - " <td id=\"T_326c3_row5_col1\" class=\"data row5 col1\" >Classifier Performance</td>\n", - " <td id=\"T_326c3_row5_col2\" class=\"data row5 col2\" >Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", - " <td id=\"T_326c3_row5_col3\" class=\"data row5 col3\" >False</td>\n", - " <td id=\"T_326c3_row5_col4\" class=\"data row5 col4\" >True</td>\n", - " <td id=\"T_326c3_row5_col5\" class=\"data row5 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row5_col6\" class=\"data row5 col6\" >{'average': {'type': 'str', 'default': 'macro'}}</td>\n", - " <td id=\"T_326c3_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row5_col8\" class=\"data row5 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row6_col0\" class=\"data row6 col0\" >validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", - " <td id=\"T_326c3_row6_col1\" class=\"data row6 col1\" >Classifier Threshold Optimization</td>\n", - " <td id=\"T_326c3_row6_col2\" class=\"data row6 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification models....</td>\n", - " <td id=\"T_326c3_row6_col3\" class=\"data row6 col3\" >False</td>\n", - " <td id=\"T_326c3_row6_col4\" class=\"data row6 col4\" >True</td>\n", - " <td id=\"T_326c3_row6_col5\" class=\"data row6 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row6_col6\" class=\"data row6 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row6_col7\" class=\"data row6 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", - " <td id=\"T_326c3_row6_col8\" class=\"data row6 col8\" 
>['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row7_col0\" class=\"data row7 col0\" >validmind.model_validation.sklearn.ClusterCosineSimilarity</td>\n", - " <td id=\"T_326c3_row7_col1\" class=\"data row7 col1\" >Cluster Cosine Similarity</td>\n", - " <td id=\"T_326c3_row7_col2\" class=\"data row7 col2\" >Measures the intra-cluster similarity of a clustering model using cosine similarity....</td>\n", - " <td id=\"T_326c3_row7_col3\" class=\"data row7 col3\" >False</td>\n", - " <td id=\"T_326c3_row7_col4\" class=\"data row7 col4\" >True</td>\n", - " <td id=\"T_326c3_row7_col5\" class=\"data row7 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row7_col6\" class=\"data row7 col6\" >{}</td>\n", - " <td id=\"T_326c3_row7_col7\" class=\"data row7 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_326c3_row7_col8\" class=\"data row7 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row8_col0\" class=\"data row8 col0\" >validmind.model_validation.sklearn.ClusterPerformanceMetrics</td>\n", - " <td id=\"T_326c3_row8_col1\" class=\"data row8 col1\" >Cluster Performance Metrics</td>\n", - " <td id=\"T_326c3_row8_col2\" class=\"data row8 col2\" >Evaluates the performance of clustering machine learning models using multiple established metrics....</td>\n", - " <td id=\"T_326c3_row8_col3\" class=\"data row8 col3\" >False</td>\n", - " <td id=\"T_326c3_row8_col4\" class=\"data row8 col4\" >True</td>\n", - " <td id=\"T_326c3_row8_col5\" class=\"data row8 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row8_col6\" class=\"data row8 col6\" >{}</td>\n", - " <td id=\"T_326c3_row8_col7\" class=\"data row8 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_326c3_row8_col8\" class=\"data row8 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row9_col0\" class=\"data row9 col0\" 
>validmind.model_validation.sklearn.CompletenessScore</td>\n", - " <td id=\"T_326c3_row9_col1\" class=\"data row9 col1\" >Completeness Score</td>\n", - " <td id=\"T_326c3_row9_col2\" class=\"data row9 col2\" >Evaluates a clustering model's capacity to categorize instances from a single class into the same cluster....</td>\n", - " <td id=\"T_326c3_row9_col3\" class=\"data row9 col3\" >False</td>\n", - " <td id=\"T_326c3_row9_col4\" class=\"data row9 col4\" >True</td>\n", - " <td id=\"T_326c3_row9_col5\" class=\"data row9 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row9_col6\" class=\"data row9 col6\" >{}</td>\n", - " <td id=\"T_326c3_row9_col7\" class=\"data row9 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_326c3_row9_col8\" class=\"data row9 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row10_col0\" class=\"data row10 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td id=\"T_326c3_row10_col1\" class=\"data row10 col1\" >Confusion Matrix</td>\n", - " <td id=\"T_326c3_row10_col2\" class=\"data row10 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", - " <td id=\"T_326c3_row10_col3\" class=\"data row10 col3\" >True</td>\n", - " <td id=\"T_326c3_row10_col4\" class=\"data row10 col4\" >False</td>\n", - " <td id=\"T_326c3_row10_col5\" class=\"data row10 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row10_col6\" class=\"data row10 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_326c3_row10_col7\" class=\"data row10 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row10_col8\" class=\"data row10 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row11_col0\" class=\"data row11 col0\" 
>validmind.model_validation.sklearn.FeatureImportance</td>\n", - " <td id=\"T_326c3_row11_col1\" class=\"data row11 col1\" >Feature Importance</td>\n", - " <td id=\"T_326c3_row11_col2\" class=\"data row11 col2\" >Compute feature importance scores for a given model and generate a summary table...</td>\n", - " <td id=\"T_326c3_row11_col3\" class=\"data row11 col3\" >False</td>\n", - " <td id=\"T_326c3_row11_col4\" class=\"data row11 col4\" >True</td>\n", - " <td id=\"T_326c3_row11_col5\" class=\"data row11 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row11_col6\" class=\"data row11 col6\" >{'num_features': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_326c3_row11_col7\" class=\"data row11 col7\" >['model_explainability', 'sklearn']</td>\n", - " <td id=\"T_326c3_row11_col8\" class=\"data row11 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row12_col0\" class=\"data row12 col0\" >validmind.model_validation.sklearn.FowlkesMallowsScore</td>\n", - " <td id=\"T_326c3_row12_col1\" class=\"data row12 col1\" >Fowlkes Mallows Score</td>\n", - " <td id=\"T_326c3_row12_col2\" class=\"data row12 col2\" >Evaluates the similarity between predicted and actual cluster assignments in a model using the Fowlkes-Mallows...</td>\n", - " <td id=\"T_326c3_row12_col3\" class=\"data row12 col3\" >False</td>\n", - " <td id=\"T_326c3_row12_col4\" class=\"data row12 col4\" >True</td>\n", - " <td id=\"T_326c3_row12_col5\" class=\"data row12 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row12_col6\" class=\"data row12 col6\" >{}</td>\n", - " <td id=\"T_326c3_row12_col7\" class=\"data row12 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row12_col8\" class=\"data row12 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row13_col0\" class=\"data row13 col0\" >validmind.model_validation.sklearn.HomogeneityScore</td>\n", - " <td id=\"T_326c3_row13_col1\" 
class=\"data row13 col1\" >Homogeneity Score</td>\n", - " <td id=\"T_326c3_row13_col2\" class=\"data row13 col2\" >Assesses clustering homogeneity by comparing true and predicted labels, scoring from 0 (heterogeneous) to 1...</td>\n", - " <td id=\"T_326c3_row13_col3\" class=\"data row13 col3\" >False</td>\n", - " <td id=\"T_326c3_row13_col4\" class=\"data row13 col4\" >True</td>\n", - " <td id=\"T_326c3_row13_col5\" class=\"data row13 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row13_col6\" class=\"data row13 col6\" >{}</td>\n", - " <td id=\"T_326c3_row13_col7\" class=\"data row13 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row13_col8\" class=\"data row13 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row14_col0\" class=\"data row14 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", - " <td id=\"T_326c3_row14_col1\" class=\"data row14 col1\" >Hyper Parameters Tuning</td>\n", - " <td id=\"T_326c3_row14_col2\" class=\"data row14 col2\" >Performs exhaustive grid search over specified parameter ranges to find optimal model configurations...</td>\n", - " <td id=\"T_326c3_row14_col3\" class=\"data row14 col3\" >False</td>\n", - " <td id=\"T_326c3_row14_col4\" class=\"data row14 col4\" >True</td>\n", - " <td id=\"T_326c3_row14_col5\" class=\"data row14 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row14_col6\" class=\"data row14 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", - " <td id=\"T_326c3_row14_col7\" class=\"data row14 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row14_col8\" class=\"data row14 col8\" >['clustering', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row15_col0\" class=\"data row15 col0\" 
>validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", - " <td id=\"T_326c3_row15_col1\" class=\"data row15 col1\" >K Means Clusters Optimization</td>\n", - " <td id=\"T_326c3_row15_col2\" class=\"data row15 col2\" >Optimizes the number of clusters in K-means models using Elbow and Silhouette methods....</td>\n", - " <td id=\"T_326c3_row15_col3\" class=\"data row15 col3\" >True</td>\n", - " <td id=\"T_326c3_row15_col4\" class=\"data row15 col4\" >False</td>\n", - " <td id=\"T_326c3_row15_col5\" class=\"data row15 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row15_col6\" class=\"data row15 col6\" >{'n_clusters': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row15_col7\" class=\"data row15 col7\" >['sklearn', 'model_performance', 'kmeans']</td>\n", - " <td id=\"T_326c3_row15_col8\" class=\"data row15 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row16_col0\" class=\"data row16 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", - " <td id=\"T_326c3_row16_col1\" class=\"data row16 col1\" >Minimum Accuracy</td>\n", - " <td id=\"T_326c3_row16_col2\" class=\"data row16 col2\" >Checks if the model's prediction accuracy meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_326c3_row16_col3\" class=\"data row16 col3\" >False</td>\n", - " <td id=\"T_326c3_row16_col4\" class=\"data row16 col4\" >True</td>\n", - " <td id=\"T_326c3_row16_col5\" class=\"data row16 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row16_col6\" class=\"data row16 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_326c3_row16_col7\" class=\"data row16 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row16_col8\" class=\"data row16 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row17_col0\" class=\"data row17 
col0\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", - " <td id=\"T_326c3_row17_col1\" class=\"data row17 col1\" >Minimum F1 Score</td>\n", - " <td id=\"T_326c3_row17_col2\" class=\"data row17 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", - " <td id=\"T_326c3_row17_col3\" class=\"data row17 col3\" >False</td>\n", - " <td id=\"T_326c3_row17_col4\" class=\"data row17 col4\" >True</td>\n", - " <td id=\"T_326c3_row17_col5\" class=\"data row17 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row17_col6\" class=\"data row17 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_326c3_row17_col7\" class=\"data row17 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row17_col8\" class=\"data row17 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row18_col0\" class=\"data row18 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", - " <td id=\"T_326c3_row18_col1\" class=\"data row18 col1\" >Minimum ROCAUC Score</td>\n", - " <td id=\"T_326c3_row18_col2\" class=\"data row18 col2\" >Validates model by checking if the ROC AUC score meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_326c3_row18_col3\" class=\"data row18 col3\" >False</td>\n", - " <td id=\"T_326c3_row18_col4\" class=\"data row18 col4\" >True</td>\n", - " <td id=\"T_326c3_row18_col5\" class=\"data row18 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row18_col6\" class=\"data row18 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_326c3_row18_col7\" class=\"data row18 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row18_col8\" class=\"data row18 col8\" >['classification', 
'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row19_col0\" class=\"data row19 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", - " <td id=\"T_326c3_row19_col1\" class=\"data row19 col1\" >Model Parameters</td>\n", - " <td id=\"T_326c3_row19_col2\" class=\"data row19 col2\" >Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", - " <td id=\"T_326c3_row19_col3\" class=\"data row19 col3\" >False</td>\n", - " <td id=\"T_326c3_row19_col4\" class=\"data row19 col4\" >True</td>\n", - " <td id=\"T_326c3_row19_col5\" class=\"data row19 col5\" >['model']</td>\n", - " <td id=\"T_326c3_row19_col6\" class=\"data row19 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row19_col7\" class=\"data row19 col7\" >['model_training', 'metadata']</td>\n", - " <td id=\"T_326c3_row19_col8\" class=\"data row19 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row20_col0\" class=\"data row20 col0\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", - " <td id=\"T_326c3_row20_col1\" class=\"data row20 col1\" >Models Performance Comparison</td>\n", - " <td id=\"T_326c3_row20_col2\" class=\"data row20 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", - " <td id=\"T_326c3_row20_col3\" class=\"data row20 col3\" >False</td>\n", - " <td id=\"T_326c3_row20_col4\" class=\"data row20 col4\" >True</td>\n", - " <td id=\"T_326c3_row20_col5\" class=\"data row20 col5\" >['dataset', 'models']</td>\n", - " <td id=\"T_326c3_row20_col6\" class=\"data row20 col6\" >{}</td>\n", - " <td id=\"T_326c3_row20_col7\" class=\"data row20 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", - " <td id=\"T_326c3_row20_col8\" class=\"data row20 col8\" 
>['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row21_col0\" class=\"data row21 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", - " <td id=\"T_326c3_row21_col1\" class=\"data row21 col1\" >Overfit Diagnosis</td>\n", - " <td id=\"T_326c3_row21_col2\" class=\"data row21 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", - " <td id=\"T_326c3_row21_col3\" class=\"data row21 col3\" >True</td>\n", - " <td id=\"T_326c3_row21_col4\" class=\"data row21 col4\" >True</td>\n", - " <td id=\"T_326c3_row21_col5\" class=\"data row21 col5\" >['model', 'datasets']</td>\n", - " <td id=\"T_326c3_row21_col6\" class=\"data row21 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': {'type': 'float', 'default': 0.04}}</td>\n", - " <td id=\"T_326c3_row21_col7\" class=\"data row21 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", - " <td id=\"T_326c3_row21_col8\" class=\"data row21 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row22_col0\" class=\"data row22 col0\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", - " <td id=\"T_326c3_row22_col1\" class=\"data row22 col1\" >Permutation Feature Importance</td>\n", - " <td id=\"T_326c3_row22_col2\" class=\"data row22 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", - " <td id=\"T_326c3_row22_col3\" class=\"data row22 col3\" >True</td>\n", - " <td id=\"T_326c3_row22_col4\" class=\"data row22 col4\" >False</td>\n", - " <td id=\"T_326c3_row22_col5\" class=\"data row22 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row22_col6\" class=\"data row22 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 
'default': None}}</td>\n", - " <td id=\"T_326c3_row22_col7\" class=\"data row22 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_326c3_row22_col8\" class=\"data row22 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row23_col0\" class=\"data row23 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", - " <td id=\"T_326c3_row23_col1\" class=\"data row23 col1\" >Population Stability Index</td>\n", - " <td id=\"T_326c3_row23_col2\" class=\"data row23 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's predictions across...</td>\n", - " <td id=\"T_326c3_row23_col3\" class=\"data row23 col3\" >True</td>\n", - " <td id=\"T_326c3_row23_col4\" class=\"data row23 col4\" >True</td>\n", - " <td id=\"T_326c3_row23_col5\" class=\"data row23 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row23_col6\" class=\"data row23 col6\" >{'num_bins': {'type': 'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", - " <td id=\"T_326c3_row23_col7\" class=\"data row23 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row23_col8\" class=\"data row23 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row24_col0\" class=\"data row24 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_326c3_row24_col1\" class=\"data row24 col1\" >Precision Recall Curve</td>\n", - " <td id=\"T_326c3_row24_col2\" class=\"data row24 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", - " <td id=\"T_326c3_row24_col3\" class=\"data row24 col3\" >True</td>\n", - " <td id=\"T_326c3_row24_col4\" class=\"data row24 col4\" 
>False</td>\n", - " <td id=\"T_326c3_row24_col5\" class=\"data row24 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row24_col6\" class=\"data row24 col6\" >{}</td>\n", - " <td id=\"T_326c3_row24_col7\" class=\"data row24 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row24_col8\" class=\"data row24 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row25_col0\" class=\"data row25 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_326c3_row25_col1\" class=\"data row25 col1\" >ROC Curve</td>\n", - " <td id=\"T_326c3_row25_col2\" class=\"data row25 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", - " <td id=\"T_326c3_row25_col3\" class=\"data row25 col3\" >True</td>\n", - " <td id=\"T_326c3_row25_col4\" class=\"data row25 col4\" >False</td>\n", - " <td id=\"T_326c3_row25_col5\" class=\"data row25 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row25_col6\" class=\"data row25 col6\" >{}</td>\n", - " <td id=\"T_326c3_row25_col7\" class=\"data row25 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row25_col8\" class=\"data row25 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row26_col0\" class=\"data row26 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", - " <td id=\"T_326c3_row26_col1\" class=\"data row26 col1\" >Regression Errors</td>\n", - " <td id=\"T_326c3_row26_col2\" class=\"data row26 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", - " <td id=\"T_326c3_row26_col3\" class=\"data row26 col3\" >False</td>\n", - " <td id=\"T_326c3_row26_col4\" class=\"data row26 col4\" 
>True</td>\n", - " <td id=\"T_326c3_row26_col5\" class=\"data row26 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row26_col6\" class=\"data row26 col6\" >{}</td>\n", - " <td id=\"T_326c3_row26_col7\" class=\"data row26 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row26_col8\" class=\"data row26 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row27_col0\" class=\"data row27 col0\" >validmind.model_validation.sklearn.RegressionErrorsComparison</td>\n", - " <td id=\"T_326c3_row27_col1\" class=\"data row27 col1\" >Regression Errors Comparison</td>\n", - " <td id=\"T_326c3_row27_col2\" class=\"data row27 col2\" >Assesses multiple regression error metrics to compare model performance across different datasets, emphasizing...</td>\n", - " <td id=\"T_326c3_row27_col3\" class=\"data row27 col3\" >False</td>\n", - " <td id=\"T_326c3_row27_col4\" class=\"data row27 col4\" >True</td>\n", - " <td id=\"T_326c3_row27_col5\" class=\"data row27 col5\" >['datasets', 'models']</td>\n", - " <td id=\"T_326c3_row27_col6\" class=\"data row27 col6\" >{}</td>\n", - " <td id=\"T_326c3_row27_col7\" class=\"data row27 col7\" >['model_performance', 'sklearn']</td>\n", - " <td id=\"T_326c3_row27_col8\" class=\"data row27 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row28_col0\" class=\"data row28 col0\" >validmind.model_validation.sklearn.RegressionPerformance</td>\n", - " <td id=\"T_326c3_row28_col1\" class=\"data row28 col1\" >Regression Performance</td>\n", - " <td id=\"T_326c3_row28_col2\" class=\"data row28 col2\" >Evaluates the performance of a regression model using five different metrics: MAE, MSE, RMSE, MAPE, and MBD....</td>\n", - " <td id=\"T_326c3_row28_col3\" class=\"data row28 col3\" >False</td>\n", - " <td id=\"T_326c3_row28_col4\" class=\"data row28 col4\" >True</td>\n", - " <td id=\"T_326c3_row28_col5\" class=\"data row28 col5\" 
>['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row28_col6\" class=\"data row28 col6\" >{}</td>\n", - " <td id=\"T_326c3_row28_col7\" class=\"data row28 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row28_col8\" class=\"data row28 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row29_col0\" class=\"data row29 col0\" >validmind.model_validation.sklearn.RegressionR2Square</td>\n", - " <td id=\"T_326c3_row29_col1\" class=\"data row29 col1\" >Regression R2 Square</td>\n", - " <td id=\"T_326c3_row29_col2\" class=\"data row29 col2\" >Assesses the overall goodness-of-fit of a regression model by evaluating R-squared (R2) and Adjusted R-squared (Adj...</td>\n", - " <td id=\"T_326c3_row29_col3\" class=\"data row29 col3\" >False</td>\n", - " <td id=\"T_326c3_row29_col4\" class=\"data row29 col4\" >True</td>\n", - " <td id=\"T_326c3_row29_col5\" class=\"data row29 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row29_col6\" class=\"data row29 col6\" >{}</td>\n", - " <td id=\"T_326c3_row29_col7\" class=\"data row29 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row29_col8\" class=\"data row29 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row30_col0\" class=\"data row30 col0\" >validmind.model_validation.sklearn.RegressionR2SquareComparison</td>\n", - " <td id=\"T_326c3_row30_col1\" class=\"data row30 col1\" >Regression R2 Square Comparison</td>\n", - " <td id=\"T_326c3_row30_col2\" class=\"data row30 col2\" >Compares R-Squared and Adjusted R-Squared values for different regression models across multiple datasets to assess...</td>\n", - " <td id=\"T_326c3_row30_col3\" class=\"data row30 col3\" >False</td>\n", - " <td id=\"T_326c3_row30_col4\" class=\"data row30 col4\" >True</td>\n", - " <td id=\"T_326c3_row30_col5\" class=\"data row30 col5\" >['datasets', 'models']</td>\n", - " <td id=\"T_326c3_row30_col6\" class=\"data row30 col6\" >{}</td>\n", - " 
<td id=\"T_326c3_row30_col7\" class=\"data row30 col7\" >['model_performance', 'sklearn']</td>\n", - " <td id=\"T_326c3_row30_col8\" class=\"data row30 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row31_col0\" class=\"data row31 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " <td id=\"T_326c3_row31_col1\" class=\"data row31 col1\" >Robustness Diagnosis</td>\n", - " <td id=\"T_326c3_row31_col2\" class=\"data row31 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", - " <td id=\"T_326c3_row31_col3\" class=\"data row31 col3\" >True</td>\n", - " <td id=\"T_326c3_row31_col4\" class=\"data row31 col4\" >True</td>\n", - " <td id=\"T_326c3_row31_col5\" class=\"data row31 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row31_col6\" class=\"data row31 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_326c3_row31_col7\" class=\"data row31 col7\" >['sklearn', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_326c3_row31_col8\" class=\"data row31 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row32_col0\" class=\"data row32 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", - " <td id=\"T_326c3_row32_col1\" class=\"data row32 col1\" >SHAP Global Importance</td>\n", - " <td id=\"T_326c3_row32_col2\" class=\"data row32 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", - " <td id=\"T_326c3_row32_col3\" class=\"data row32 col3\" >False</td>\n", - " <td id=\"T_326c3_row32_col4\" class=\"data row32 col4\" >True</td>\n", - " <td id=\"T_326c3_row32_col5\" class=\"data 
row32 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row32_col6\" class=\"data row32 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row32_col7\" class=\"data row32 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_326c3_row32_col8\" class=\"data row32 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row33_col0\" class=\"data row33 col0\" >validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", - " <td id=\"T_326c3_row33_col1\" class=\"data row33 col1\" >Score Probability Alignment</td>\n", - " <td id=\"T_326c3_row33_col2\" class=\"data row33 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", - " <td id=\"T_326c3_row33_col3\" class=\"data row33 col3\" >True</td>\n", - " <td id=\"T_326c3_row33_col4\" class=\"data row33 col4\" >True</td>\n", - " <td id=\"T_326c3_row33_col5\" class=\"data row33 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row33_col6\" class=\"data row33 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_326c3_row33_col7\" class=\"data row33 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", - " <td id=\"T_326c3_row33_col8\" class=\"data row33 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row34_col0\" class=\"data row34 col0\" >validmind.model_validation.sklearn.SilhouettePlot</td>\n", - " <td id=\"T_326c3_row34_col1\" class=\"data row34 col1\" >Silhouette Plot</td>\n", - " <td id=\"T_326c3_row34_col2\" class=\"data row34 col2\" >Calculates and visualizes Silhouette Score, assessing the degree of data point suitability to its 
cluster in ML...</td>\n", - " <td id=\"T_326c3_row34_col3\" class=\"data row34 col3\" >True</td>\n", - " <td id=\"T_326c3_row34_col4\" class=\"data row34 col4\" >True</td>\n", - " <td id=\"T_326c3_row34_col5\" class=\"data row34 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row34_col6\" class=\"data row34 col6\" >{}</td>\n", - " <td id=\"T_326c3_row34_col7\" class=\"data row34 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row34_col8\" class=\"data row34 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row35_col0\" class=\"data row35 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_326c3_row35_col1\" class=\"data row35 col1\" >Training Test Degradation</td>\n", - " <td id=\"T_326c3_row35_col2\" class=\"data row35 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", - " <td id=\"T_326c3_row35_col3\" class=\"data row35 col3\" >False</td>\n", - " <td id=\"T_326c3_row35_col4\" class=\"data row35 col4\" >True</td>\n", - " <td id=\"T_326c3_row35_col5\" class=\"data row35 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row35_col6\" class=\"data row35 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_326c3_row35_col7\" class=\"data row35 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row35_col8\" class=\"data row35 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row36_col0\" class=\"data row36 col0\" >validmind.model_validation.sklearn.VMeasure</td>\n", - " <td id=\"T_326c3_row36_col1\" class=\"data row36 col1\" >V Measure</td>\n", - " <td id=\"T_326c3_row36_col2\" class=\"data row36 col2\" >Evaluates homogeneity and completeness of a clustering model using the V Measure Score....</td>\n", - " 
<td id=\"T_326c3_row36_col3\" class=\"data row36 col3\" >False</td>\n", - " <td id=\"T_326c3_row36_col4\" class=\"data row36 col4\" >True</td>\n", - " <td id=\"T_326c3_row36_col5\" class=\"data row36 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row36_col6\" class=\"data row36 col6\" >{}</td>\n", - " <td id=\"T_326c3_row36_col7\" class=\"data row36 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row36_col8\" class=\"data row36 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row37_col0\" class=\"data row37 col0\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", - " <td id=\"T_326c3_row37_col1\" class=\"data row37 col1\" >Weakspots Diagnosis</td>\n", - " <td id=\"T_326c3_row37_col2\" class=\"data row37 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", - " <td id=\"T_326c3_row37_col3\" class=\"data row37 col3\" >True</td>\n", - " <td id=\"T_326c3_row37_col4\" class=\"data row37 col4\" >True</td>\n", - " <td id=\"T_326c3_row37_col5\" class=\"data row37 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row37_col6\" class=\"data row37 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row37_col7\" class=\"data row37 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_326c3_row37_col8\" class=\"data row37 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row38_col0\" class=\"data row38 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", - " <td id=\"T_326c3_row38_col1\" class=\"data row38 col1\" >Calibration Curve Drift</td>\n", - " <td id=\"T_326c3_row38_col2\" class=\"data row38 col2\" >Evaluates changes in probability 
calibration between reference and monitoring datasets....</td>\n", - " <td id=\"T_326c3_row38_col3\" class=\"data row38 col3\" >True</td>\n", - " <td id=\"T_326c3_row38_col4\" class=\"data row38 col4\" >True</td>\n", - " <td id=\"T_326c3_row38_col5\" class=\"data row38 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row38_col6\" class=\"data row38 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_326c3_row38_col7\" class=\"data row38 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row38_col8\" class=\"data row38 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row39_col0\" class=\"data row39 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", - " <td id=\"T_326c3_row39_col1\" class=\"data row39 col1\" >Class Discrimination Drift</td>\n", - " <td id=\"T_326c3_row39_col2\" class=\"data row39 col2\" >Compares classification discrimination metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_326c3_row39_col3\" class=\"data row39 col3\" >False</td>\n", - " <td id=\"T_326c3_row39_col4\" class=\"data row39 col4\" >True</td>\n", - " <td id=\"T_326c3_row39_col5\" class=\"data row39 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row39_col6\" class=\"data row39 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_326c3_row39_col7\" class=\"data row39 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row39_col8\" class=\"data row39 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row40_col0\" class=\"data row40 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", - " <td id=\"T_326c3_row40_col1\" class=\"data row40 
col1\" >Classification Accuracy Drift</td>\n", - " <td id=\"T_326c3_row40_col2\" class=\"data row40 col2\" >Compares classification accuracy metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_326c3_row40_col3\" class=\"data row40 col3\" >False</td>\n", - " <td id=\"T_326c3_row40_col4\" class=\"data row40 col4\" >True</td>\n", - " <td id=\"T_326c3_row40_col5\" class=\"data row40 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row40_col6\" class=\"data row40 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_326c3_row40_col7\" class=\"data row40 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row40_col8\" class=\"data row40 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row41_col0\" class=\"data row41 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", - " <td id=\"T_326c3_row41_col1\" class=\"data row41 col1\" >Confusion Matrix Drift</td>\n", - " <td id=\"T_326c3_row41_col2\" class=\"data row41 col2\" >Compares confusion matrix metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_326c3_row41_col3\" class=\"data row41 col3\" >False</td>\n", - " <td id=\"T_326c3_row41_col4\" class=\"data row41 col4\" >True</td>\n", - " <td id=\"T_326c3_row41_col5\" class=\"data row41 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row41_col6\" class=\"data row41 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_326c3_row41_col7\" class=\"data row41 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row41_col8\" class=\"data row41 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row42_col0\" class=\"data row42 col0\" 
>validmind.ongoing_monitoring.ROCCurveDrift</td>\n", - " <td id=\"T_326c3_row42_col1\" class=\"data row42 col1\" >ROC Curve Drift</td>\n", - " <td id=\"T_326c3_row42_col2\" class=\"data row42 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", - " <td id=\"T_326c3_row42_col3\" class=\"data row42 col3\" >True</td>\n", - " <td id=\"T_326c3_row42_col4\" class=\"data row42 col4\" >False</td>\n", - " <td id=\"T_326c3_row42_col5\" class=\"data row42 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row42_col6\" class=\"data row42 col6\" >{}</td>\n", - " <td id=\"T_326c3_row42_col7\" class=\"data row42 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row42_col8\" class=\"data row42 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "code", + "metadata": {}, + "source": [ + "list_tasks()" ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x1052e6790>" + "execution_count": 3, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "['text_qa',\n", + " 'classification',\n", + " 'data_validation',\n", + " 'text_classification',\n", + " 'feature_extraction',\n", + " 'regression',\n", + " 'visualization',\n", + " 'clustering',\n", + " 'time_series_forecasting',\n", + " 'text_summarization',\n", + " 'nlp',\n", + " 'residual_analysis',\n", + " 'monitoring',\n", + " 'text_generation']" + ] + } + } ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tests(filter=\"sklearn\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Use the `task` parameter to find tests that match a specific task type, such as `classification`:" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - 
"#T_56dd5 th {\n", - " text-align: left;\n", - "}\n", - "#T_56dd5_row0_col0, #T_56dd5_row0_col1, #T_56dd5_row0_col2, #T_56dd5_row0_col3, #T_56dd5_row0_col4, #T_56dd5_row0_col5, #T_56dd5_row0_col6, #T_56dd5_row0_col7, #T_56dd5_row0_col8, #T_56dd5_row1_col0, #T_56dd5_row1_col1, #T_56dd5_row1_col2, #T_56dd5_row1_col3, #T_56dd5_row1_col4, #T_56dd5_row1_col5, #T_56dd5_row1_col6, #T_56dd5_row1_col7, #T_56dd5_row1_col8, #T_56dd5_row2_col0, #T_56dd5_row2_col1, #T_56dd5_row2_col2, #T_56dd5_row2_col3, #T_56dd5_row2_col4, #T_56dd5_row2_col5, #T_56dd5_row2_col6, #T_56dd5_row2_col7, #T_56dd5_row2_col8, #T_56dd5_row3_col0, #T_56dd5_row3_col1, #T_56dd5_row3_col2, #T_56dd5_row3_col3, #T_56dd5_row3_col4, #T_56dd5_row3_col5, #T_56dd5_row3_col6, #T_56dd5_row3_col7, #T_56dd5_row3_col8, #T_56dd5_row4_col0, #T_56dd5_row4_col1, #T_56dd5_row4_col2, #T_56dd5_row4_col3, #T_56dd5_row4_col4, #T_56dd5_row4_col5, #T_56dd5_row4_col6, #T_56dd5_row4_col7, #T_56dd5_row4_col8, #T_56dd5_row5_col0, #T_56dd5_row5_col1, #T_56dd5_row5_col2, #T_56dd5_row5_col3, #T_56dd5_row5_col4, #T_56dd5_row5_col5, #T_56dd5_row5_col6, #T_56dd5_row5_col7, #T_56dd5_row5_col8, #T_56dd5_row6_col0, #T_56dd5_row6_col1, #T_56dd5_row6_col2, #T_56dd5_row6_col3, #T_56dd5_row6_col4, #T_56dd5_row6_col5, #T_56dd5_row6_col6, #T_56dd5_row6_col7, #T_56dd5_row6_col8, #T_56dd5_row7_col0, #T_56dd5_row7_col1, #T_56dd5_row7_col2, #T_56dd5_row7_col3, #T_56dd5_row7_col4, #T_56dd5_row7_col5, #T_56dd5_row7_col6, #T_56dd5_row7_col7, #T_56dd5_row7_col8, #T_56dd5_row8_col0, #T_56dd5_row8_col1, #T_56dd5_row8_col2, #T_56dd5_row8_col3, #T_56dd5_row8_col4, #T_56dd5_row8_col5, #T_56dd5_row8_col6, #T_56dd5_row8_col7, #T_56dd5_row8_col8, #T_56dd5_row9_col0, #T_56dd5_row9_col1, #T_56dd5_row9_col2, #T_56dd5_row9_col3, #T_56dd5_row9_col4, #T_56dd5_row9_col5, #T_56dd5_row9_col6, #T_56dd5_row9_col7, #T_56dd5_row9_col8, #T_56dd5_row10_col0, #T_56dd5_row10_col1, #T_56dd5_row10_col2, #T_56dd5_row10_col3, #T_56dd5_row10_col4, #T_56dd5_row10_col5, 
#T_56dd5_row10_col6, #T_56dd5_row10_col7, #T_56dd5_row10_col8, #T_56dd5_row11_col0, #T_56dd5_row11_col1, #T_56dd5_row11_col2, #T_56dd5_row11_col3, #T_56dd5_row11_col4, #T_56dd5_row11_col5, #T_56dd5_row11_col6, #T_56dd5_row11_col7, #T_56dd5_row11_col8, #T_56dd5_row12_col0, #T_56dd5_row12_col1, #T_56dd5_row12_col2, #T_56dd5_row12_col3, #T_56dd5_row12_col4, #T_56dd5_row12_col5, #T_56dd5_row12_col6, #T_56dd5_row12_col7, #T_56dd5_row12_col8, #T_56dd5_row13_col0, #T_56dd5_row13_col1, #T_56dd5_row13_col2, #T_56dd5_row13_col3, #T_56dd5_row13_col4, #T_56dd5_row13_col5, #T_56dd5_row13_col6, #T_56dd5_row13_col7, #T_56dd5_row13_col8, #T_56dd5_row14_col0, #T_56dd5_row14_col1, #T_56dd5_row14_col2, #T_56dd5_row14_col3, #T_56dd5_row14_col4, #T_56dd5_row14_col5, #T_56dd5_row14_col6, #T_56dd5_row14_col7, #T_56dd5_row14_col8, #T_56dd5_row15_col0, #T_56dd5_row15_col1, #T_56dd5_row15_col2, #T_56dd5_row15_col3, #T_56dd5_row15_col4, #T_56dd5_row15_col5, #T_56dd5_row15_col6, #T_56dd5_row15_col7, #T_56dd5_row15_col8, #T_56dd5_row16_col0, #T_56dd5_row16_col1, #T_56dd5_row16_col2, #T_56dd5_row16_col3, #T_56dd5_row16_col4, #T_56dd5_row16_col5, #T_56dd5_row16_col6, #T_56dd5_row16_col7, #T_56dd5_row16_col8, #T_56dd5_row17_col0, #T_56dd5_row17_col1, #T_56dd5_row17_col2, #T_56dd5_row17_col3, #T_56dd5_row17_col4, #T_56dd5_row17_col5, #T_56dd5_row17_col6, #T_56dd5_row17_col7, #T_56dd5_row17_col8, #T_56dd5_row18_col0, #T_56dd5_row18_col1, #T_56dd5_row18_col2, #T_56dd5_row18_col3, #T_56dd5_row18_col4, #T_56dd5_row18_col5, #T_56dd5_row18_col6, #T_56dd5_row18_col7, #T_56dd5_row18_col8, #T_56dd5_row19_col0, #T_56dd5_row19_col1, #T_56dd5_row19_col2, #T_56dd5_row19_col3, #T_56dd5_row19_col4, #T_56dd5_row19_col5, #T_56dd5_row19_col6, #T_56dd5_row19_col7, #T_56dd5_row19_col8, #T_56dd5_row20_col0, #T_56dd5_row20_col1, #T_56dd5_row20_col2, #T_56dd5_row20_col3, #T_56dd5_row20_col4, #T_56dd5_row20_col5, #T_56dd5_row20_col6, #T_56dd5_row20_col7, #T_56dd5_row20_col8, #T_56dd5_row21_col0, #T_56dd5_row21_col1, 
#T_56dd5_row21_col2, #T_56dd5_row21_col3, #T_56dd5_row21_col4, #T_56dd5_row21_col5, #T_56dd5_row21_col6, #T_56dd5_row21_col7, #T_56dd5_row21_col8, #T_56dd5_row22_col0, #T_56dd5_row22_col1, #T_56dd5_row22_col2, #T_56dd5_row22_col3, #T_56dd5_row22_col4, #T_56dd5_row22_col5, #T_56dd5_row22_col6, #T_56dd5_row22_col7, #T_56dd5_row22_col8, #T_56dd5_row23_col0, #T_56dd5_row23_col1, #T_56dd5_row23_col2, #T_56dd5_row23_col3, #T_56dd5_row23_col4, #T_56dd5_row23_col5, #T_56dd5_row23_col6, #T_56dd5_row23_col7, #T_56dd5_row23_col8, #T_56dd5_row24_col0, #T_56dd5_row24_col1, #T_56dd5_row24_col2, #T_56dd5_row24_col3, #T_56dd5_row24_col4, #T_56dd5_row24_col5, #T_56dd5_row24_col6, #T_56dd5_row24_col7, #T_56dd5_row24_col8, #T_56dd5_row25_col0, #T_56dd5_row25_col1, #T_56dd5_row25_col2, #T_56dd5_row25_col3, #T_56dd5_row25_col4, #T_56dd5_row25_col5, #T_56dd5_row25_col6, #T_56dd5_row25_col7, #T_56dd5_row25_col8, #T_56dd5_row26_col0, #T_56dd5_row26_col1, #T_56dd5_row26_col2, #T_56dd5_row26_col3, #T_56dd5_row26_col4, #T_56dd5_row26_col5, #T_56dd5_row26_col6, #T_56dd5_row26_col7, #T_56dd5_row26_col8, #T_56dd5_row27_col0, #T_56dd5_row27_col1, #T_56dd5_row27_col2, #T_56dd5_row27_col3, #T_56dd5_row27_col4, #T_56dd5_row27_col5, #T_56dd5_row27_col6, #T_56dd5_row27_col7, #T_56dd5_row27_col8, #T_56dd5_row28_col0, #T_56dd5_row28_col1, #T_56dd5_row28_col2, #T_56dd5_row28_col3, #T_56dd5_row28_col4, #T_56dd5_row28_col5, #T_56dd5_row28_col6, #T_56dd5_row28_col7, #T_56dd5_row28_col8, #T_56dd5_row29_col0, #T_56dd5_row29_col1, #T_56dd5_row29_col2, #T_56dd5_row29_col3, #T_56dd5_row29_col4, #T_56dd5_row29_col5, #T_56dd5_row29_col6, #T_56dd5_row29_col7, #T_56dd5_row29_col8, #T_56dd5_row30_col0, #T_56dd5_row30_col1, #T_56dd5_row30_col2, #T_56dd5_row30_col3, #T_56dd5_row30_col4, #T_56dd5_row30_col5, #T_56dd5_row30_col6, #T_56dd5_row30_col7, #T_56dd5_row30_col8, #T_56dd5_row31_col0, #T_56dd5_row31_col1, #T_56dd5_row31_col2, #T_56dd5_row31_col3, #T_56dd5_row31_col4, #T_56dd5_row31_col5, #T_56dd5_row31_col6, 
#T_56dd5_row31_col7, #T_56dd5_row31_col8, #T_56dd5_row32_col0, #T_56dd5_row32_col1, #T_56dd5_row32_col2, #T_56dd5_row32_col3, #T_56dd5_row32_col4, #T_56dd5_row32_col5, #T_56dd5_row32_col6, #T_56dd5_row32_col7, #T_56dd5_row32_col8, #T_56dd5_row33_col0, #T_56dd5_row33_col1, #T_56dd5_row33_col2, #T_56dd5_row33_col3, #T_56dd5_row33_col4, #T_56dd5_row33_col5, #T_56dd5_row33_col6, #T_56dd5_row33_col7, #T_56dd5_row33_col8, #T_56dd5_row34_col0, #T_56dd5_row34_col1, #T_56dd5_row34_col2, #T_56dd5_row34_col3, #T_56dd5_row34_col4, #T_56dd5_row34_col5, #T_56dd5_row34_col6, #T_56dd5_row34_col7, #T_56dd5_row34_col8, #T_56dd5_row35_col0, #T_56dd5_row35_col1, #T_56dd5_row35_col2, #T_56dd5_row35_col3, #T_56dd5_row35_col4, #T_56dd5_row35_col5, #T_56dd5_row35_col6, #T_56dd5_row35_col7, #T_56dd5_row35_col8, #T_56dd5_row36_col0, #T_56dd5_row36_col1, #T_56dd5_row36_col2, #T_56dd5_row36_col3, #T_56dd5_row36_col4, #T_56dd5_row36_col5, #T_56dd5_row36_col6, #T_56dd5_row36_col7, #T_56dd5_row36_col8, #T_56dd5_row37_col0, #T_56dd5_row37_col1, #T_56dd5_row37_col2, #T_56dd5_row37_col3, #T_56dd5_row37_col4, #T_56dd5_row37_col5, #T_56dd5_row37_col6, #T_56dd5_row37_col7, #T_56dd5_row37_col8, #T_56dd5_row38_col0, #T_56dd5_row38_col1, #T_56dd5_row38_col2, #T_56dd5_row38_col3, #T_56dd5_row38_col4, #T_56dd5_row38_col5, #T_56dd5_row38_col6, #T_56dd5_row38_col7, #T_56dd5_row38_col8, #T_56dd5_row39_col0, #T_56dd5_row39_col1, #T_56dd5_row39_col2, #T_56dd5_row39_col3, #T_56dd5_row39_col4, #T_56dd5_row39_col5, #T_56dd5_row39_col6, #T_56dd5_row39_col7, #T_56dd5_row39_col8, #T_56dd5_row40_col0, #T_56dd5_row40_col1, #T_56dd5_row40_col2, #T_56dd5_row40_col3, #T_56dd5_row40_col4, #T_56dd5_row40_col5, #T_56dd5_row40_col6, #T_56dd5_row40_col7, #T_56dd5_row40_col8, #T_56dd5_row41_col0, #T_56dd5_row41_col1, #T_56dd5_row41_col2, #T_56dd5_row41_col3, #T_56dd5_row41_col4, #T_56dd5_row41_col5, #T_56dd5_row41_col6, #T_56dd5_row41_col7, #T_56dd5_row41_col8, #T_56dd5_row42_col0, #T_56dd5_row42_col1, #T_56dd5_row42_col2, 
#T_56dd5_row42_col3, #T_56dd5_row42_col4, #T_56dd5_row42_col5, #T_56dd5_row42_col6, #T_56dd5_row42_col7, #T_56dd5_row42_col8, #T_56dd5_row43_col0, #T_56dd5_row43_col1, #T_56dd5_row43_col2, #T_56dd5_row43_col3, #T_56dd5_row43_col4, #T_56dd5_row43_col5, #T_56dd5_row43_col6, #T_56dd5_row43_col7, #T_56dd5_row43_col8, #T_56dd5_row44_col0, #T_56dd5_row44_col1, #T_56dd5_row44_col2, #T_56dd5_row44_col3, #T_56dd5_row44_col4, #T_56dd5_row44_col5, #T_56dd5_row44_col6, #T_56dd5_row44_col7, #T_56dd5_row44_col8, #T_56dd5_row45_col0, #T_56dd5_row45_col1, #T_56dd5_row45_col2, #T_56dd5_row45_col3, #T_56dd5_row45_col4, #T_56dd5_row45_col5, #T_56dd5_row45_col6, #T_56dd5_row45_col7, #T_56dd5_row45_col8, #T_56dd5_row46_col0, #T_56dd5_row46_col1, #T_56dd5_row46_col2, #T_56dd5_row46_col3, #T_56dd5_row46_col4, #T_56dd5_row46_col5, #T_56dd5_row46_col6, #T_56dd5_row46_col7, #T_56dd5_row46_col8, #T_56dd5_row47_col0, #T_56dd5_row47_col1, #T_56dd5_row47_col2, #T_56dd5_row47_col3, #T_56dd5_row47_col4, #T_56dd5_row47_col5, #T_56dd5_row47_col6, #T_56dd5_row47_col7, #T_56dd5_row47_col8, #T_56dd5_row48_col0, #T_56dd5_row48_col1, #T_56dd5_row48_col2, #T_56dd5_row48_col3, #T_56dd5_row48_col4, #T_56dd5_row48_col5, #T_56dd5_row48_col6, #T_56dd5_row48_col7, #T_56dd5_row48_col8, #T_56dd5_row49_col0, #T_56dd5_row49_col1, #T_56dd5_row49_col2, #T_56dd5_row49_col3, #T_56dd5_row49_col4, #T_56dd5_row49_col5, #T_56dd5_row49_col6, #T_56dd5_row49_col7, #T_56dd5_row49_col8, #T_56dd5_row50_col0, #T_56dd5_row50_col1, #T_56dd5_row50_col2, #T_56dd5_row50_col3, #T_56dd5_row50_col4, #T_56dd5_row50_col5, #T_56dd5_row50_col6, #T_56dd5_row50_col7, #T_56dd5_row50_col8, #T_56dd5_row51_col0, #T_56dd5_row51_col1, #T_56dd5_row51_col2, #T_56dd5_row51_col3, #T_56dd5_row51_col4, #T_56dd5_row51_col5, #T_56dd5_row51_col6, #T_56dd5_row51_col7, #T_56dd5_row51_col8, #T_56dd5_row52_col0, #T_56dd5_row52_col1, #T_56dd5_row52_col2, #T_56dd5_row52_col3, #T_56dd5_row52_col4, #T_56dd5_row52_col5, #T_56dd5_row52_col6, #T_56dd5_row52_col7, 
#T_56dd5_row52_col8, #T_56dd5_row53_col0, #T_56dd5_row53_col1, #T_56dd5_row53_col2, #T_56dd5_row53_col3, #T_56dd5_row53_col4, #T_56dd5_row53_col5, #T_56dd5_row53_col6, #T_56dd5_row53_col7, #T_56dd5_row53_col8, #T_56dd5_row54_col0, #T_56dd5_row54_col1, #T_56dd5_row54_col2, #T_56dd5_row54_col3, #T_56dd5_row54_col4, #T_56dd5_row54_col5, #T_56dd5_row54_col6, #T_56dd5_row54_col7, #T_56dd5_row54_col8, #T_56dd5_row55_col0, #T_56dd5_row55_col1, #T_56dd5_row55_col2, #T_56dd5_row55_col3, #T_56dd5_row55_col4, #T_56dd5_row55_col5, #T_56dd5_row55_col6, #T_56dd5_row55_col7, #T_56dd5_row55_col8, #T_56dd5_row56_col0, #T_56dd5_row56_col1, #T_56dd5_row56_col2, #T_56dd5_row56_col3, #T_56dd5_row56_col4, #T_56dd5_row56_col5, #T_56dd5_row56_col6, #T_56dd5_row56_col7, #T_56dd5_row56_col8, #T_56dd5_row57_col0, #T_56dd5_row57_col1, #T_56dd5_row57_col2, #T_56dd5_row57_col3, #T_56dd5_row57_col4, #T_56dd5_row57_col5, #T_56dd5_row57_col6, #T_56dd5_row57_col7, #T_56dd5_row57_col8, #T_56dd5_row58_col0, #T_56dd5_row58_col1, #T_56dd5_row58_col2, #T_56dd5_row58_col3, #T_56dd5_row58_col4, #T_56dd5_row58_col5, #T_56dd5_row58_col6, #T_56dd5_row58_col7, #T_56dd5_row58_col8, #T_56dd5_row59_col0, #T_56dd5_row59_col1, #T_56dd5_row59_col2, #T_56dd5_row59_col3, #T_56dd5_row59_col4, #T_56dd5_row59_col5, #T_56dd5_row59_col6, #T_56dd5_row59_col7, #T_56dd5_row59_col8, #T_56dd5_row60_col0, #T_56dd5_row60_col1, #T_56dd5_row60_col2, #T_56dd5_row60_col3, #T_56dd5_row60_col4, #T_56dd5_row60_col5, #T_56dd5_row60_col6, #T_56dd5_row60_col7, #T_56dd5_row60_col8, #T_56dd5_row61_col0, #T_56dd5_row61_col1, #T_56dd5_row61_col2, #T_56dd5_row61_col3, #T_56dd5_row61_col4, #T_56dd5_row61_col5, #T_56dd5_row61_col6, #T_56dd5_row61_col7, #T_56dd5_row61_col8, #T_56dd5_row62_col0, #T_56dd5_row62_col1, #T_56dd5_row62_col2, #T_56dd5_row62_col3, #T_56dd5_row62_col4, #T_56dd5_row62_col5, #T_56dd5_row62_col6, #T_56dd5_row62_col7, #T_56dd5_row62_col8, #T_56dd5_row63_col0, #T_56dd5_row63_col1, #T_56dd5_row63_col2, #T_56dd5_row63_col3, 
#T_56dd5_row63_col4, #T_56dd5_row63_col5, #T_56dd5_row63_col6, #T_56dd5_row63_col7, #T_56dd5_row63_col8, #T_56dd5_row64_col0, #T_56dd5_row64_col1, #T_56dd5_row64_col2, #T_56dd5_row64_col3, #T_56dd5_row64_col4, #T_56dd5_row64_col5, #T_56dd5_row64_col6, #T_56dd5_row64_col7, #T_56dd5_row64_col8, #T_56dd5_row65_col0, #T_56dd5_row65_col1, #T_56dd5_row65_col2, #T_56dd5_row65_col3, #T_56dd5_row65_col4, #T_56dd5_row65_col5, #T_56dd5_row65_col6, #T_56dd5_row65_col7, #T_56dd5_row65_col8, #T_56dd5_row66_col0, #T_56dd5_row66_col1, #T_56dd5_row66_col2, #T_56dd5_row66_col3, #T_56dd5_row66_col4, #T_56dd5_row66_col5, #T_56dd5_row66_col6, #T_56dd5_row66_col7, #T_56dd5_row66_col8, #T_56dd5_row67_col0, #T_56dd5_row67_col1, #T_56dd5_row67_col2, #T_56dd5_row67_col3, #T_56dd5_row67_col4, #T_56dd5_row67_col5, #T_56dd5_row67_col6, #T_56dd5_row67_col7, #T_56dd5_row67_col8, #T_56dd5_row68_col0, #T_56dd5_row68_col1, #T_56dd5_row68_col2, #T_56dd5_row68_col3, #T_56dd5_row68_col4, #T_56dd5_row68_col5, #T_56dd5_row68_col6, #T_56dd5_row68_col7, #T_56dd5_row68_col8, #T_56dd5_row69_col0, #T_56dd5_row69_col1, #T_56dd5_row69_col2, #T_56dd5_row69_col3, #T_56dd5_row69_col4, #T_56dd5_row69_col5, #T_56dd5_row69_col6, #T_56dd5_row69_col7, #T_56dd5_row69_col8, #T_56dd5_row70_col0, #T_56dd5_row70_col1, #T_56dd5_row70_col2, #T_56dd5_row70_col3, #T_56dd5_row70_col4, #T_56dd5_row70_col5, #T_56dd5_row70_col6, #T_56dd5_row70_col7, #T_56dd5_row70_col8, #T_56dd5_row71_col0, #T_56dd5_row71_col1, #T_56dd5_row71_col2, #T_56dd5_row71_col3, #T_56dd5_row71_col4, #T_56dd5_row71_col5, #T_56dd5_row71_col6, #T_56dd5_row71_col7, #T_56dd5_row71_col8, #T_56dd5_row72_col0, #T_56dd5_row72_col1, #T_56dd5_row72_col2, #T_56dd5_row72_col3, #T_56dd5_row72_col4, #T_56dd5_row72_col5, #T_56dd5_row72_col6, #T_56dd5_row72_col7, #T_56dd5_row72_col8, #T_56dd5_row73_col0, #T_56dd5_row73_col1, #T_56dd5_row73_col2, #T_56dd5_row73_col3, #T_56dd5_row73_col4, #T_56dd5_row73_col5, #T_56dd5_row73_col6, #T_56dd5_row73_col7, #T_56dd5_row73_col8, 
#T_56dd5_row74_col0, #T_56dd5_row74_col1, #T_56dd5_row74_col2, #T_56dd5_row74_col3, #T_56dd5_row74_col4, #T_56dd5_row74_col5, #T_56dd5_row74_col6, #T_56dd5_row74_col7, #T_56dd5_row74_col8, #T_56dd5_row75_col0, #T_56dd5_row75_col1, #T_56dd5_row75_col2, #T_56dd5_row75_col3, #T_56dd5_row75_col4, #T_56dd5_row75_col5, #T_56dd5_row75_col6, #T_56dd5_row75_col7, #T_56dd5_row75_col8 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_56dd5\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_56dd5_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_56dd5_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", - " <th id=\"T_56dd5_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_56dd5_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", - " <th id=\"T_56dd5_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", - " <th id=\"T_56dd5_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", - " <th id=\"T_56dd5_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", - " <th id=\"T_56dd5_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", - " <th id=\"T_56dd5_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_56dd5_row0_col0\" class=\"data row0 col0\" >validmind.data_validation.BivariateScatterPlots</td>\n", - " <td id=\"T_56dd5_row0_col1\" class=\"data row0 col1\" >Bivariate Scatter Plots</td>\n", - " <td id=\"T_56dd5_row0_col2\" class=\"data row0 col2\" >Generates bivariate scatterplots to visually inspect relationships between pairs of numerical predictor variables...</td>\n", - " <td id=\"T_56dd5_row0_col3\" class=\"data row0 col3\" >True</td>\n", - " <td id=\"T_56dd5_row0_col4\" class=\"data row0 col4\" >False</td>\n", - " <td id=\"T_56dd5_row0_col5\" class=\"data row0 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row0_col6\" 
class=\"data row0 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row0_col7\" class=\"data row0 col7\" >['tabular_data', 'numerical_data', 'visualization']</td>\n", - " <td id=\"T_56dd5_row0_col8\" class=\"data row0 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row1_col0\" class=\"data row1 col0\" >validmind.data_validation.ChiSquaredFeaturesTable</td>\n", - " <td id=\"T_56dd5_row1_col1\" class=\"data row1 col1\" >Chi Squared Features Table</td>\n", - " <td id=\"T_56dd5_row1_col2\" class=\"data row1 col2\" >Assesses the statistical association between categorical features and a target variable using the Chi-Squared test....</td>\n", - " <td id=\"T_56dd5_row1_col3\" class=\"data row1 col3\" >False</td>\n", - " <td id=\"T_56dd5_row1_col4\" class=\"data row1 col4\" >True</td>\n", - " <td id=\"T_56dd5_row1_col5\" class=\"data row1 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row1_col6\" class=\"data row1 col6\" >{'p_threshold': {'type': '_empty', 'default': 0.05}}</td>\n", - " <td id=\"T_56dd5_row1_col7\" class=\"data row1 col7\" >['tabular_data', 'categorical_data', 'statistical_test']</td>\n", - " <td id=\"T_56dd5_row1_col8\" class=\"data row1 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row2_col0\" class=\"data row2 col0\" >validmind.data_validation.ClassImbalance</td>\n", - " <td id=\"T_56dd5_row2_col1\" class=\"data row2 col1\" >Class Imbalance</td>\n", - " <td id=\"T_56dd5_row2_col2\" class=\"data row2 col2\" >Evaluates and quantifies class distribution imbalance in a dataset used by a machine learning model....</td>\n", - " <td id=\"T_56dd5_row2_col3\" class=\"data row2 col3\" >True</td>\n", - " <td id=\"T_56dd5_row2_col4\" class=\"data row2 col4\" >True</td>\n", - " <td id=\"T_56dd5_row2_col5\" class=\"data row2 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row2_col6\" class=\"data row2 col6\" >{'min_percent_threshold': {'type': 'int', 'default': 10}}</td>\n", - " <td 
id=\"T_56dd5_row2_col7\" class=\"data row2 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification', 'data_quality']</td>\n", - " <td id=\"T_56dd5_row2_col8\" class=\"data row2 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row3_col0\" class=\"data row3 col0\" >validmind.data_validation.DatasetDescription</td>\n", - " <td id=\"T_56dd5_row3_col1\" class=\"data row3 col1\" >Dataset Description</td>\n", - " <td id=\"T_56dd5_row3_col2\" class=\"data row3 col2\" >Provides comprehensive analysis and statistical summaries of each column in a machine learning model's dataset....</td>\n", - " <td id=\"T_56dd5_row3_col3\" class=\"data row3 col3\" >False</td>\n", - " <td id=\"T_56dd5_row3_col4\" class=\"data row3 col4\" >True</td>\n", - " <td id=\"T_56dd5_row3_col5\" class=\"data row3 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row3_col6\" class=\"data row3 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row3_col7\" class=\"data row3 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", - " <td id=\"T_56dd5_row3_col8\" class=\"data row3 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row4_col0\" class=\"data row4 col0\" >validmind.data_validation.DatasetSplit</td>\n", - " <td id=\"T_56dd5_row4_col1\" class=\"data row4 col1\" >Dataset Split</td>\n", - " <td id=\"T_56dd5_row4_col2\" class=\"data row4 col2\" >Evaluates and visualizes the distribution proportions among training, testing, and validation datasets of an ML...</td>\n", - " <td id=\"T_56dd5_row4_col3\" class=\"data row4 col3\" >False</td>\n", - " <td id=\"T_56dd5_row4_col4\" class=\"data row4 col4\" >True</td>\n", - " <td id=\"T_56dd5_row4_col5\" class=\"data row4 col5\" >['datasets']</td>\n", - " <td id=\"T_56dd5_row4_col6\" class=\"data row4 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row4_col7\" class=\"data row4 col7\" >['tabular_data', 
'time_series_data', 'text_data']</td>\n", - " <td id=\"T_56dd5_row4_col8\" class=\"data row4 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row5_col0\" class=\"data row5 col0\" >validmind.data_validation.DescriptiveStatistics</td>\n", - " <td id=\"T_56dd5_row5_col1\" class=\"data row5 col1\" >Descriptive Statistics</td>\n", - " <td id=\"T_56dd5_row5_col2\" class=\"data row5 col2\" >Performs a detailed descriptive statistical analysis of both numerical and categorical data within a model's...</td>\n", - " <td id=\"T_56dd5_row5_col3\" class=\"data row5 col3\" >False</td>\n", - " <td id=\"T_56dd5_row5_col4\" class=\"data row5 col4\" >True</td>\n", - " <td id=\"T_56dd5_row5_col5\" class=\"data row5 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row5_col6\" class=\"data row5 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row5_col7\" class=\"data row5 col7\" >['tabular_data', 'time_series_data', 'data_quality']</td>\n", - " <td id=\"T_56dd5_row5_col8\" class=\"data row5 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row6_col0\" class=\"data row6 col0\" >validmind.data_validation.Duplicates</td>\n", - " <td id=\"T_56dd5_row6_col1\" class=\"data row6 col1\" >Duplicates</td>\n", - " <td id=\"T_56dd5_row6_col2\" class=\"data row6 col2\" >Tests dataset for duplicate entries, ensuring model reliability via data quality verification....</td>\n", - " <td id=\"T_56dd5_row6_col3\" class=\"data row6 col3\" >False</td>\n", - " <td id=\"T_56dd5_row6_col4\" class=\"data row6 col4\" >True</td>\n", - " <td id=\"T_56dd5_row6_col5\" class=\"data row6 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row6_col6\" class=\"data row6 col6\" >{'min_threshold': {'type': '_empty', 'default': 1}}</td>\n", - " <td id=\"T_56dd5_row6_col7\" class=\"data row6 col7\" >['tabular_data', 'data_quality', 'text_data']</td>\n", - " <td id=\"T_56dd5_row6_col8\" 
class=\"data row6 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row7_col0\" class=\"data row7 col0\" >validmind.data_validation.FeatureTargetCorrelationPlot</td>\n", - " <td id=\"T_56dd5_row7_col1\" class=\"data row7 col1\" >Feature Target Correlation Plot</td>\n", - " <td id=\"T_56dd5_row7_col2\" class=\"data row7 col2\" >Visualizes the correlation between input features and the model's target output in a color-coded horizontal bar...</td>\n", - " <td id=\"T_56dd5_row7_col3\" class=\"data row7 col3\" >True</td>\n", - " <td id=\"T_56dd5_row7_col4\" class=\"data row7 col4\" >False</td>\n", - " <td id=\"T_56dd5_row7_col5\" class=\"data row7 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row7_col6\" class=\"data row7 col6\" >{'fig_height': {'type': '_empty', 'default': 600}}</td>\n", - " <td id=\"T_56dd5_row7_col7\" class=\"data row7 col7\" >['tabular_data', 'visualization', 'correlation']</td>\n", - " <td id=\"T_56dd5_row7_col8\" class=\"data row7 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row8_col0\" class=\"data row8 col0\" >validmind.data_validation.HighCardinality</td>\n", - " <td id=\"T_56dd5_row8_col1\" class=\"data row8 col1\" >High Cardinality</td>\n", - " <td id=\"T_56dd5_row8_col2\" class=\"data row8 col2\" >Assesses the number of unique values in categorical columns to detect high cardinality and potential overfitting....</td>\n", - " <td id=\"T_56dd5_row8_col3\" class=\"data row8 col3\" >False</td>\n", - " <td id=\"T_56dd5_row8_col4\" class=\"data row8 col4\" >True</td>\n", - " <td id=\"T_56dd5_row8_col5\" class=\"data row8 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row8_col6\" class=\"data row8 col6\" >{'num_threshold': {'type': 'int', 'default': 100}, 'percent_threshold': {'type': 'float', 'default': 0.1}, 'threshold_type': {'type': 'str', 'default': 'percent'}}</td>\n", - " <td id=\"T_56dd5_row8_col7\" class=\"data row8 col7\" 
>['tabular_data', 'data_quality', 'categorical_data']</td>\n", - " <td id=\"T_56dd5_row8_col8\" class=\"data row8 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row9_col0\" class=\"data row9 col0\" >validmind.data_validation.HighPearsonCorrelation</td>\n", - " <td id=\"T_56dd5_row9_col1\" class=\"data row9 col1\" >High Pearson Correlation</td>\n", - " <td id=\"T_56dd5_row9_col2\" class=\"data row9 col2\" >Identifies highly correlated feature pairs in a dataset suggesting feature redundancy or multicollinearity....</td>\n", - " <td id=\"T_56dd5_row9_col3\" class=\"data row9 col3\" >False</td>\n", - " <td id=\"T_56dd5_row9_col4\" class=\"data row9 col4\" >True</td>\n", - " <td id=\"T_56dd5_row9_col5\" class=\"data row9 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row9_col6\" class=\"data row9 col6\" >{'max_threshold': {'type': 'float', 'default': 0.3}, 'top_n_correlations': {'type': 'int', 'default': 10}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row9_col7\" class=\"data row9 col7\" >['tabular_data', 'data_quality', 'correlation']</td>\n", - " <td id=\"T_56dd5_row9_col8\" class=\"data row9 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row10_col0\" class=\"data row10 col0\" >validmind.data_validation.IQROutliersBarPlot</td>\n", - " <td id=\"T_56dd5_row10_col1\" class=\"data row10 col1\" >IQR Outliers Bar Plot</td>\n", - " <td id=\"T_56dd5_row10_col2\" class=\"data row10 col2\" >Visualizes outlier distribution across percentiles in numerical data using the Interquartile Range (IQR) method....</td>\n", - " <td id=\"T_56dd5_row10_col3\" class=\"data row10 col3\" >True</td>\n", - " <td id=\"T_56dd5_row10_col4\" class=\"data row10 col4\" >False</td>\n", - " <td id=\"T_56dd5_row10_col5\" class=\"data row10 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row10_col6\" class=\"data row10 col6\" >{'threshold': {'type': 'float', 
'default': 1.5}, 'fig_width': {'type': 'int', 'default': 800}}</td>\n", - " <td id=\"T_56dd5_row10_col7\" class=\"data row10 col7\" >['tabular_data', 'visualization', 'numerical_data']</td>\n", - " <td id=\"T_56dd5_row10_col8\" class=\"data row10 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row11_col0\" class=\"data row11 col0\" >validmind.data_validation.IQROutliersTable</td>\n", - " <td id=\"T_56dd5_row11_col1\" class=\"data row11 col1\" >IQR Outliers Table</td>\n", - " <td id=\"T_56dd5_row11_col2\" class=\"data row11 col2\" >Determines and summarizes outliers in numerical features using the Interquartile Range method....</td>\n", - " <td id=\"T_56dd5_row11_col3\" class=\"data row11 col3\" >False</td>\n", - " <td id=\"T_56dd5_row11_col4\" class=\"data row11 col4\" >True</td>\n", - " <td id=\"T_56dd5_row11_col5\" class=\"data row11 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row11_col6\" class=\"data row11 col6\" >{'threshold': {'type': 'float', 'default': 1.5}}</td>\n", - " <td id=\"T_56dd5_row11_col7\" class=\"data row11 col7\" >['tabular_data', 'numerical_data']</td>\n", - " <td id=\"T_56dd5_row11_col8\" class=\"data row11 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row12_col0\" class=\"data row12 col0\" >validmind.data_validation.IsolationForestOutliers</td>\n", - " <td id=\"T_56dd5_row12_col1\" class=\"data row12 col1\" >Isolation Forest Outliers</td>\n", - " <td id=\"T_56dd5_row12_col2\" class=\"data row12 col2\" >Detects outliers in a dataset using the Isolation Forest algorithm and visualizes results through scatter plots....</td>\n", - " <td id=\"T_56dd5_row12_col3\" class=\"data row12 col3\" >True</td>\n", - " <td id=\"T_56dd5_row12_col4\" class=\"data row12 col4\" >False</td>\n", - " <td id=\"T_56dd5_row12_col5\" class=\"data row12 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row12_col6\" class=\"data row12 col6\" >{'random_state': 
{'type': 'int', 'default': 0}, 'contamination': {'type': 'float', 'default': 0.1}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row12_col7\" class=\"data row12 col7\" >['tabular_data', 'anomaly_detection']</td>\n", - " <td id=\"T_56dd5_row12_col8\" class=\"data row12 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row13_col0\" class=\"data row13 col0\" >validmind.data_validation.JarqueBera</td>\n", - " <td id=\"T_56dd5_row13_col1\" class=\"data row13 col1\" >Jarque Bera</td>\n", - " <td id=\"T_56dd5_row13_col2\" class=\"data row13 col2\" >Assesses normality of dataset features in an ML model using the Jarque-Bera test....</td>\n", - " <td id=\"T_56dd5_row13_col3\" class=\"data row13 col3\" >False</td>\n", - " <td id=\"T_56dd5_row13_col4\" class=\"data row13 col4\" >True</td>\n", - " <td id=\"T_56dd5_row13_col5\" class=\"data row13 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row13_col6\" class=\"data row13 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row13_col7\" class=\"data row13 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_56dd5_row13_col8\" class=\"data row13 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row14_col0\" class=\"data row14 col0\" >validmind.data_validation.MissingValues</td>\n", - " <td id=\"T_56dd5_row14_col1\" class=\"data row14 col1\" >Missing Values</td>\n", - " <td id=\"T_56dd5_row14_col2\" class=\"data row14 col2\" >Evaluates dataset quality by ensuring missing value ratio across all features does not exceed a set threshold....</td>\n", - " <td id=\"T_56dd5_row14_col3\" class=\"data row14 col3\" >False</td>\n", - " <td id=\"T_56dd5_row14_col4\" class=\"data row14 col4\" >True</td>\n", - " <td id=\"T_56dd5_row14_col5\" class=\"data row14 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row14_col6\" class=\"data row14 col6\" >{'min_threshold': {'type': 'int', 
'default': 1}}</td>\n", - " <td id=\"T_56dd5_row14_col7\" class=\"data row14 col7\" >['tabular_data', 'data_quality']</td>\n", - " <td id=\"T_56dd5_row14_col8\" class=\"data row14 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row15_col0\" class=\"data row15 col0\" >validmind.data_validation.MissingValuesBarPlot</td>\n", - " <td id=\"T_56dd5_row15_col1\" class=\"data row15 col1\" >Missing Values Bar Plot</td>\n", - " <td id=\"T_56dd5_row15_col2\" class=\"data row15 col2\" >Assesses the percentage and distribution of missing values in the dataset via a bar plot, with emphasis on...</td>\n", - " <td id=\"T_56dd5_row15_col3\" class=\"data row15 col3\" >True</td>\n", - " <td id=\"T_56dd5_row15_col4\" class=\"data row15 col4\" >False</td>\n", - " <td id=\"T_56dd5_row15_col5\" class=\"data row15 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row15_col6\" class=\"data row15 col6\" >{'threshold': {'type': 'int', 'default': 80}, 'fig_height': {'type': 'int', 'default': 600}}</td>\n", - " <td id=\"T_56dd5_row15_col7\" class=\"data row15 col7\" >['tabular_data', 'data_quality', 'visualization']</td>\n", - " <td id=\"T_56dd5_row15_col8\" class=\"data row15 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row16_col0\" class=\"data row16 col0\" >validmind.data_validation.MutualInformation</td>\n", - " <td id=\"T_56dd5_row16_col1\" class=\"data row16 col1\" >Mutual Information</td>\n", - " <td id=\"T_56dd5_row16_col2\" class=\"data row16 col2\" >Calculates mutual information scores between features and target variable to evaluate feature relevance....</td>\n", - " <td id=\"T_56dd5_row16_col3\" class=\"data row16 col3\" >True</td>\n", - " <td id=\"T_56dd5_row16_col4\" class=\"data row16 col4\" >False</td>\n", - " <td id=\"T_56dd5_row16_col5\" class=\"data row16 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row16_col6\" class=\"data row16 col6\" >{'min_threshold': {'type': 
'float', 'default': 0.01}, 'task': {'type': 'str', 'default': 'classification'}}</td>\n", - " <td id=\"T_56dd5_row16_col7\" class=\"data row16 col7\" >['feature_selection', 'data_analysis']</td>\n", - " <td id=\"T_56dd5_row16_col8\" class=\"data row16 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row17_col0\" class=\"data row17 col0\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", - " <td id=\"T_56dd5_row17_col1\" class=\"data row17 col1\" >Pearson Correlation Matrix</td>\n", - " <td id=\"T_56dd5_row17_col2\" class=\"data row17 col2\" >Evaluates linear dependency between numerical variables in a dataset via a Pearson Correlation coefficient heat map....</td>\n", - " <td id=\"T_56dd5_row17_col3\" class=\"data row17 col3\" >True</td>\n", - " <td id=\"T_56dd5_row17_col4\" class=\"data row17 col4\" >False</td>\n", - " <td id=\"T_56dd5_row17_col5\" class=\"data row17 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row17_col6\" class=\"data row17 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row17_col7\" class=\"data row17 col7\" >['tabular_data', 'numerical_data', 'correlation']</td>\n", - " <td id=\"T_56dd5_row17_col8\" class=\"data row17 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row18_col0\" class=\"data row18 col0\" >validmind.data_validation.ProtectedClassesDescription</td>\n", - " <td id=\"T_56dd5_row18_col1\" class=\"data row18 col1\" >Protected Classes Description</td>\n", - " <td id=\"T_56dd5_row18_col2\" class=\"data row18 col2\" >Visualizes the distribution of protected classes in the dataset relative to the target variable...</td>\n", - " <td id=\"T_56dd5_row18_col3\" class=\"data row18 col3\" >True</td>\n", - " <td id=\"T_56dd5_row18_col4\" class=\"data row18 col4\" >True</td>\n", - " <td id=\"T_56dd5_row18_col5\" class=\"data row18 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row18_col6\" class=\"data row18 col6\" 
>{'protected_classes': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row18_col7\" class=\"data row18 col7\" >['bias_and_fairness', 'descriptive_statistics']</td>\n", - " <td id=\"T_56dd5_row18_col8\" class=\"data row18 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row19_col0\" class=\"data row19 col0\" >validmind.data_validation.RunsTest</td>\n", - " <td id=\"T_56dd5_row19_col1\" class=\"data row19 col1\" >Runs Test</td>\n", - " <td id=\"T_56dd5_row19_col2\" class=\"data row19 col2\" >Executes Runs Test on ML model to detect non-random patterns in output data sequence....</td>\n", - " <td id=\"T_56dd5_row19_col3\" class=\"data row19 col3\" >False</td>\n", - " <td id=\"T_56dd5_row19_col4\" class=\"data row19 col4\" >True</td>\n", - " <td id=\"T_56dd5_row19_col5\" class=\"data row19 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row19_col6\" class=\"data row19 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row19_col7\" class=\"data row19 col7\" >['tabular_data', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_56dd5_row19_col8\" class=\"data row19 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row20_col0\" class=\"data row20 col0\" >validmind.data_validation.ScatterPlot</td>\n", - " <td id=\"T_56dd5_row20_col1\" class=\"data row20 col1\" >Scatter Plot</td>\n", - " <td id=\"T_56dd5_row20_col2\" class=\"data row20 col2\" >Assesses visual relationships, patterns, and outliers among features in a dataset through scatter plot matrices....</td>\n", - " <td id=\"T_56dd5_row20_col3\" class=\"data row20 col3\" >True</td>\n", - " <td id=\"T_56dd5_row20_col4\" class=\"data row20 col4\" >False</td>\n", - " <td id=\"T_56dd5_row20_col5\" class=\"data row20 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row20_col6\" class=\"data row20 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row20_col7\" class=\"data row20 col7\" >['tabular_data', 
'visualization']</td>\n", - " <td id=\"T_56dd5_row20_col8\" class=\"data row20 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row21_col0\" class=\"data row21 col0\" >validmind.data_validation.ScoreBandDefaultRates</td>\n", - " <td id=\"T_56dd5_row21_col1\" class=\"data row21 col1\" >Score Band Default Rates</td>\n", - " <td id=\"T_56dd5_row21_col2\" class=\"data row21 col2\" >Analyzes default rates and population distribution across credit score bands....</td>\n", - " <td id=\"T_56dd5_row21_col3\" class=\"data row21 col3\" >False</td>\n", - " <td id=\"T_56dd5_row21_col4\" class=\"data row21 col4\" >True</td>\n", - " <td id=\"T_56dd5_row21_col5\" class=\"data row21 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row21_col6\" class=\"data row21 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row21_col7\" class=\"data row21 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", - " <td id=\"T_56dd5_row21_col8\" class=\"data row21 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row22_col0\" class=\"data row22 col0\" >validmind.data_validation.ShapiroWilk</td>\n", - " <td id=\"T_56dd5_row22_col1\" class=\"data row22 col1\" >Shapiro Wilk</td>\n", - " <td id=\"T_56dd5_row22_col2\" class=\"data row22 col2\" >Evaluates feature-wise normality of training data using the Shapiro-Wilk test....</td>\n", - " <td id=\"T_56dd5_row22_col3\" class=\"data row22 col3\" >False</td>\n", - " <td id=\"T_56dd5_row22_col4\" class=\"data row22 col4\" >True</td>\n", - " <td id=\"T_56dd5_row22_col5\" class=\"data row22 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row22_col6\" class=\"data row22 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row22_col7\" class=\"data row22 col7\" >['tabular_data', 'data_distribution', 'statistical_test']</td>\n", - " <td id=\"T_56dd5_row22_col8\" class=\"data row22 
col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row23_col0\" class=\"data row23 col0\" >validmind.data_validation.Skewness</td>\n", - " <td id=\"T_56dd5_row23_col1\" class=\"data row23 col1\" >Skewness</td>\n", - " <td id=\"T_56dd5_row23_col2\" class=\"data row23 col2\" >Evaluates the skewness of numerical data in a dataset to check against a defined threshold, aiming to ensure data...</td>\n", - " <td id=\"T_56dd5_row23_col3\" class=\"data row23 col3\" >False</td>\n", - " <td id=\"T_56dd5_row23_col4\" class=\"data row23 col4\" >True</td>\n", - " <td id=\"T_56dd5_row23_col5\" class=\"data row23 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row23_col6\" class=\"data row23 col6\" >{'max_threshold': {'type': '_empty', 'default': 1}}</td>\n", - " <td id=\"T_56dd5_row23_col7\" class=\"data row23 col7\" >['data_quality', 'tabular_data']</td>\n", - " <td id=\"T_56dd5_row23_col8\" class=\"data row23 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row24_col0\" class=\"data row24 col0\" >validmind.data_validation.TabularCategoricalBarPlots</td>\n", - " <td id=\"T_56dd5_row24_col1\" class=\"data row24 col1\" >Tabular Categorical Bar Plots</td>\n", - " <td id=\"T_56dd5_row24_col2\" class=\"data row24 col2\" >Generates and visualizes bar plots for each category in categorical features to evaluate the dataset's composition....</td>\n", - " <td id=\"T_56dd5_row24_col3\" class=\"data row24 col3\" >True</td>\n", - " <td id=\"T_56dd5_row24_col4\" class=\"data row24 col4\" >False</td>\n", - " <td id=\"T_56dd5_row24_col5\" class=\"data row24 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row24_col6\" class=\"data row24 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row24_col7\" class=\"data row24 col7\" >['tabular_data', 'visualization']</td>\n", - " <td id=\"T_56dd5_row24_col8\" class=\"data row24 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td 
id=\"T_56dd5_row25_col0\" class=\"data row25 col0\" >validmind.data_validation.TabularDateTimeHistograms</td>\n", - " <td id=\"T_56dd5_row25_col1\" class=\"data row25 col1\" >Tabular Date Time Histograms</td>\n", - " <td id=\"T_56dd5_row25_col2\" class=\"data row25 col2\" >Generates histograms to provide graphical insight into the distribution of time intervals in a model's datetime...</td>\n", - " <td id=\"T_56dd5_row25_col3\" class=\"data row25 col3\" >True</td>\n", - " <td id=\"T_56dd5_row25_col4\" class=\"data row25 col4\" >False</td>\n", - " <td id=\"T_56dd5_row25_col5\" class=\"data row25 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row25_col6\" class=\"data row25 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row25_col7\" class=\"data row25 col7\" >['time_series_data', 'visualization']</td>\n", - " <td id=\"T_56dd5_row25_col8\" class=\"data row25 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row26_col0\" class=\"data row26 col0\" >validmind.data_validation.TabularDescriptionTables</td>\n", - " <td id=\"T_56dd5_row26_col1\" class=\"data row26 col1\" >Tabular Description Tables</td>\n", - " <td id=\"T_56dd5_row26_col2\" class=\"data row26 col2\" >Summarizes key descriptive statistics for numerical, categorical, and datetime variables in a dataset....</td>\n", - " <td id=\"T_56dd5_row26_col3\" class=\"data row26 col3\" >False</td>\n", - " <td id=\"T_56dd5_row26_col4\" class=\"data row26 col4\" >True</td>\n", - " <td id=\"T_56dd5_row26_col5\" class=\"data row26 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row26_col6\" class=\"data row26 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row26_col7\" class=\"data row26 col7\" >['tabular_data']</td>\n", - " <td id=\"T_56dd5_row26_col8\" class=\"data row26 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row27_col0\" class=\"data row27 col0\" >validmind.data_validation.TabularNumericalHistograms</td>\n", - " <td 
id=\"T_56dd5_row27_col1\" class=\"data row27 col1\" >Tabular Numerical Histograms</td>\n", - " <td id=\"T_56dd5_row27_col2\" class=\"data row27 col2\" >Generates histograms for each numerical feature in a dataset to provide visual insights into data distribution and...</td>\n", - " <td id=\"T_56dd5_row27_col3\" class=\"data row27 col3\" >True</td>\n", - " <td id=\"T_56dd5_row27_col4\" class=\"data row27 col4\" >False</td>\n", - " <td id=\"T_56dd5_row27_col5\" class=\"data row27 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row27_col6\" class=\"data row27 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row27_col7\" class=\"data row27 col7\" >['tabular_data', 'visualization']</td>\n", - " <td id=\"T_56dd5_row27_col8\" class=\"data row27 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row28_col0\" class=\"data row28 col0\" >validmind.data_validation.TargetRateBarPlots</td>\n", - " <td id=\"T_56dd5_row28_col1\" class=\"data row28 col1\" >Target Rate Bar Plots</td>\n", - " <td id=\"T_56dd5_row28_col2\" class=\"data row28 col2\" >Generates bar plots visualizing the default rates of categorical features for a classification machine learning...</td>\n", - " <td id=\"T_56dd5_row28_col3\" class=\"data row28 col3\" >True</td>\n", - " <td id=\"T_56dd5_row28_col4\" class=\"data row28 col4\" >False</td>\n", - " <td id=\"T_56dd5_row28_col5\" class=\"data row28 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row28_col6\" class=\"data row28 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row28_col7\" class=\"data row28 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", - " <td id=\"T_56dd5_row28_col8\" class=\"data row28 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row29_col0\" class=\"data row29 col0\" >validmind.data_validation.TooManyZeroValues</td>\n", - " <td id=\"T_56dd5_row29_col1\" class=\"data row29 col1\" >Too Many Zero Values</td>\n", - " <td id=\"T_56dd5_row29_col2\" 
class=\"data row29 col2\" >Identifies numerical columns in a dataset that contain an excessive number of zero values, defined by a threshold...</td>\n", - " <td id=\"T_56dd5_row29_col3\" class=\"data row29 col3\" >False</td>\n", - " <td id=\"T_56dd5_row29_col4\" class=\"data row29 col4\" >True</td>\n", - " <td id=\"T_56dd5_row29_col5\" class=\"data row29 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row29_col6\" class=\"data row29 col6\" >{'max_percent_threshold': {'type': 'float', 'default': 0.03}}</td>\n", - " <td id=\"T_56dd5_row29_col7\" class=\"data row29 col7\" >['tabular_data']</td>\n", - " <td id=\"T_56dd5_row29_col8\" class=\"data row29 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row30_col0\" class=\"data row30 col0\" >validmind.data_validation.UniqueRows</td>\n", - " <td id=\"T_56dd5_row30_col1\" class=\"data row30 col1\" >Unique Rows</td>\n", - " <td id=\"T_56dd5_row30_col2\" class=\"data row30 col2\" >Verifies the diversity of the dataset by ensuring that the count of unique rows exceeds a prescribed threshold....</td>\n", - " <td id=\"T_56dd5_row30_col3\" class=\"data row30 col3\" >False</td>\n", - " <td id=\"T_56dd5_row30_col4\" class=\"data row30 col4\" >True</td>\n", - " <td id=\"T_56dd5_row30_col5\" class=\"data row30 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row30_col6\" class=\"data row30 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 1}}</td>\n", - " <td id=\"T_56dd5_row30_col7\" class=\"data row30 col7\" >['tabular_data']</td>\n", - " <td id=\"T_56dd5_row30_col8\" class=\"data row30 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row31_col0\" class=\"data row31 col0\" >validmind.data_validation.WOEBinPlots</td>\n", - " <td id=\"T_56dd5_row31_col1\" class=\"data row31 col1\" >WOE Bin Plots</td>\n", - " <td id=\"T_56dd5_row31_col2\" class=\"data row31 col2\" >Generates visualizations of Weight of Evidence (WoE) and 
Information Value (IV) for understanding predictive power...</td>\n", - " <td id=\"T_56dd5_row31_col3\" class=\"data row31 col3\" >True</td>\n", - " <td id=\"T_56dd5_row31_col4\" class=\"data row31 col4\" >False</td>\n", - " <td id=\"T_56dd5_row31_col5\" class=\"data row31 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row31_col6\" class=\"data row31 col6\" >{'breaks_adj': {'type': 'list', 'default': None}, 'fig_height': {'type': 'int', 'default': 600}, 'fig_width': {'type': 'int', 'default': 500}}</td>\n", - " <td id=\"T_56dd5_row31_col7\" class=\"data row31 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", - " <td id=\"T_56dd5_row31_col8\" class=\"data row31 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row32_col0\" class=\"data row32 col0\" >validmind.data_validation.WOEBinTable</td>\n", - " <td id=\"T_56dd5_row32_col1\" class=\"data row32 col1\" >WOE Bin Table</td>\n", - " <td id=\"T_56dd5_row32_col2\" class=\"data row32 col2\" >Assesses the Weight of Evidence (WoE) and Information Value (IV) of each feature to evaluate its predictive power...</td>\n", - " <td id=\"T_56dd5_row32_col3\" class=\"data row32 col3\" >False</td>\n", - " <td id=\"T_56dd5_row32_col4\" class=\"data row32 col4\" >True</td>\n", - " <td id=\"T_56dd5_row32_col5\" class=\"data row32 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row32_col6\" class=\"data row32 col6\" >{'breaks_adj': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row32_col7\" class=\"data row32 col7\" >['tabular_data', 'categorical_data']</td>\n", - " <td id=\"T_56dd5_row32_col8\" class=\"data row32 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row33_col0\" class=\"data row33 col0\" >validmind.model_validation.FeaturesAUC</td>\n", - " <td id=\"T_56dd5_row33_col1\" class=\"data row33 col1\" >Features AUC</td>\n", - " <td id=\"T_56dd5_row33_col2\" class=\"data row33 col2\" >Evaluates the discriminatory power 
of each individual feature within a binary classification model by calculating...</td>\n", - " <td id=\"T_56dd5_row33_col3\" class=\"data row33 col3\" >True</td>\n", - " <td id=\"T_56dd5_row33_col4\" class=\"data row33 col4\" >False</td>\n", - " <td id=\"T_56dd5_row33_col5\" class=\"data row33 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row33_col6\" class=\"data row33 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", - " <td id=\"T_56dd5_row33_col7\" class=\"data row33 col7\" >['feature_importance', 'AUC', 'visualization']</td>\n", - " <td id=\"T_56dd5_row33_col8\" class=\"data row33 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row34_col0\" class=\"data row34 col0\" >validmind.model_validation.sklearn.CalibrationCurve</td>\n", - " <td id=\"T_56dd5_row34_col1\" class=\"data row34 col1\" >Calibration Curve</td>\n", - " <td id=\"T_56dd5_row34_col2\" class=\"data row34 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", - " <td id=\"T_56dd5_row34_col3\" class=\"data row34 col3\" >True</td>\n", - " <td id=\"T_56dd5_row34_col4\" class=\"data row34 col4\" >False</td>\n", - " <td id=\"T_56dd5_row34_col5\" class=\"data row34 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row34_col6\" class=\"data row34 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_56dd5_row34_col7\" class=\"data row34 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", - " <td id=\"T_56dd5_row34_col8\" class=\"data row34 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row35_col0\" class=\"data row35 col0\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", - " <td id=\"T_56dd5_row35_col1\" class=\"data row35 col1\" >Classifier Performance</td>\n", - " <td id=\"T_56dd5_row35_col2\" class=\"data row35 col2\" >Evaluates 
performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", - " <td id=\"T_56dd5_row35_col3\" class=\"data row35 col3\" >False</td>\n", - " <td id=\"T_56dd5_row35_col4\" class=\"data row35 col4\" >True</td>\n", - " <td id=\"T_56dd5_row35_col5\" class=\"data row35 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row35_col6\" class=\"data row35 col6\" >{'average': {'type': 'str', 'default': 'macro'}}</td>\n", - " <td id=\"T_56dd5_row35_col7\" class=\"data row35 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row35_col8\" class=\"data row35 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row36_col0\" class=\"data row36 col0\" >validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", - " <td id=\"T_56dd5_row36_col1\" class=\"data row36 col1\" >Classifier Threshold Optimization</td>\n", - " <td id=\"T_56dd5_row36_col2\" class=\"data row36 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification models....</td>\n", - " <td id=\"T_56dd5_row36_col3\" class=\"data row36 col3\" >False</td>\n", - " <td id=\"T_56dd5_row36_col4\" class=\"data row36 col4\" >True</td>\n", - " <td id=\"T_56dd5_row36_col5\" class=\"data row36 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row36_col6\" class=\"data row36 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_56dd5_row36_col7\" class=\"data row36 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", - " <td id=\"T_56dd5_row36_col8\" class=\"data row36 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row37_col0\" class=\"data row37 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td 
id=\"T_56dd5_row37_col1\" class=\"data row37 col1\" >Confusion Matrix</td>\n", - " <td id=\"T_56dd5_row37_col2\" class=\"data row37 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", - " <td id=\"T_56dd5_row37_col3\" class=\"data row37 col3\" >True</td>\n", - " <td id=\"T_56dd5_row37_col4\" class=\"data row37 col4\" >False</td>\n", - " <td id=\"T_56dd5_row37_col5\" class=\"data row37 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row37_col6\" class=\"data row37 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_56dd5_row37_col7\" class=\"data row37 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row37_col8\" class=\"data row37 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row38_col0\" class=\"data row38 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", - " <td id=\"T_56dd5_row38_col1\" class=\"data row38 col1\" >Hyper Parameters Tuning</td>\n", - " <td id=\"T_56dd5_row38_col2\" class=\"data row38 col2\" >Performs exhaustive grid search over specified parameter ranges to find optimal model configurations...</td>\n", - " <td id=\"T_56dd5_row38_col3\" class=\"data row38 col3\" >False</td>\n", - " <td id=\"T_56dd5_row38_col4\" class=\"data row38 col4\" >True</td>\n", - " <td id=\"T_56dd5_row38_col5\" class=\"data row38 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row38_col6\" class=\"data row38 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row38_col7\" class=\"data row38 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row38_col8\" class=\"data row38 col8\" 
>['clustering', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row39_col0\" class=\"data row39 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", - " <td id=\"T_56dd5_row39_col1\" class=\"data row39 col1\" >Minimum Accuracy</td>\n", - " <td id=\"T_56dd5_row39_col2\" class=\"data row39 col2\" >Checks if the model's prediction accuracy meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_56dd5_row39_col3\" class=\"data row39 col3\" >False</td>\n", - " <td id=\"T_56dd5_row39_col4\" class=\"data row39 col4\" >True</td>\n", - " <td id=\"T_56dd5_row39_col5\" class=\"data row39 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row39_col6\" class=\"data row39 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_56dd5_row39_col7\" class=\"data row39 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row39_col8\" class=\"data row39 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row40_col0\" class=\"data row40 col0\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", - " <td id=\"T_56dd5_row40_col1\" class=\"data row40 col1\" >Minimum F1 Score</td>\n", - " <td id=\"T_56dd5_row40_col2\" class=\"data row40 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", - " <td id=\"T_56dd5_row40_col3\" class=\"data row40 col3\" >False</td>\n", - " <td id=\"T_56dd5_row40_col4\" class=\"data row40 col4\" >True</td>\n", - " <td id=\"T_56dd5_row40_col5\" class=\"data row40 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row40_col6\" class=\"data row40 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_56dd5_row40_col7\" class=\"data row40 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 
'model_performance']</td>\n", - " <td id=\"T_56dd5_row40_col8\" class=\"data row40 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row41_col0\" class=\"data row41 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", - " <td id=\"T_56dd5_row41_col1\" class=\"data row41 col1\" >Minimum ROCAUC Score</td>\n", - " <td id=\"T_56dd5_row41_col2\" class=\"data row41 col2\" >Validates model by checking if the ROC AUC score meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_56dd5_row41_col3\" class=\"data row41 col3\" >False</td>\n", - " <td id=\"T_56dd5_row41_col4\" class=\"data row41 col4\" >True</td>\n", - " <td id=\"T_56dd5_row41_col5\" class=\"data row41 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row41_col6\" class=\"data row41 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_56dd5_row41_col7\" class=\"data row41 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row41_col8\" class=\"data row41 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row42_col0\" class=\"data row42 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", - " <td id=\"T_56dd5_row42_col1\" class=\"data row42 col1\" >Model Parameters</td>\n", - " <td id=\"T_56dd5_row42_col2\" class=\"data row42 col2\" >Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", - " <td id=\"T_56dd5_row42_col3\" class=\"data row42 col3\" >False</td>\n", - " <td id=\"T_56dd5_row42_col4\" class=\"data row42 col4\" >True</td>\n", - " <td id=\"T_56dd5_row42_col5\" class=\"data row42 col5\" >['model']</td>\n", - " <td id=\"T_56dd5_row42_col6\" class=\"data row42 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_56dd5_row42_col7\" class=\"data row42 
col7\" >['model_training', 'metadata']</td>\n", - " <td id=\"T_56dd5_row42_col8\" class=\"data row42 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row43_col0\" class=\"data row43 col0\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", - " <td id=\"T_56dd5_row43_col1\" class=\"data row43 col1\" >Models Performance Comparison</td>\n", - " <td id=\"T_56dd5_row43_col2\" class=\"data row43 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", - " <td id=\"T_56dd5_row43_col3\" class=\"data row43 col3\" >False</td>\n", - " <td id=\"T_56dd5_row43_col4\" class=\"data row43 col4\" >True</td>\n", - " <td id=\"T_56dd5_row43_col5\" class=\"data row43 col5\" >['dataset', 'models']</td>\n", - " <td id=\"T_56dd5_row43_col6\" class=\"data row43 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row43_col7\" class=\"data row43 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", - " <td id=\"T_56dd5_row43_col8\" class=\"data row43 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row44_col0\" class=\"data row44 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", - " <td id=\"T_56dd5_row44_col1\" class=\"data row44 col1\" >Overfit Diagnosis</td>\n", - " <td id=\"T_56dd5_row44_col2\" class=\"data row44 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", - " <td id=\"T_56dd5_row44_col3\" class=\"data row44 col3\" >True</td>\n", - " <td id=\"T_56dd5_row44_col4\" class=\"data row44 col4\" >True</td>\n", - " <td id=\"T_56dd5_row44_col5\" class=\"data row44 col5\" >['model', 'datasets']</td>\n", - " <td id=\"T_56dd5_row44_col6\" class=\"data row44 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': 
{'type': 'float', 'default': 0.04}}</td>\n", - " <td id=\"T_56dd5_row44_col7\" class=\"data row44 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", - " <td id=\"T_56dd5_row44_col8\" class=\"data row44 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row45_col0\" class=\"data row45 col0\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", - " <td id=\"T_56dd5_row45_col1\" class=\"data row45 col1\" >Permutation Feature Importance</td>\n", - " <td id=\"T_56dd5_row45_col2\" class=\"data row45 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", - " <td id=\"T_56dd5_row45_col3\" class=\"data row45 col3\" >True</td>\n", - " <td id=\"T_56dd5_row45_col4\" class=\"data row45 col4\" >False</td>\n", - " <td id=\"T_56dd5_row45_col5\" class=\"data row45 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row45_col6\" class=\"data row45 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_56dd5_row45_col7\" class=\"data row45 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row45_col8\" class=\"data row45 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row46_col0\" class=\"data row46 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", - " <td id=\"T_56dd5_row46_col1\" class=\"data row46 col1\" >Population Stability Index</td>\n", - " <td id=\"T_56dd5_row46_col2\" class=\"data row46 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's predictions across...</td>\n", - " <td id=\"T_56dd5_row46_col3\" class=\"data row46 col3\" >True</td>\n", - " <td 
id=\"T_56dd5_row46_col4\" class=\"data row46 col4\" >True</td>\n", - " <td id=\"T_56dd5_row46_col5\" class=\"data row46 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row46_col6\" class=\"data row46 col6\" >{'num_bins': {'type': 'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", - " <td id=\"T_56dd5_row46_col7\" class=\"data row46 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row46_col8\" class=\"data row46 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row47_col0\" class=\"data row47 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_56dd5_row47_col1\" class=\"data row47 col1\" >Precision Recall Curve</td>\n", - " <td id=\"T_56dd5_row47_col2\" class=\"data row47 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", - " <td id=\"T_56dd5_row47_col3\" class=\"data row47 col3\" >True</td>\n", - " <td id=\"T_56dd5_row47_col4\" class=\"data row47 col4\" >False</td>\n", - " <td id=\"T_56dd5_row47_col5\" class=\"data row47 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row47_col6\" class=\"data row47 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row47_col7\" class=\"data row47 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row47_col8\" class=\"data row47 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row48_col0\" class=\"data row48 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_56dd5_row48_col1\" class=\"data row48 col1\" >ROC Curve</td>\n", - " <td id=\"T_56dd5_row48_col2\" class=\"data row48 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating 
Characteristic...</td>\n", - " <td id=\"T_56dd5_row48_col3\" class=\"data row48 col3\" >True</td>\n", - " <td id=\"T_56dd5_row48_col4\" class=\"data row48 col4\" >False</td>\n", - " <td id=\"T_56dd5_row48_col5\" class=\"data row48 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row48_col6\" class=\"data row48 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row48_col7\" class=\"data row48 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row48_col8\" class=\"data row48 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row49_col0\" class=\"data row49 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", - " <td id=\"T_56dd5_row49_col1\" class=\"data row49 col1\" >Regression Errors</td>\n", - " <td id=\"T_56dd5_row49_col2\" class=\"data row49 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", - " <td id=\"T_56dd5_row49_col3\" class=\"data row49 col3\" >False</td>\n", - " <td id=\"T_56dd5_row49_col4\" class=\"data row49 col4\" >True</td>\n", - " <td id=\"T_56dd5_row49_col5\" class=\"data row49 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row49_col6\" class=\"data row49 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row49_col7\" class=\"data row49 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row49_col8\" class=\"data row49 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row50_col0\" class=\"data row50 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " <td id=\"T_56dd5_row50_col1\" class=\"data row50 col1\" >Robustness Diagnosis</td>\n", - " <td id=\"T_56dd5_row50_col2\" class=\"data row50 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", - " <td 
id=\"T_56dd5_row50_col3\" class=\"data row50 col3\" >True</td>\n", - " <td id=\"T_56dd5_row50_col4\" class=\"data row50 col4\" >True</td>\n", - " <td id=\"T_56dd5_row50_col5\" class=\"data row50 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row50_col6\" class=\"data row50 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_56dd5_row50_col7\" class=\"data row50 col7\" >['sklearn', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_56dd5_row50_col8\" class=\"data row50 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row51_col0\" class=\"data row51 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", - " <td id=\"T_56dd5_row51_col1\" class=\"data row51 col1\" >SHAP Global Importance</td>\n", - " <td id=\"T_56dd5_row51_col2\" class=\"data row51 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", - " <td id=\"T_56dd5_row51_col3\" class=\"data row51 col3\" >False</td>\n", - " <td id=\"T_56dd5_row51_col4\" class=\"data row51 col4\" >True</td>\n", - " <td id=\"T_56dd5_row51_col5\" class=\"data row51 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row51_col6\" class=\"data row51 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_56dd5_row51_col7\" class=\"data row51 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row51_col8\" class=\"data row51 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td 
id=\"T_56dd5_row52_col0\" class=\"data row52 col0\" >validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", - " <td id=\"T_56dd5_row52_col1\" class=\"data row52 col1\" >Score Probability Alignment</td>\n", - " <td id=\"T_56dd5_row52_col2\" class=\"data row52 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", - " <td id=\"T_56dd5_row52_col3\" class=\"data row52 col3\" >True</td>\n", - " <td id=\"T_56dd5_row52_col4\" class=\"data row52 col4\" >True</td>\n", - " <td id=\"T_56dd5_row52_col5\" class=\"data row52 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row52_col6\" class=\"data row52 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_56dd5_row52_col7\" class=\"data row52 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", - " <td id=\"T_56dd5_row52_col8\" class=\"data row52 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row53_col0\" class=\"data row53 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_56dd5_row53_col1\" class=\"data row53 col1\" >Training Test Degradation</td>\n", - " <td id=\"T_56dd5_row53_col2\" class=\"data row53 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", - " <td id=\"T_56dd5_row53_col3\" class=\"data row53 col3\" >False</td>\n", - " <td id=\"T_56dd5_row53_col4\" class=\"data row53 col4\" >True</td>\n", - " <td id=\"T_56dd5_row53_col5\" class=\"data row53 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row53_col6\" class=\"data row53 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_56dd5_row53_col7\" class=\"data row53 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row53_col8\" class=\"data 
row53 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row54_col0\" class=\"data row54 col0\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", - " <td id=\"T_56dd5_row54_col1\" class=\"data row54 col1\" >Weakspots Diagnosis</td>\n", - " <td id=\"T_56dd5_row54_col2\" class=\"data row54 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", - " <td id=\"T_56dd5_row54_col3\" class=\"data row54 col3\" >True</td>\n", - " <td id=\"T_56dd5_row54_col4\" class=\"data row54 col4\" >True</td>\n", - " <td id=\"T_56dd5_row54_col5\" class=\"data row54 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row54_col6\" class=\"data row54 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_56dd5_row54_col7\" class=\"data row54 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_56dd5_row54_col8\" class=\"data row54 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row55_col0\" class=\"data row55 col0\" >validmind.model_validation.statsmodels.CumulativePredictionProbabilities</td>\n", - " <td id=\"T_56dd5_row55_col1\" class=\"data row55 col1\" >Cumulative Prediction Probabilities</td>\n", - " <td id=\"T_56dd5_row55_col2\" class=\"data row55 col2\" >Visualizes cumulative probabilities of positive and negative classes for both training and testing in classification models....</td>\n", - " <td id=\"T_56dd5_row55_col3\" class=\"data row55 col3\" >True</td>\n", - " <td id=\"T_56dd5_row55_col4\" class=\"data row55 col4\" >False</td>\n", - " <td id=\"T_56dd5_row55_col5\" class=\"data row55 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row55_col6\" class=\"data row55 
col6\" >{'title': {'type': 'str', 'default': 'Cumulative Probabilities'}}</td>\n", - " <td id=\"T_56dd5_row55_col7\" class=\"data row55 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_56dd5_row55_col8\" class=\"data row55 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row56_col0\" class=\"data row56 col0\" >validmind.model_validation.statsmodels.GINITable</td>\n", - " <td id=\"T_56dd5_row56_col1\" class=\"data row56 col1\" >GINI Table</td>\n", - " <td id=\"T_56dd5_row56_col2\" class=\"data row56 col2\" >Evaluates classification model performance using AUC, GINI, and KS metrics for training and test datasets....</td>\n", - " <td id=\"T_56dd5_row56_col3\" class=\"data row56 col3\" >False</td>\n", - " <td id=\"T_56dd5_row56_col4\" class=\"data row56 col4\" >True</td>\n", - " <td id=\"T_56dd5_row56_col5\" class=\"data row56 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row56_col6\" class=\"data row56 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row56_col7\" class=\"data row56 col7\" >['model_performance']</td>\n", - " <td id=\"T_56dd5_row56_col8\" class=\"data row56 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row57_col0\" class=\"data row57 col0\" >validmind.model_validation.statsmodels.KolmogorovSmirnov</td>\n", - " <td id=\"T_56dd5_row57_col1\" class=\"data row57 col1\" >Kolmogorov Smirnov</td>\n", - " <td id=\"T_56dd5_row57_col2\" class=\"data row57 col2\" >Assesses whether each feature in the dataset aligns with a normal distribution using the Kolmogorov-Smirnov test....</td>\n", - " <td id=\"T_56dd5_row57_col3\" class=\"data row57 col3\" >False</td>\n", - " <td id=\"T_56dd5_row57_col4\" class=\"data row57 col4\" >True</td>\n", - " <td id=\"T_56dd5_row57_col5\" class=\"data row57 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row57_col6\" class=\"data row57 col6\" >{'dist': {'type': 'str', 'default': 'norm'}}</td>\n", - " <td id=\"T_56dd5_row57_col7\" 
class=\"data row57 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_56dd5_row57_col8\" class=\"data row57 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row58_col0\" class=\"data row58 col0\" >validmind.model_validation.statsmodels.Lilliefors</td>\n", - " <td id=\"T_56dd5_row58_col1\" class=\"data row58 col1\" >Lilliefors</td>\n", - " <td id=\"T_56dd5_row58_col2\" class=\"data row58 col2\" >Assesses the normality of feature distributions in an ML model's training dataset using the Lilliefors test....</td>\n", - " <td id=\"T_56dd5_row58_col3\" class=\"data row58 col3\" >False</td>\n", - " <td id=\"T_56dd5_row58_col4\" class=\"data row58 col4\" >True</td>\n", - " <td id=\"T_56dd5_row58_col5\" class=\"data row58 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row58_col6\" class=\"data row58 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row58_col7\" class=\"data row58 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_56dd5_row58_col8\" class=\"data row58 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row59_col0\" class=\"data row59 col0\" >validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram</td>\n", - " <td id=\"T_56dd5_row59_col1\" class=\"data row59 col1\" >Prediction Probabilities Histogram</td>\n", - " <td id=\"T_56dd5_row59_col2\" class=\"data row59 col2\" >Assesses the predictive probability distribution for binary classification to evaluate model performance and...</td>\n", - " <td id=\"T_56dd5_row59_col3\" class=\"data row59 col3\" >True</td>\n", - " <td id=\"T_56dd5_row59_col4\" class=\"data row59 col4\" >False</td>\n", - " <td id=\"T_56dd5_row59_col5\" class=\"data row59 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row59_col6\" class=\"data row59 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Predictive 
Probabilities'}}</td>\n", - " <td id=\"T_56dd5_row59_col7\" class=\"data row59 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_56dd5_row59_col8\" class=\"data row59 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row60_col0\" class=\"data row60 col0\" >validmind.model_validation.statsmodels.ScorecardHistogram</td>\n", - " <td id=\"T_56dd5_row60_col1\" class=\"data row60 col1\" >Scorecard Histogram</td>\n", - " <td id=\"T_56dd5_row60_col2\" class=\"data row60 col2\" >The Scorecard Histogram test evaluates the distribution of credit scores between default and non-default instances,...</td>\n", - " <td id=\"T_56dd5_row60_col3\" class=\"data row60 col3\" >True</td>\n", - " <td id=\"T_56dd5_row60_col4\" class=\"data row60 col4\" >False</td>\n", - " <td id=\"T_56dd5_row60_col5\" class=\"data row60 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row60_col6\" class=\"data row60 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Scores'}, 'score_column': {'type': 'str', 'default': 'score'}}</td>\n", - " <td id=\"T_56dd5_row60_col7\" class=\"data row60 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", - " <td id=\"T_56dd5_row60_col8\" class=\"data row60 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row61_col0\" class=\"data row61 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", - " <td id=\"T_56dd5_row61_col1\" class=\"data row61 col1\" >Calibration Curve Drift</td>\n", - " <td id=\"T_56dd5_row61_col2\" class=\"data row61 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row61_col3\" class=\"data row61 col3\" >True</td>\n", - " <td id=\"T_56dd5_row61_col4\" class=\"data row61 col4\" >True</td>\n", - " <td id=\"T_56dd5_row61_col5\" class=\"data row61 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row61_col6\" class=\"data row61 col6\" 
>{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_56dd5_row61_col7\" class=\"data row61 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row61_col8\" class=\"data row61 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row62_col0\" class=\"data row62 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", - " <td id=\"T_56dd5_row62_col1\" class=\"data row62 col1\" >Class Discrimination Drift</td>\n", - " <td id=\"T_56dd5_row62_col2\" class=\"data row62 col2\" >Compares classification discrimination metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row62_col3\" class=\"data row62 col3\" >False</td>\n", - " <td id=\"T_56dd5_row62_col4\" class=\"data row62 col4\" >True</td>\n", - " <td id=\"T_56dd5_row62_col5\" class=\"data row62 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row62_col6\" class=\"data row62 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_56dd5_row62_col7\" class=\"data row62 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row62_col8\" class=\"data row62 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row63_col0\" class=\"data row63 col0\" >validmind.ongoing_monitoring.ClassImbalanceDrift</td>\n", - " <td id=\"T_56dd5_row63_col1\" class=\"data row63 col1\" >Class Imbalance Drift</td>\n", - " <td id=\"T_56dd5_row63_col2\" class=\"data row63 col2\" >Evaluates drift in class distribution between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row63_col3\" class=\"data row63 col3\" >True</td>\n", - " <td id=\"T_56dd5_row63_col4\" class=\"data row63 col4\" >True</td>\n", - " <td id=\"T_56dd5_row63_col5\" 
class=\"data row63 col5\" >['datasets']</td>\n", - " <td id=\"T_56dd5_row63_col6\" class=\"data row63 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 5.0}, 'title': {'type': 'str', 'default': 'Class Distribution Drift'}}</td>\n", - " <td id=\"T_56dd5_row63_col7\" class=\"data row63 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification']</td>\n", - " <td id=\"T_56dd5_row63_col8\" class=\"data row63 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row64_col0\" class=\"data row64 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", - " <td id=\"T_56dd5_row64_col1\" class=\"data row64 col1\" >Classification Accuracy Drift</td>\n", - " <td id=\"T_56dd5_row64_col2\" class=\"data row64 col2\" >Compares classification accuracy metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row64_col3\" class=\"data row64 col3\" >False</td>\n", - " <td id=\"T_56dd5_row64_col4\" class=\"data row64 col4\" >True</td>\n", - " <td id=\"T_56dd5_row64_col5\" class=\"data row64 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row64_col6\" class=\"data row64 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_56dd5_row64_col7\" class=\"data row64 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row64_col8\" class=\"data row64 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row65_col0\" class=\"data row65 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", - " <td id=\"T_56dd5_row65_col1\" class=\"data row65 col1\" >Confusion Matrix Drift</td>\n", - " <td id=\"T_56dd5_row65_col2\" class=\"data row65 col2\" >Compares confusion matrix metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row65_col3\" class=\"data row65 col3\" >False</td>\n", - " <td 
id=\"T_56dd5_row65_col4\" class=\"data row65 col4\" >True</td>\n", - " <td id=\"T_56dd5_row65_col5\" class=\"data row65 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row65_col6\" class=\"data row65 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_56dd5_row65_col7\" class=\"data row65 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row65_col8\" class=\"data row65 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row66_col0\" class=\"data row66 col0\" >validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift</td>\n", - " <td id=\"T_56dd5_row66_col1\" class=\"data row66 col1\" >Cumulative Prediction Probabilities Drift</td>\n", - " <td id=\"T_56dd5_row66_col2\" class=\"data row66 col2\" >Compares cumulative prediction probability distributions between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row66_col3\" class=\"data row66 col3\" >True</td>\n", - " <td id=\"T_56dd5_row66_col4\" class=\"data row66 col4\" >False</td>\n", - " <td id=\"T_56dd5_row66_col5\" class=\"data row66 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row66_col6\" class=\"data row66 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row66_col7\" class=\"data row66 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_56dd5_row66_col8\" class=\"data row66 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row67_col0\" class=\"data row67 col0\" >validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift</td>\n", - " <td id=\"T_56dd5_row67_col1\" class=\"data row67 col1\" >Prediction Probabilities Histogram Drift</td>\n", - " <td id=\"T_56dd5_row67_col2\" class=\"data row67 col2\" >Compares prediction probability distributions between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row67_col3\" class=\"data 
row67 col3\" >True</td>\n", - " <td id=\"T_56dd5_row67_col4\" class=\"data row67 col4\" >True</td>\n", - " <td id=\"T_56dd5_row67_col5\" class=\"data row67 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row67_col6\" class=\"data row67 col6\" >{'title': {'type': '_empty', 'default': 'Prediction Probabilities Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_56dd5_row67_col7\" class=\"data row67 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_56dd5_row67_col8\" class=\"data row67 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row68_col0\" class=\"data row68 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", - " <td id=\"T_56dd5_row68_col1\" class=\"data row68 col1\" >ROC Curve Drift</td>\n", - " <td id=\"T_56dd5_row68_col2\" class=\"data row68 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row68_col3\" class=\"data row68 col3\" >True</td>\n", - " <td id=\"T_56dd5_row68_col4\" class=\"data row68 col4\" >False</td>\n", - " <td id=\"T_56dd5_row68_col5\" class=\"data row68 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row68_col6\" class=\"data row68 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row68_col7\" class=\"data row68 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row68_col8\" class=\"data row68 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row69_col0\" class=\"data row69 col0\" >validmind.ongoing_monitoring.ScoreBandsDrift</td>\n", - " <td id=\"T_56dd5_row69_col1\" class=\"data row69 col1\" >Score Bands Drift</td>\n", - " <td id=\"T_56dd5_row69_col2\" class=\"data row69 col2\" >Analyzes drift in population distribution and default rates across score bands....</td>\n", - " <td id=\"T_56dd5_row69_col3\" class=\"data row69 col3\" >False</td>\n", - " 
<td id=\"T_56dd5_row69_col4\" class=\"data row69 col4\" >True</td>\n", - " <td id=\"T_56dd5_row69_col5\" class=\"data row69 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row69_col6\" class=\"data row69 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}, 'drift_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_56dd5_row69_col7\" class=\"data row69 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", - " <td id=\"T_56dd5_row69_col8\" class=\"data row69 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row70_col0\" class=\"data row70 col0\" >validmind.ongoing_monitoring.ScorecardHistogramDrift</td>\n", - " <td id=\"T_56dd5_row70_col1\" class=\"data row70 col1\" >Scorecard Histogram Drift</td>\n", - " <td id=\"T_56dd5_row70_col2\" class=\"data row70 col2\" >Compares score distributions between reference and monitoring datasets for each class....</td>\n", - " <td id=\"T_56dd5_row70_col3\" class=\"data row70 col3\" >True</td>\n", - " <td id=\"T_56dd5_row70_col4\" class=\"data row70 col4\" >True</td>\n", - " <td id=\"T_56dd5_row70_col5\" class=\"data row70 col5\" >['datasets']</td>\n", - " <td id=\"T_56dd5_row70_col6\" class=\"data row70 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'title': {'type': 'str', 'default': 'Scorecard Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_56dd5_row70_col7\" class=\"data row70 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", - " <td id=\"T_56dd5_row70_col8\" class=\"data row70 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row71_col0\" class=\"data row71 col0\" >validmind.unit_metrics.classification.Accuracy</td>\n", - " <td id=\"T_56dd5_row71_col1\" class=\"data row71 col1\" >Accuracy</td>\n", - " <td id=\"T_56dd5_row71_col2\" class=\"data row71 col2\" >Calculates the 
accuracy of a model</td>\n", - " <td id=\"T_56dd5_row71_col3\" class=\"data row71 col3\" >False</td>\n", - " <td id=\"T_56dd5_row71_col4\" class=\"data row71 col4\" >False</td>\n", - " <td id=\"T_56dd5_row71_col5\" class=\"data row71 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row71_col6\" class=\"data row71 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row71_col7\" class=\"data row71 col7\" >['classification']</td>\n", - " <td id=\"T_56dd5_row71_col8\" class=\"data row71 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row72_col0\" class=\"data row72 col0\" >validmind.unit_metrics.classification.F1</td>\n", - " <td id=\"T_56dd5_row72_col1\" class=\"data row72 col1\" >F1</td>\n", - " <td id=\"T_56dd5_row72_col2\" class=\"data row72 col2\" >Calculates the F1 score for a classification model.</td>\n", - " <td id=\"T_56dd5_row72_col3\" class=\"data row72 col3\" >False</td>\n", - " <td id=\"T_56dd5_row72_col4\" class=\"data row72 col4\" >False</td>\n", - " <td id=\"T_56dd5_row72_col5\" class=\"data row72 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row72_col6\" class=\"data row72 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row72_col7\" class=\"data row72 col7\" >['classification']</td>\n", - " <td id=\"T_56dd5_row72_col8\" class=\"data row72 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row73_col0\" class=\"data row73 col0\" >validmind.unit_metrics.classification.Precision</td>\n", - " <td id=\"T_56dd5_row73_col1\" class=\"data row73 col1\" >Precision</td>\n", - " <td id=\"T_56dd5_row73_col2\" class=\"data row73 col2\" >Calculates the precision for a classification model.</td>\n", - " <td id=\"T_56dd5_row73_col3\" class=\"data row73 col3\" >False</td>\n", - " <td id=\"T_56dd5_row73_col4\" class=\"data row73 col4\" >False</td>\n", - " <td id=\"T_56dd5_row73_col5\" class=\"data row73 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row73_col6\" class=\"data row73 
col6\" >{}</td>\n", - " <td id=\"T_56dd5_row73_col7\" class=\"data row73 col7\" >['classification']</td>\n", - " <td id=\"T_56dd5_row73_col8\" class=\"data row73 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row74_col0\" class=\"data row74 col0\" >validmind.unit_metrics.classification.ROC_AUC</td>\n", - " <td id=\"T_56dd5_row74_col1\" class=\"data row74 col1\" >ROC AUC</td>\n", - " <td id=\"T_56dd5_row74_col2\" class=\"data row74 col2\" >Calculates the ROC AUC for a classification model.</td>\n", - " <td id=\"T_56dd5_row74_col3\" class=\"data row74 col3\" >False</td>\n", - " <td id=\"T_56dd5_row74_col4\" class=\"data row74 col4\" >False</td>\n", - " <td id=\"T_56dd5_row74_col5\" class=\"data row74 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row74_col6\" class=\"data row74 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row74_col7\" class=\"data row74 col7\" >['classification']</td>\n", - " <td id=\"T_56dd5_row74_col8\" class=\"data row74 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row75_col0\" class=\"data row75 col0\" >validmind.unit_metrics.classification.Recall</td>\n", - " <td id=\"T_56dd5_row75_col1\" class=\"data row75 col1\" >Recall</td>\n", - " <td id=\"T_56dd5_row75_col2\" class=\"data row75 col2\" >Calculates the recall for a classification model.</td>\n", - " <td id=\"T_56dd5_row75_col3\" class=\"data row75 col3\" >False</td>\n", - " <td id=\"T_56dd5_row75_col4\" class=\"data row75 col4\" >False</td>\n", - " <td id=\"T_56dd5_row75_col5\" class=\"data row75 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row75_col6\" class=\"data row75 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row75_col7\" class=\"data row75 col7\" >['classification']</td>\n", - " <td id=\"T_56dd5_row75_col8\" class=\"data row75 col8\" >['classification']</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use 
[list_tags()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tags) to view all unique tags used to describe tests in the ValidMind Library.\n", + "\n", + "`Tags` describe what a test applies to and help you filter tests for your use case. Examples include:\n", + "\n", + "- **llm:** Tests that work with Large Language Models.\n", + "- **nlp:** Tests relevant for natural language processing.\n", + "- **binary_classification:** Tests for binary classification tasks.\n", + "- **forecasting:** Tests for forecasting and time-series analysis.\n", + "- **tabular_data:** Tests for tabular data like CSVs and Excel spreadsheets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "list_tags()" ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x10516c880>" + "execution_count": 4, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "['senstivity_analysis',\n", + " 'calibration',\n", + " 'clustering',\n", + " 'anomaly_detection',\n", + " 'nlp',\n", + " 'classification_metrics',\n", + " 'dimensionality_reduction',\n", + " 'tabular_data',\n", + " 'time_series_data',\n", + " 'model_predictions',\n", + " 'feature_selection',\n", + " 'correlation',\n", + " 'frequency_analysis',\n", + " 'embeddings',\n", + " 'regression',\n", + " 'llm',\n", + " 'statsmodels',\n", + " 'ragas',\n", + " 'model_performance',\n", + " 'model_validation',\n", + " 'rag_performance',\n", + " 'model_training',\n", + " 'qualitative',\n", + " 'classification',\n", + " 'kmeans',\n", + " 'multiclass_classification',\n", + " 'linear_regression',\n", + " 'data_quality',\n", + " 'text_data',\n", + " 'binary_classification',\n", + " 'threshold_optimization',\n", + " 'stationarity',\n", + " 'bias_and_fairness',\n", + " 'scorecard',\n", + " 'model_explainability',\n", + " 'model_comparison',\n", + " 'numerical_data',\n", + " 'sklearn',\n", + " 'model_selection',\n", + " 'retrieval_performance',\n", + " 'zero_shot',\n", + " 
'statistical_test',\n", + " 'descriptive_statistics',\n", + " 'seasonality',\n", + " 'analysis',\n", + " 'data_validation',\n", + " 'data_distribution',\n", + " 'feature_importance',\n", + " 'metadata',\n", + " 'few_shot',\n", + " 'visualization',\n", + " 'credit_risk',\n", + " 'forecasting',\n", + " 'AUC',\n", + " 'logistic_regression',\n", + " 'model_diagnosis',\n", + " 'model_interpretation',\n", + " 'unit_root_test',\n", + " 'categorical_data',\n", + " 'data_analysis']" + ] + } + } ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tests(task=\"classification\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Use the `tags` parameter to find tests based on their tags, such as `model_performance` or `visualization`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_4d8bf th {\n", - " text-align: left;\n", - "}\n", - "#T_4d8bf_row0_col0, #T_4d8bf_row0_col1, #T_4d8bf_row0_col2, #T_4d8bf_row0_col3, #T_4d8bf_row0_col4, #T_4d8bf_row0_col5, #T_4d8bf_row0_col6, #T_4d8bf_row0_col7, #T_4d8bf_row0_col8, #T_4d8bf_row1_col0, #T_4d8bf_row1_col1, #T_4d8bf_row1_col2, #T_4d8bf_row1_col3, #T_4d8bf_row1_col4, #T_4d8bf_row1_col5, #T_4d8bf_row1_col6, #T_4d8bf_row1_col7, #T_4d8bf_row1_col8, #T_4d8bf_row2_col0, #T_4d8bf_row2_col1, #T_4d8bf_row2_col2, #T_4d8bf_row2_col3, #T_4d8bf_row2_col4, #T_4d8bf_row2_col5, #T_4d8bf_row2_col6, #T_4d8bf_row2_col7, #T_4d8bf_row2_col8, #T_4d8bf_row3_col0, #T_4d8bf_row3_col1, #T_4d8bf_row3_col2, #T_4d8bf_row3_col3, #T_4d8bf_row3_col4, #T_4d8bf_row3_col5, #T_4d8bf_row3_col6, #T_4d8bf_row3_col7, #T_4d8bf_row3_col8, #T_4d8bf_row4_col0, #T_4d8bf_row4_col1, #T_4d8bf_row4_col2, #T_4d8bf_row4_col3, #T_4d8bf_row4_col4, #T_4d8bf_row4_col5, #T_4d8bf_row4_col6, #T_4d8bf_row4_col7, #T_4d8bf_row4_col8, #T_4d8bf_row5_col0, #T_4d8bf_row5_col1, 
#T_4d8bf_row5_col2, #T_4d8bf_row5_col3, #T_4d8bf_row5_col4, #T_4d8bf_row5_col5, #T_4d8bf_row5_col6, #T_4d8bf_row5_col7, #T_4d8bf_row5_col8, #T_4d8bf_row6_col0, #T_4d8bf_row6_col1, #T_4d8bf_row6_col2, #T_4d8bf_row6_col3, #T_4d8bf_row6_col4, #T_4d8bf_row6_col5, #T_4d8bf_row6_col6, #T_4d8bf_row6_col7, #T_4d8bf_row6_col8 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_4d8bf\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_4d8bf_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_4d8bf_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", - " <th id=\"T_4d8bf_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_4d8bf_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", - " <th id=\"T_4d8bf_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", - " <th id=\"T_4d8bf_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", - " <th id=\"T_4d8bf_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", - " <th id=\"T_4d8bf_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", - " <th id=\"T_4d8bf_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row0_col0\" class=\"data row0 col0\" >validmind.model_validation.RegressionResidualsPlot</td>\n", - " <td id=\"T_4d8bf_row0_col1\" class=\"data row0 col1\" >Regression Residuals Plot</td>\n", - " <td id=\"T_4d8bf_row0_col2\" class=\"data row0 col2\" >Evaluates regression model performance using residual distribution and actual vs. 
predicted plots....</td>\n", - " <td id=\"T_4d8bf_row0_col3\" class=\"data row0 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row0_col4\" class=\"data row0 col4\" >False</td>\n", - " <td id=\"T_4d8bf_row0_col5\" class=\"data row0 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_4d8bf_row0_col6\" class=\"data row0 col6\" >{'bin_size': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_4d8bf_row0_col7\" class=\"data row0 col7\" >['model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row0_col8\" class=\"data row0 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td id=\"T_4d8bf_row1_col1\" class=\"data row1 col1\" >Confusion Matrix</td>\n", - " <td id=\"T_4d8bf_row1_col2\" class=\"data row1 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", - " <td id=\"T_4d8bf_row1_col3\" class=\"data row1 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row1_col4\" class=\"data row1 col4\" >False</td>\n", - " <td id=\"T_4d8bf_row1_col5\" class=\"data row1 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_4d8bf_row1_col6\" class=\"data row1 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_4d8bf_row1_col7\" class=\"data row1 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row1_col8\" class=\"data row1 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row2_col0\" class=\"data row2 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_4d8bf_row2_col1\" class=\"data row2 col1\" >Precision Recall Curve</td>\n", - " <td id=\"T_4d8bf_row2_col2\" class=\"data row2 col2\" >Evaluates the precision-recall trade-off for binary classification models and 
visualizes the Precision-Recall curve....</td>\n", - " <td id=\"T_4d8bf_row2_col3\" class=\"data row2 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row2_col4\" class=\"data row2 col4\" >False</td>\n", - " <td id=\"T_4d8bf_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_4d8bf_row2_col6\" class=\"data row2 col6\" >{}</td>\n", - " <td id=\"T_4d8bf_row2_col7\" class=\"data row2 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row2_col8\" class=\"data row2 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_4d8bf_row3_col1\" class=\"data row3 col1\" >ROC Curve</td>\n", - " <td id=\"T_4d8bf_row3_col2\" class=\"data row3 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", - " <td id=\"T_4d8bf_row3_col3\" class=\"data row3 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row3_col4\" class=\"data row3 col4\" >False</td>\n", - " <td id=\"T_4d8bf_row3_col5\" class=\"data row3 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_4d8bf_row3_col6\" class=\"data row3 col6\" >{}</td>\n", - " <td id=\"T_4d8bf_row3_col7\" class=\"data row3 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row3_col8\" class=\"data row3 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row4_col0\" class=\"data row4 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_4d8bf_row4_col1\" class=\"data row4 col1\" >Training Test Degradation</td>\n", - " <td id=\"T_4d8bf_row4_col2\" class=\"data row4 col2\" >Tests if model performance degradation between training and test datasets exceeds a 
predefined threshold....</td>\n", - " <td id=\"T_4d8bf_row4_col3\" class=\"data row4 col3\" >False</td>\n", - " <td id=\"T_4d8bf_row4_col4\" class=\"data row4 col4\" >True</td>\n", - " <td id=\"T_4d8bf_row4_col5\" class=\"data row4 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_4d8bf_row4_col6\" class=\"data row4 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_4d8bf_row4_col7\" class=\"data row4 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row4_col8\" class=\"data row4 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row5_col0\" class=\"data row5 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", - " <td id=\"T_4d8bf_row5_col1\" class=\"data row5 col1\" >Calibration Curve Drift</td>\n", - " <td id=\"T_4d8bf_row5_col2\" class=\"data row5 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", - " <td id=\"T_4d8bf_row5_col3\" class=\"data row5 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row5_col4\" class=\"data row5 col4\" >True</td>\n", - " <td id=\"T_4d8bf_row5_col5\" class=\"data row5 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_4d8bf_row5_col6\" class=\"data row5 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_4d8bf_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row5_col8\" class=\"data row5 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row6_col0\" class=\"data row6 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", - " <td id=\"T_4d8bf_row6_col1\" class=\"data row6 col1\" >ROC Curve Drift</td>\n", - " <td id=\"T_4d8bf_row6_col2\" class=\"data 
row6 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", - " <td id=\"T_4d8bf_row6_col3\" class=\"data row6 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row6_col4\" class=\"data row6 col4\" >False</td>\n", - " <td id=\"T_4d8bf_row6_col5\" class=\"data row6 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_4d8bf_row6_col6\" class=\"data row6 col6\" >{}</td>\n", - " <td id=\"T_4d8bf_row6_col7\" class=\"data row6 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row6_col8\" class=\"data row6 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, to match each task type with its related tags, use the [list_tasks_and_tags()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) function:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "list_tasks_and_tags()" ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x36a280f40>" + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_ac294 th {\n", + " text-align: left;\n", + "}\n", + "#T_ac294_row0_col0, #T_ac294_row0_col1, #T_ac294_row1_col0, #T_ac294_row1_col1, #T_ac294_row2_col0, #T_ac294_row2_col1, #T_ac294_row3_col0, #T_ac294_row3_col1, #T_ac294_row4_col0, #T_ac294_row4_col1, #T_ac294_row5_col0, #T_ac294_row5_col1, #T_ac294_row6_col0, #T_ac294_row6_col1, #T_ac294_row7_col0, #T_ac294_row7_col1, #T_ac294_row8_col0, #T_ac294_row8_col1, #T_ac294_row9_col0, #T_ac294_row9_col1, #T_ac294_row10_col0, #T_ac294_row10_col1, #T_ac294_row11_col0, #T_ac294_row11_col1, #T_ac294_row12_col0, #T_ac294_row12_col1, #T_ac294_row13_col0, #T_ac294_row13_col1 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_ac294\">\n", + " <thead>\n", + " <tr>\n", + " <th 
id=\"T_ac294_level0_col0\" class=\"col_heading level0 col0\" >Task</th>\n", + " <th id=\"T_ac294_level0_col1\" class=\"col_heading level0 col1\" >Tags</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_ac294_row0_col0\" class=\"data row0 col0\" >regression</td>\n", + " <td id=\"T_ac294_row0_col1\" class=\"data row0 col1\" >senstivity_analysis, tabular_data, time_series_data, model_predictions, feature_selection, correlation, regression, statsmodels, model_performance, model_training, multiclass_classification, linear_regression, data_quality, text_data, model_explainability, binary_classification, stationarity, bias_and_fairness, numerical_data, sklearn, model_selection, statistical_test, descriptive_statistics, seasonality, analysis, data_validation, data_distribution, metadata, feature_importance, visualization, forecasting, model_diagnosis, model_interpretation, unit_root_test, categorical_data, data_analysis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row1_col0\" class=\"data row1 col0\" >classification</td>\n", + " <td id=\"T_ac294_row1_col1\" class=\"data row1 col1\" >calibration, anomaly_detection, classification_metrics, tabular_data, time_series_data, feature_selection, correlation, statsmodels, model_performance, model_validation, model_training, classification, multiclass_classification, linear_regression, data_quality, text_data, binary_classification, threshold_optimization, bias_and_fairness, scorecard, model_comparison, numerical_data, sklearn, statistical_test, descriptive_statistics, feature_importance, data_distribution, metadata, visualization, credit_risk, AUC, logistic_regression, model_diagnosis, categorical_data, data_analysis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row2_col0\" class=\"data row2 col0\" >text_classification</td>\n", + " <td id=\"T_ac294_row2_col1\" class=\"data row2 col1\" >model_performance, feature_importance, multiclass_classification, few_shot, 
frequency_analysis, zero_shot, text_data, visualization, llm, binary_classification, ragas, model_diagnosis, model_comparison, sklearn, nlp, retrieval_performance, tabular_data, time_series_data</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row3_col0\" class=\"data row3 col0\" >text_summarization</td>\n", + " <td id=\"T_ac294_row3_col1\" class=\"data row3 col1\" >qualitative, few_shot, frequency_analysis, embeddings, zero_shot, text_data, visualization, llm, rag_performance, ragas, retrieval_performance, nlp, dimensionality_reduction, tabular_data, time_series_data</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row4_col0\" class=\"data row4 col0\" >data_validation</td>\n", + " <td id=\"T_ac294_row4_col1\" class=\"data row4 col1\" >stationarity, statsmodels, unit_root_test, time_series_data</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row5_col0\" class=\"data row5 col0\" >time_series_forecasting</td>\n", + " <td id=\"T_ac294_row5_col1\" class=\"data row5 col1\" >model_training, data_validation, metadata, visualization, model_explainability, sklearn, model_performance, model_predictions, time_series_data</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row6_col0\" class=\"data row6 col0\" >nlp</td>\n", + " <td id=\"T_ac294_row6_col1\" class=\"data row6 col1\" >data_validation, frequency_analysis, text_data, visualization, nlp</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row7_col0\" class=\"data row7 col0\" >clustering</td>\n", + " <td id=\"T_ac294_row7_col1\" class=\"data row7 col1\" >clustering, model_performance, kmeans, sklearn</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row8_col0\" class=\"data row8 col0\" >residual_analysis</td>\n", + " <td id=\"T_ac294_row8_col1\" class=\"data row8 col1\" >regression</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row9_col0\" class=\"data row9 col0\" >visualization</td>\n", + " <td id=\"T_ac294_row9_col1\" class=\"data row9 col1\" >regression</td>\n", + 
" </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row10_col0\" class=\"data row10 col0\" >feature_extraction</td>\n", + " <td id=\"T_ac294_row10_col1\" class=\"data row10 col1\" >embeddings, text_data, visualization, llm</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row11_col0\" class=\"data row11 col0\" >text_qa</td>\n", + " <td id=\"T_ac294_row11_col1\" class=\"data row11 col1\" >qualitative, embeddings, visualization, llm, rag_performance, ragas, dimensionality_reduction, retrieval_performance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row12_col0\" class=\"data row12 col0\" >text_generation</td>\n", + " <td id=\"T_ac294_row12_col1\" class=\"data row12 col1\" >qualitative, embeddings, visualization, llm, rag_performance, ragas, dimensionality_reduction, retrieval_performance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row13_col0\" class=\"data row13 col0\" >monitoring</td>\n", + " <td id=\"T_ac294_row13_col1\" class=\"data row13 col1\" >visualization</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x38000adc0>" + ] + } + } ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tests(tags=[\"model_performance\", \"visualization\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Use `filter`, `task`, and `tags` together to create more specific queries.\n", - "\n", - "For example, apply all three to find tests compatible with `sklearn` models, designed for `classification` tasks:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Filter tests by tags and task types\n", + "\n", + "While listing all tests is useful, you’ll often want to narrow your search. 
The [list_tests()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) function supports `filter`, `task`, and `tags` parameters to assist in refining your results.\n", + "\n", + "Use the `filter` parameter to find tests that match a specific keyword, such as `sklearn`:" + ] + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_36394 th {\n", - " text-align: left;\n", - "}\n", - "#T_36394_row0_col0, #T_36394_row0_col1, #T_36394_row0_col2, #T_36394_row0_col3, #T_36394_row0_col4, #T_36394_row0_col5, #T_36394_row0_col6, #T_36394_row0_col7, #T_36394_row0_col8, #T_36394_row1_col0, #T_36394_row1_col1, #T_36394_row1_col2, #T_36394_row1_col3, #T_36394_row1_col4, #T_36394_row1_col5, #T_36394_row1_col6, #T_36394_row1_col7, #T_36394_row1_col8, #T_36394_row2_col0, #T_36394_row2_col1, #T_36394_row2_col2, #T_36394_row2_col3, #T_36394_row2_col4, #T_36394_row2_col5, #T_36394_row2_col6, #T_36394_row2_col7, #T_36394_row2_col8, #T_36394_row3_col0, #T_36394_row3_col1, #T_36394_row3_col2, #T_36394_row3_col3, #T_36394_row3_col4, #T_36394_row3_col5, #T_36394_row3_col6, #T_36394_row3_col7, #T_36394_row3_col8, #T_36394_row4_col0, #T_36394_row4_col1, #T_36394_row4_col2, #T_36394_row4_col3, #T_36394_row4_col4, #T_36394_row4_col5, #T_36394_row4_col6, #T_36394_row4_col7, #T_36394_row4_col8, #T_36394_row5_col0, #T_36394_row5_col1, #T_36394_row5_col2, #T_36394_row5_col3, #T_36394_row5_col4, #T_36394_row5_col5, #T_36394_row5_col6, #T_36394_row5_col7, #T_36394_row5_col8 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_36394\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_36394_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_36394_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", - " <th id=\"T_36394_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_36394_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", - " <th id=\"T_36394_level0_col4\" 
class=\"col_heading level0 col4\" >Has Table</th>\n", - " <th id=\"T_36394_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", - " <th id=\"T_36394_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", - " <th id=\"T_36394_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", - " <th id=\"T_36394_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_36394_row0_col0\" class=\"data row0 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td id=\"T_36394_row0_col1\" class=\"data row0 col1\" >Confusion Matrix</td>\n", - " <td id=\"T_36394_row0_col2\" class=\"data row0 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", - " <td id=\"T_36394_row0_col3\" class=\"data row0 col3\" >True</td>\n", - " <td id=\"T_36394_row0_col4\" class=\"data row0 col4\" >False</td>\n", - " <td id=\"T_36394_row0_col5\" class=\"data row0 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_36394_row0_col6\" class=\"data row0 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_36394_row0_col7\" class=\"data row0 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_36394_row0_col8\" class=\"data row0 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_36394_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_36394_row1_col1\" class=\"data row1 col1\" >Precision Recall Curve</td>\n", - " <td id=\"T_36394_row1_col2\" class=\"data row1 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", - " <td id=\"T_36394_row1_col3\" class=\"data row1 col3\" >True</td>\n", - " <td 
id=\"T_36394_row1_col4\" class=\"data row1 col4\" >False</td>\n", - " <td id=\"T_36394_row1_col5\" class=\"data row1 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_36394_row1_col6\" class=\"data row1 col6\" >{}</td>\n", - " <td id=\"T_36394_row1_col7\" class=\"data row1 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_36394_row1_col8\" class=\"data row1 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_36394_row2_col0\" class=\"data row2 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_36394_row2_col1\" class=\"data row2 col1\" >ROC Curve</td>\n", - " <td id=\"T_36394_row2_col2\" class=\"data row2 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", - " <td id=\"T_36394_row2_col3\" class=\"data row2 col3\" >True</td>\n", - " <td id=\"T_36394_row2_col4\" class=\"data row2 col4\" >False</td>\n", - " <td id=\"T_36394_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_36394_row2_col6\" class=\"data row2 col6\" >{}</td>\n", - " <td id=\"T_36394_row2_col7\" class=\"data row2 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_36394_row2_col8\" class=\"data row2 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_36394_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_36394_row3_col1\" class=\"data row3 col1\" >Training Test Degradation</td>\n", - " <td id=\"T_36394_row3_col2\" class=\"data row3 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", - " <td id=\"T_36394_row3_col3\" class=\"data row3 col3\" >False</td>\n", - " <td 
id=\"T_36394_row3_col4\" class=\"data row3 col4\" >True</td>\n", - " <td id=\"T_36394_row3_col5\" class=\"data row3 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_36394_row3_col6\" class=\"data row3 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_36394_row3_col7\" class=\"data row3 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_36394_row3_col8\" class=\"data row3 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_36394_row4_col0\" class=\"data row4 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", - " <td id=\"T_36394_row4_col1\" class=\"data row4 col1\" >Calibration Curve Drift</td>\n", - " <td id=\"T_36394_row4_col2\" class=\"data row4 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", - " <td id=\"T_36394_row4_col3\" class=\"data row4 col3\" >True</td>\n", - " <td id=\"T_36394_row4_col4\" class=\"data row4 col4\" >True</td>\n", - " <td id=\"T_36394_row4_col5\" class=\"data row4 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_36394_row4_col6\" class=\"data row4 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_36394_row4_col7\" class=\"data row4 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_36394_row4_col8\" class=\"data row4 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_36394_row5_col0\" class=\"data row5 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", - " <td id=\"T_36394_row5_col1\" class=\"data row5 col1\" >ROC Curve Drift</td>\n", - " <td id=\"T_36394_row5_col2\" class=\"data row5 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", - " <td 
id=\"T_36394_row5_col3\" class=\"data row5 col3\" >True</td>\n", - " <td id=\"T_36394_row5_col4\" class=\"data row5 col4\" >False</td>\n", - " <td id=\"T_36394_row5_col5\" class=\"data row5 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_36394_row5_col6\" class=\"data row5 col6\" >{}</td>\n", - " <td id=\"T_36394_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_36394_row5_col8\" class=\"data row5 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "code", + "metadata": {}, + "source": [ + "list_tests(filter=\"sklearn\")" ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x380009c40>" + "execution_count": 6, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_326c3 th {\n", + " text-align: left;\n", + "}\n", + "#T_326c3_row0_col0, #T_326c3_row0_col1, #T_326c3_row0_col2, #T_326c3_row0_col3, #T_326c3_row0_col4, #T_326c3_row0_col5, #T_326c3_row0_col6, #T_326c3_row0_col7, #T_326c3_row0_col8, #T_326c3_row1_col0, #T_326c3_row1_col1, #T_326c3_row1_col2, #T_326c3_row1_col3, #T_326c3_row1_col4, #T_326c3_row1_col5, #T_326c3_row1_col6, #T_326c3_row1_col7, #T_326c3_row1_col8, #T_326c3_row2_col0, #T_326c3_row2_col1, #T_326c3_row2_col2, #T_326c3_row2_col3, #T_326c3_row2_col4, #T_326c3_row2_col5, #T_326c3_row2_col6, #T_326c3_row2_col7, #T_326c3_row2_col8, #T_326c3_row3_col0, #T_326c3_row3_col1, #T_326c3_row3_col2, #T_326c3_row3_col3, #T_326c3_row3_col4, #T_326c3_row3_col5, #T_326c3_row3_col6, #T_326c3_row3_col7, #T_326c3_row3_col8, #T_326c3_row4_col0, #T_326c3_row4_col1, #T_326c3_row4_col2, #T_326c3_row4_col3, #T_326c3_row4_col4, #T_326c3_row4_col5, #T_326c3_row4_col6, #T_326c3_row4_col7, #T_326c3_row4_col8, #T_326c3_row5_col0, #T_326c3_row5_col1, #T_326c3_row5_col2, #T_326c3_row5_col3, #T_326c3_row5_col4, #T_326c3_row5_col5, 
#T_326c3_row5_col6, #T_326c3_row5_col7, #T_326c3_row5_col8, #T_326c3_row6_col0, #T_326c3_row6_col1, #T_326c3_row6_col2, #T_326c3_row6_col3, #T_326c3_row6_col4, #T_326c3_row6_col5, #T_326c3_row6_col6, #T_326c3_row6_col7, #T_326c3_row6_col8, #T_326c3_row7_col0, #T_326c3_row7_col1, #T_326c3_row7_col2, #T_326c3_row7_col3, #T_326c3_row7_col4, #T_326c3_row7_col5, #T_326c3_row7_col6, #T_326c3_row7_col7, #T_326c3_row7_col8, #T_326c3_row8_col0, #T_326c3_row8_col1, #T_326c3_row8_col2, #T_326c3_row8_col3, #T_326c3_row8_col4, #T_326c3_row8_col5, #T_326c3_row8_col6, #T_326c3_row8_col7, #T_326c3_row8_col8, #T_326c3_row9_col0, #T_326c3_row9_col1, #T_326c3_row9_col2, #T_326c3_row9_col3, #T_326c3_row9_col4, #T_326c3_row9_col5, #T_326c3_row9_col6, #T_326c3_row9_col7, #T_326c3_row9_col8, #T_326c3_row10_col0, #T_326c3_row10_col1, #T_326c3_row10_col2, #T_326c3_row10_col3, #T_326c3_row10_col4, #T_326c3_row10_col5, #T_326c3_row10_col6, #T_326c3_row10_col7, #T_326c3_row10_col8, #T_326c3_row11_col0, #T_326c3_row11_col1, #T_326c3_row11_col2, #T_326c3_row11_col3, #T_326c3_row11_col4, #T_326c3_row11_col5, #T_326c3_row11_col6, #T_326c3_row11_col7, #T_326c3_row11_col8, #T_326c3_row12_col0, #T_326c3_row12_col1, #T_326c3_row12_col2, #T_326c3_row12_col3, #T_326c3_row12_col4, #T_326c3_row12_col5, #T_326c3_row12_col6, #T_326c3_row12_col7, #T_326c3_row12_col8, #T_326c3_row13_col0, #T_326c3_row13_col1, #T_326c3_row13_col2, #T_326c3_row13_col3, #T_326c3_row13_col4, #T_326c3_row13_col5, #T_326c3_row13_col6, #T_326c3_row13_col7, #T_326c3_row13_col8, #T_326c3_row14_col0, #T_326c3_row14_col1, #T_326c3_row14_col2, #T_326c3_row14_col3, #T_326c3_row14_col4, #T_326c3_row14_col5, #T_326c3_row14_col6, #T_326c3_row14_col7, #T_326c3_row14_col8, #T_326c3_row15_col0, #T_326c3_row15_col1, #T_326c3_row15_col2, #T_326c3_row15_col3, #T_326c3_row15_col4, #T_326c3_row15_col5, #T_326c3_row15_col6, #T_326c3_row15_col7, #T_326c3_row15_col8, #T_326c3_row16_col0, #T_326c3_row16_col1, #T_326c3_row16_col2, #T_326c3_row16_col3, 
#T_326c3_row16_col4, #T_326c3_row16_col5, #T_326c3_row16_col6, #T_326c3_row16_col7, #T_326c3_row16_col8, #T_326c3_row17_col0, #T_326c3_row17_col1, #T_326c3_row17_col2, #T_326c3_row17_col3, #T_326c3_row17_col4, #T_326c3_row17_col5, #T_326c3_row17_col6, #T_326c3_row17_col7, #T_326c3_row17_col8, #T_326c3_row18_col0, #T_326c3_row18_col1, #T_326c3_row18_col2, #T_326c3_row18_col3, #T_326c3_row18_col4, #T_326c3_row18_col5, #T_326c3_row18_col6, #T_326c3_row18_col7, #T_326c3_row18_col8, #T_326c3_row19_col0, #T_326c3_row19_col1, #T_326c3_row19_col2, #T_326c3_row19_col3, #T_326c3_row19_col4, #T_326c3_row19_col5, #T_326c3_row19_col6, #T_326c3_row19_col7, #T_326c3_row19_col8, #T_326c3_row20_col0, #T_326c3_row20_col1, #T_326c3_row20_col2, #T_326c3_row20_col3, #T_326c3_row20_col4, #T_326c3_row20_col5, #T_326c3_row20_col6, #T_326c3_row20_col7, #T_326c3_row20_col8, #T_326c3_row21_col0, #T_326c3_row21_col1, #T_326c3_row21_col2, #T_326c3_row21_col3, #T_326c3_row21_col4, #T_326c3_row21_col5, #T_326c3_row21_col6, #T_326c3_row21_col7, #T_326c3_row21_col8, #T_326c3_row22_col0, #T_326c3_row22_col1, #T_326c3_row22_col2, #T_326c3_row22_col3, #T_326c3_row22_col4, #T_326c3_row22_col5, #T_326c3_row22_col6, #T_326c3_row22_col7, #T_326c3_row22_col8, #T_326c3_row23_col0, #T_326c3_row23_col1, #T_326c3_row23_col2, #T_326c3_row23_col3, #T_326c3_row23_col4, #T_326c3_row23_col5, #T_326c3_row23_col6, #T_326c3_row23_col7, #T_326c3_row23_col8, #T_326c3_row24_col0, #T_326c3_row24_col1, #T_326c3_row24_col2, #T_326c3_row24_col3, #T_326c3_row24_col4, #T_326c3_row24_col5, #T_326c3_row24_col6, #T_326c3_row24_col7, #T_326c3_row24_col8, #T_326c3_row25_col0, #T_326c3_row25_col1, #T_326c3_row25_col2, #T_326c3_row25_col3, #T_326c3_row25_col4, #T_326c3_row25_col5, #T_326c3_row25_col6, #T_326c3_row25_col7, #T_326c3_row25_col8, #T_326c3_row26_col0, #T_326c3_row26_col1, #T_326c3_row26_col2, #T_326c3_row26_col3, #T_326c3_row26_col4, #T_326c3_row26_col5, #T_326c3_row26_col6, #T_326c3_row26_col7, #T_326c3_row26_col8, 
#T_326c3_row27_col0, #T_326c3_row27_col1, #T_326c3_row27_col2, #T_326c3_row27_col3, #T_326c3_row27_col4, #T_326c3_row27_col5, #T_326c3_row27_col6, #T_326c3_row27_col7, #T_326c3_row27_col8, #T_326c3_row28_col0, #T_326c3_row28_col1, #T_326c3_row28_col2, #T_326c3_row28_col3, #T_326c3_row28_col4, #T_326c3_row28_col5, #T_326c3_row28_col6, #T_326c3_row28_col7, #T_326c3_row28_col8, #T_326c3_row29_col0, #T_326c3_row29_col1, #T_326c3_row29_col2, #T_326c3_row29_col3, #T_326c3_row29_col4, #T_326c3_row29_col5, #T_326c3_row29_col6, #T_326c3_row29_col7, #T_326c3_row29_col8, #T_326c3_row30_col0, #T_326c3_row30_col1, #T_326c3_row30_col2, #T_326c3_row30_col3, #T_326c3_row30_col4, #T_326c3_row30_col5, #T_326c3_row30_col6, #T_326c3_row30_col7, #T_326c3_row30_col8, #T_326c3_row31_col0, #T_326c3_row31_col1, #T_326c3_row31_col2, #T_326c3_row31_col3, #T_326c3_row31_col4, #T_326c3_row31_col5, #T_326c3_row31_col6, #T_326c3_row31_col7, #T_326c3_row31_col8, #T_326c3_row32_col0, #T_326c3_row32_col1, #T_326c3_row32_col2, #T_326c3_row32_col3, #T_326c3_row32_col4, #T_326c3_row32_col5, #T_326c3_row32_col6, #T_326c3_row32_col7, #T_326c3_row32_col8, #T_326c3_row33_col0, #T_326c3_row33_col1, #T_326c3_row33_col2, #T_326c3_row33_col3, #T_326c3_row33_col4, #T_326c3_row33_col5, #T_326c3_row33_col6, #T_326c3_row33_col7, #T_326c3_row33_col8, #T_326c3_row34_col0, #T_326c3_row34_col1, #T_326c3_row34_col2, #T_326c3_row34_col3, #T_326c3_row34_col4, #T_326c3_row34_col5, #T_326c3_row34_col6, #T_326c3_row34_col7, #T_326c3_row34_col8, #T_326c3_row35_col0, #T_326c3_row35_col1, #T_326c3_row35_col2, #T_326c3_row35_col3, #T_326c3_row35_col4, #T_326c3_row35_col5, #T_326c3_row35_col6, #T_326c3_row35_col7, #T_326c3_row35_col8, #T_326c3_row36_col0, #T_326c3_row36_col1, #T_326c3_row36_col2, #T_326c3_row36_col3, #T_326c3_row36_col4, #T_326c3_row36_col5, #T_326c3_row36_col6, #T_326c3_row36_col7, #T_326c3_row36_col8, #T_326c3_row37_col0, #T_326c3_row37_col1, #T_326c3_row37_col2, #T_326c3_row37_col3, #T_326c3_row37_col4, 
#T_326c3_row37_col5, #T_326c3_row37_col6, #T_326c3_row37_col7, #T_326c3_row37_col8, #T_326c3_row38_col0, #T_326c3_row38_col1, #T_326c3_row38_col2, #T_326c3_row38_col3, #T_326c3_row38_col4, #T_326c3_row38_col5, #T_326c3_row38_col6, #T_326c3_row38_col7, #T_326c3_row38_col8, #T_326c3_row39_col0, #T_326c3_row39_col1, #T_326c3_row39_col2, #T_326c3_row39_col3, #T_326c3_row39_col4, #T_326c3_row39_col5, #T_326c3_row39_col6, #T_326c3_row39_col7, #T_326c3_row39_col8, #T_326c3_row40_col0, #T_326c3_row40_col1, #T_326c3_row40_col2, #T_326c3_row40_col3, #T_326c3_row40_col4, #T_326c3_row40_col5, #T_326c3_row40_col6, #T_326c3_row40_col7, #T_326c3_row40_col8, #T_326c3_row41_col0, #T_326c3_row41_col1, #T_326c3_row41_col2, #T_326c3_row41_col3, #T_326c3_row41_col4, #T_326c3_row41_col5, #T_326c3_row41_col6, #T_326c3_row41_col7, #T_326c3_row41_col8, #T_326c3_row42_col0, #T_326c3_row42_col1, #T_326c3_row42_col2, #T_326c3_row42_col3, #T_326c3_row42_col4, #T_326c3_row42_col5, #T_326c3_row42_col6, #T_326c3_row42_col7, #T_326c3_row42_col8 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_326c3\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_326c3_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_326c3_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th id=\"T_326c3_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_326c3_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", + " <th id=\"T_326c3_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", + " <th id=\"T_326c3_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", + " <th id=\"T_326c3_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", + " <th id=\"T_326c3_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", + " <th id=\"T_326c3_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td 
id=\"T_326c3_row0_col0\" class=\"data row0 col0\" >validmind.model_validation.ClusterSizeDistribution</td>\n", + " <td id=\"T_326c3_row0_col1\" class=\"data row0 col1\" >Cluster Size Distribution</td>\n", + " <td id=\"T_326c3_row0_col2\" class=\"data row0 col2\" >Assesses the performance of clustering models by comparing the distribution of cluster sizes in model predictions...</td>\n", + " <td id=\"T_326c3_row0_col3\" class=\"data row0 col3\" >True</td>\n", + " <td id=\"T_326c3_row0_col4\" class=\"data row0 col4\" >False</td>\n", + " <td id=\"T_326c3_row0_col5\" class=\"data row0 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row0_col6\" class=\"data row0 col6\" >{}</td>\n", + " <td id=\"T_326c3_row0_col7\" class=\"data row0 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row0_col8\" class=\"data row0 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.TimeSeriesR2SquareBySegments</td>\n", + " <td id=\"T_326c3_row1_col1\" class=\"data row1 col1\" >Time Series R2 Square By Segments</td>\n", + " <td id=\"T_326c3_row1_col2\" class=\"data row1 col2\" >Evaluates the R-Squared values of regression models over specified time segments in time series data to assess...</td>\n", + " <td id=\"T_326c3_row1_col3\" class=\"data row1 col3\" >True</td>\n", + " <td id=\"T_326c3_row1_col4\" class=\"data row1 col4\" >True</td>\n", + " <td id=\"T_326c3_row1_col5\" class=\"data row1 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row1_col6\" class=\"data row1 col6\" >{'segments': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row1_col7\" class=\"data row1 col7\" >['model_performance', 'sklearn']</td>\n", + " <td id=\"T_326c3_row1_col8\" class=\"data row1 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row2_col0\" class=\"data row2 col0\" 
>validmind.model_validation.sklearn.AdjustedMutualInformation</td>\n", + " <td id=\"T_326c3_row2_col1\" class=\"data row2 col1\" >Adjusted Mutual Information</td>\n", + " <td id=\"T_326c3_row2_col2\" class=\"data row2 col2\" >Evaluates clustering model performance by measuring mutual information between true and predicted labels, adjusting...</td>\n", + " <td id=\"T_326c3_row2_col3\" class=\"data row2 col3\" >False</td>\n", + " <td id=\"T_326c3_row2_col4\" class=\"data row2 col4\" >True</td>\n", + " <td id=\"T_326c3_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row2_col6\" class=\"data row2 col6\" >{}</td>\n", + " <td id=\"T_326c3_row2_col7\" class=\"data row2 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_326c3_row2_col8\" class=\"data row2 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.AdjustedRandIndex</td>\n", + " <td id=\"T_326c3_row3_col1\" class=\"data row3 col1\" >Adjusted Rand Index</td>\n", + " <td id=\"T_326c3_row3_col2\" class=\"data row3 col2\" >Measures the similarity between two data clusters using the Adjusted Rand Index (ARI) metric in clustering machine...</td>\n", + " <td id=\"T_326c3_row3_col3\" class=\"data row3 col3\" >False</td>\n", + " <td id=\"T_326c3_row3_col4\" class=\"data row3 col4\" >True</td>\n", + " <td id=\"T_326c3_row3_col5\" class=\"data row3 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row3_col6\" class=\"data row3 col6\" >{}</td>\n", + " <td id=\"T_326c3_row3_col7\" class=\"data row3 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_326c3_row3_col8\" class=\"data row3 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row4_col0\" class=\"data row4 col0\" >validmind.model_validation.sklearn.CalibrationCurve</td>\n", + " <td id=\"T_326c3_row4_col1\" class=\"data row4 col1\" 
>Calibration Curve</td>\n", + " <td id=\"T_326c3_row4_col2\" class=\"data row4 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", + " <td id=\"T_326c3_row4_col3\" class=\"data row4 col3\" >True</td>\n", + " <td id=\"T_326c3_row4_col4\" class=\"data row4 col4\" >False</td>\n", + " <td id=\"T_326c3_row4_col5\" class=\"data row4 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row4_col6\" class=\"data row4 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_326c3_row4_col7\" class=\"data row4 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", + " <td id=\"T_326c3_row4_col8\" class=\"data row4 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row5_col0\" class=\"data row5 col0\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", + " <td id=\"T_326c3_row5_col1\" class=\"data row5 col1\" >Classifier Performance</td>\n", + " <td id=\"T_326c3_row5_col2\" class=\"data row5 col2\" >Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", + " <td id=\"T_326c3_row5_col3\" class=\"data row5 col3\" >False</td>\n", + " <td id=\"T_326c3_row5_col4\" class=\"data row5 col4\" >True</td>\n", + " <td id=\"T_326c3_row5_col5\" class=\"data row5 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row5_col6\" class=\"data row5 col6\" >{'average': {'type': 'str', 'default': 'macro'}}</td>\n", + " <td id=\"T_326c3_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row5_col8\" class=\"data row5 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row6_col0\" class=\"data row6 col0\" >validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", + " <td id=\"T_326c3_row6_col1\" 
class=\"data row6 col1\" >Classifier Threshold Optimization</td>\n", + " <td id=\"T_326c3_row6_col2\" class=\"data row6 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification models....</td>\n", + " <td id=\"T_326c3_row6_col3\" class=\"data row6 col3\" >False</td>\n", + " <td id=\"T_326c3_row6_col4\" class=\"data row6 col4\" >True</td>\n", + " <td id=\"T_326c3_row6_col5\" class=\"data row6 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row6_col6\" class=\"data row6 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row6_col7\" class=\"data row6 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", + " <td id=\"T_326c3_row6_col8\" class=\"data row6 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row7_col0\" class=\"data row7 col0\" >validmind.model_validation.sklearn.ClusterCosineSimilarity</td>\n", + " <td id=\"T_326c3_row7_col1\" class=\"data row7 col1\" >Cluster Cosine Similarity</td>\n", + " <td id=\"T_326c3_row7_col2\" class=\"data row7 col2\" >Measures the intra-cluster similarity of a clustering model using cosine similarity....</td>\n", + " <td id=\"T_326c3_row7_col3\" class=\"data row7 col3\" >False</td>\n", + " <td id=\"T_326c3_row7_col4\" class=\"data row7 col4\" >True</td>\n", + " <td id=\"T_326c3_row7_col5\" class=\"data row7 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row7_col6\" class=\"data row7 col6\" >{}</td>\n", + " <td id=\"T_326c3_row7_col7\" class=\"data row7 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_326c3_row7_col8\" class=\"data row7 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row8_col0\" class=\"data row8 col0\" >validmind.model_validation.sklearn.ClusterPerformanceMetrics</td>\n", + " <td id=\"T_326c3_row8_col1\" class=\"data row8 col1\" >Cluster 
Performance Metrics</td>\n", + " <td id=\"T_326c3_row8_col2\" class=\"data row8 col2\" >Evaluates the performance of clustering machine learning models using multiple established metrics....</td>\n", + " <td id=\"T_326c3_row8_col3\" class=\"data row8 col3\" >False</td>\n", + " <td id=\"T_326c3_row8_col4\" class=\"data row8 col4\" >True</td>\n", + " <td id=\"T_326c3_row8_col5\" class=\"data row8 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row8_col6\" class=\"data row8 col6\" >{}</td>\n", + " <td id=\"T_326c3_row8_col7\" class=\"data row8 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_326c3_row8_col8\" class=\"data row8 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row9_col0\" class=\"data row9 col0\" >validmind.model_validation.sklearn.CompletenessScore</td>\n", + " <td id=\"T_326c3_row9_col1\" class=\"data row9 col1\" >Completeness Score</td>\n", + " <td id=\"T_326c3_row9_col2\" class=\"data row9 col2\" >Evaluates a clustering model's capacity to categorize instances from a single class into the same cluster....</td>\n", + " <td id=\"T_326c3_row9_col3\" class=\"data row9 col3\" >False</td>\n", + " <td id=\"T_326c3_row9_col4\" class=\"data row9 col4\" >True</td>\n", + " <td id=\"T_326c3_row9_col5\" class=\"data row9 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row9_col6\" class=\"data row9 col6\" >{}</td>\n", + " <td id=\"T_326c3_row9_col7\" class=\"data row9 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_326c3_row9_col8\" class=\"data row9 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row10_col0\" class=\"data row10 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_326c3_row10_col1\" class=\"data row10 col1\" >Confusion Matrix</td>\n", + " <td id=\"T_326c3_row10_col2\" class=\"data row10 col2\" >Evaluates and visually represents the classification ML model's predictive 
performance using a Confusion Matrix...</td>\n", + " <td id=\"T_326c3_row10_col3\" class=\"data row10 col3\" >True</td>\n", + " <td id=\"T_326c3_row10_col4\" class=\"data row10 col4\" >False</td>\n", + " <td id=\"T_326c3_row10_col5\" class=\"data row10 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row10_col6\" class=\"data row10 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_326c3_row10_col7\" class=\"data row10 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row10_col8\" class=\"data row10 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row11_col0\" class=\"data row11 col0\" >validmind.model_validation.sklearn.FeatureImportance</td>\n", + " <td id=\"T_326c3_row11_col1\" class=\"data row11 col1\" >Feature Importance</td>\n", + " <td id=\"T_326c3_row11_col2\" class=\"data row11 col2\" >Compute feature importance scores for a given model and generate a summary table...</td>\n", + " <td id=\"T_326c3_row11_col3\" class=\"data row11 col3\" >False</td>\n", + " <td id=\"T_326c3_row11_col4\" class=\"data row11 col4\" >True</td>\n", + " <td id=\"T_326c3_row11_col5\" class=\"data row11 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row11_col6\" class=\"data row11 col6\" >{'num_features': {'type': 'int', 'default': 3}}</td>\n", + " <td id=\"T_326c3_row11_col7\" class=\"data row11 col7\" >['model_explainability', 'sklearn']</td>\n", + " <td id=\"T_326c3_row11_col8\" class=\"data row11 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row12_col0\" class=\"data row12 col0\" >validmind.model_validation.sklearn.FowlkesMallowsScore</td>\n", + " <td id=\"T_326c3_row12_col1\" class=\"data row12 col1\" >Fowlkes Mallows Score</td>\n", + " <td id=\"T_326c3_row12_col2\" class=\"data row12 col2\" >Evaluates the similarity 
between predicted and actual cluster assignments in a model using the Fowlkes-Mallows...</td>\n", + " <td id=\"T_326c3_row12_col3\" class=\"data row12 col3\" >False</td>\n", + " <td id=\"T_326c3_row12_col4\" class=\"data row12 col4\" >True</td>\n", + " <td id=\"T_326c3_row12_col5\" class=\"data row12 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row12_col6\" class=\"data row12 col6\" >{}</td>\n", + " <td id=\"T_326c3_row12_col7\" class=\"data row12 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row12_col8\" class=\"data row12 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row13_col0\" class=\"data row13 col0\" >validmind.model_validation.sklearn.HomogeneityScore</td>\n", + " <td id=\"T_326c3_row13_col1\" class=\"data row13 col1\" >Homogeneity Score</td>\n", + " <td id=\"T_326c3_row13_col2\" class=\"data row13 col2\" >Assesses clustering homogeneity by comparing true and predicted labels, scoring from 0 (heterogeneous) to 1...</td>\n", + " <td id=\"T_326c3_row13_col3\" class=\"data row13 col3\" >False</td>\n", + " <td id=\"T_326c3_row13_col4\" class=\"data row13 col4\" >True</td>\n", + " <td id=\"T_326c3_row13_col5\" class=\"data row13 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row13_col6\" class=\"data row13 col6\" >{}</td>\n", + " <td id=\"T_326c3_row13_col7\" class=\"data row13 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row13_col8\" class=\"data row13 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row14_col0\" class=\"data row14 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", + " <td id=\"T_326c3_row14_col1\" class=\"data row14 col1\" >Hyper Parameters Tuning</td>\n", + " <td id=\"T_326c3_row14_col2\" class=\"data row14 col2\" >Performs exhaustive grid search over specified parameter ranges to find optimal model configurations...</td>\n", + " <td id=\"T_326c3_row14_col3\" class=\"data row14 
col3\" >False</td>\n", + " <td id=\"T_326c3_row14_col4\" class=\"data row14 col4\" >True</td>\n", + " <td id=\"T_326c3_row14_col5\" class=\"data row14 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row14_col6\" class=\"data row14 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", + " <td id=\"T_326c3_row14_col7\" class=\"data row14 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row14_col8\" class=\"data row14 col8\" >['clustering', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row15_col0\" class=\"data row15 col0\" >validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", + " <td id=\"T_326c3_row15_col1\" class=\"data row15 col1\" >K Means Clusters Optimization</td>\n", + " <td id=\"T_326c3_row15_col2\" class=\"data row15 col2\" >Optimizes the number of clusters in K-means models using Elbow and Silhouette methods....</td>\n", + " <td id=\"T_326c3_row15_col3\" class=\"data row15 col3\" >True</td>\n", + " <td id=\"T_326c3_row15_col4\" class=\"data row15 col4\" >False</td>\n", + " <td id=\"T_326c3_row15_col5\" class=\"data row15 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row15_col6\" class=\"data row15 col6\" >{'n_clusters': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row15_col7\" class=\"data row15 col7\" >['sklearn', 'model_performance', 'kmeans']</td>\n", + " <td id=\"T_326c3_row15_col8\" class=\"data row15 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row16_col0\" class=\"data row16 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", + " <td id=\"T_326c3_row16_col1\" class=\"data row16 col1\" >Minimum Accuracy</td>\n", + " <td id=\"T_326c3_row16_col2\" class=\"data row16 col2\" >Checks if the model's prediction accuracy meets or surpasses a 
specified threshold....</td>\n", + " <td id=\"T_326c3_row16_col3\" class=\"data row16 col3\" >False</td>\n", + " <td id=\"T_326c3_row16_col4\" class=\"data row16 col4\" >True</td>\n", + " <td id=\"T_326c3_row16_col5\" class=\"data row16 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row16_col6\" class=\"data row16 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_326c3_row16_col7\" class=\"data row16 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row16_col8\" class=\"data row16 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row17_col0\" class=\"data row17 col0\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", + " <td id=\"T_326c3_row17_col1\" class=\"data row17 col1\" >Minimum F1 Score</td>\n", + " <td id=\"T_326c3_row17_col2\" class=\"data row17 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", + " <td id=\"T_326c3_row17_col3\" class=\"data row17 col3\" >False</td>\n", + " <td id=\"T_326c3_row17_col4\" class=\"data row17 col4\" >True</td>\n", + " <td id=\"T_326c3_row17_col5\" class=\"data row17 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row17_col6\" class=\"data row17 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_326c3_row17_col7\" class=\"data row17 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row17_col8\" class=\"data row17 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row18_col0\" class=\"data row18 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", + " <td id=\"T_326c3_row18_col1\" class=\"data row18 col1\" >Minimum ROCAUC Score</td>\n", + " <td id=\"T_326c3_row18_col2\" 
class=\"data row18 col2\" >Validates model by checking if the ROC AUC score meets or surpasses a specified threshold....</td>\n", + " <td id=\"T_326c3_row18_col3\" class=\"data row18 col3\" >False</td>\n", + " <td id=\"T_326c3_row18_col4\" class=\"data row18 col4\" >True</td>\n", + " <td id=\"T_326c3_row18_col5\" class=\"data row18 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row18_col6\" class=\"data row18 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_326c3_row18_col7\" class=\"data row18 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row18_col8\" class=\"data row18 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row19_col0\" class=\"data row19 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", + " <td id=\"T_326c3_row19_col1\" class=\"data row19 col1\" >Model Parameters</td>\n", + " <td id=\"T_326c3_row19_col2\" class=\"data row19 col2\" >Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", + " <td id=\"T_326c3_row19_col3\" class=\"data row19 col3\" >False</td>\n", + " <td id=\"T_326c3_row19_col4\" class=\"data row19 col4\" >True</td>\n", + " <td id=\"T_326c3_row19_col5\" class=\"data row19 col5\" >['model']</td>\n", + " <td id=\"T_326c3_row19_col6\" class=\"data row19 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row19_col7\" class=\"data row19 col7\" >['model_training', 'metadata']</td>\n", + " <td id=\"T_326c3_row19_col8\" class=\"data row19 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row20_col0\" class=\"data row20 col0\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", + " <td id=\"T_326c3_row20_col1\" class=\"data row20 col1\" >Models Performance Comparison</td>\n", + " <td 
id=\"T_326c3_row20_col2\" class=\"data row20 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", + " <td id=\"T_326c3_row20_col3\" class=\"data row20 col3\" >False</td>\n", + " <td id=\"T_326c3_row20_col4\" class=\"data row20 col4\" >True</td>\n", + " <td id=\"T_326c3_row20_col5\" class=\"data row20 col5\" >['dataset', 'models']</td>\n", + " <td id=\"T_326c3_row20_col6\" class=\"data row20 col6\" >{}</td>\n", + " <td id=\"T_326c3_row20_col7\" class=\"data row20 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", + " <td id=\"T_326c3_row20_col8\" class=\"data row20 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row21_col0\" class=\"data row21 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", + " <td id=\"T_326c3_row21_col1\" class=\"data row21 col1\" >Overfit Diagnosis</td>\n", + " <td id=\"T_326c3_row21_col2\" class=\"data row21 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", + " <td id=\"T_326c3_row21_col3\" class=\"data row21 col3\" >True</td>\n", + " <td id=\"T_326c3_row21_col4\" class=\"data row21 col4\" >True</td>\n", + " <td id=\"T_326c3_row21_col5\" class=\"data row21 col5\" >['model', 'datasets']</td>\n", + " <td id=\"T_326c3_row21_col6\" class=\"data row21 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': {'type': 'float', 'default': 0.04}}</td>\n", + " <td id=\"T_326c3_row21_col7\" class=\"data row21 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", + " <td id=\"T_326c3_row21_col8\" class=\"data row21 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row22_col0\" class=\"data row22 col0\" 
>validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", + " <td id=\"T_326c3_row22_col1\" class=\"data row22 col1\" >Permutation Feature Importance</td>\n", + " <td id=\"T_326c3_row22_col2\" class=\"data row22 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", + " <td id=\"T_326c3_row22_col3\" class=\"data row22 col3\" >True</td>\n", + " <td id=\"T_326c3_row22_col4\" class=\"data row22 col4\" >False</td>\n", + " <td id=\"T_326c3_row22_col5\" class=\"data row22 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row22_col6\" class=\"data row22 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row22_col7\" class=\"data row22 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_326c3_row22_col8\" class=\"data row22 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row23_col0\" class=\"data row23 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", + " <td id=\"T_326c3_row23_col1\" class=\"data row23 col1\" >Population Stability Index</td>\n", + " <td id=\"T_326c3_row23_col2\" class=\"data row23 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's predictions across...</td>\n", + " <td id=\"T_326c3_row23_col3\" class=\"data row23 col3\" >True</td>\n", + " <td id=\"T_326c3_row23_col4\" class=\"data row23 col4\" >True</td>\n", + " <td id=\"T_326c3_row23_col5\" class=\"data row23 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row23_col6\" class=\"data row23 col6\" >{'num_bins': {'type': 'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", + " <td id=\"T_326c3_row23_col7\" class=\"data row23 col7\" >['sklearn', 'binary_classification', 
'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row23_col8\" class=\"data row23 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row24_col0\" class=\"data row24 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_326c3_row24_col1\" class=\"data row24 col1\" >Precision Recall Curve</td>\n", + " <td id=\"T_326c3_row24_col2\" class=\"data row24 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", + " <td id=\"T_326c3_row24_col3\" class=\"data row24 col3\" >True</td>\n", + " <td id=\"T_326c3_row24_col4\" class=\"data row24 col4\" >False</td>\n", + " <td id=\"T_326c3_row24_col5\" class=\"data row24 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row24_col6\" class=\"data row24 col6\" >{}</td>\n", + " <td id=\"T_326c3_row24_col7\" class=\"data row24 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row24_col8\" class=\"data row24 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row25_col0\" class=\"data row25 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", + " <td id=\"T_326c3_row25_col1\" class=\"data row25 col1\" >ROC Curve</td>\n", + " <td id=\"T_326c3_row25_col2\" class=\"data row25 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", + " <td id=\"T_326c3_row25_col3\" class=\"data row25 col3\" >True</td>\n", + " <td id=\"T_326c3_row25_col4\" class=\"data row25 col4\" >False</td>\n", + " <td id=\"T_326c3_row25_col5\" class=\"data row25 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row25_col6\" class=\"data row25 col6\" >{}</td>\n", + " <td id=\"T_326c3_row25_col7\" class=\"data row25 col7\" >['sklearn', 
'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row25_col8\" class=\"data row25 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row26_col0\" class=\"data row26 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", + " <td id=\"T_326c3_row26_col1\" class=\"data row26 col1\" >Regression Errors</td>\n", + " <td id=\"T_326c3_row26_col2\" class=\"data row26 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", + " <td id=\"T_326c3_row26_col3\" class=\"data row26 col3\" >False</td>\n", + " <td id=\"T_326c3_row26_col4\" class=\"data row26 col4\" >True</td>\n", + " <td id=\"T_326c3_row26_col5\" class=\"data row26 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row26_col6\" class=\"data row26 col6\" >{}</td>\n", + " <td id=\"T_326c3_row26_col7\" class=\"data row26 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row26_col8\" class=\"data row26 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row27_col0\" class=\"data row27 col0\" >validmind.model_validation.sklearn.RegressionErrorsComparison</td>\n", + " <td id=\"T_326c3_row27_col1\" class=\"data row27 col1\" >Regression Errors Comparison</td>\n", + " <td id=\"T_326c3_row27_col2\" class=\"data row27 col2\" >Assesses multiple regression error metrics to compare model performance across different datasets, emphasizing...</td>\n", + " <td id=\"T_326c3_row27_col3\" class=\"data row27 col3\" >False</td>\n", + " <td id=\"T_326c3_row27_col4\" class=\"data row27 col4\" >True</td>\n", + " <td id=\"T_326c3_row27_col5\" class=\"data row27 col5\" >['datasets', 'models']</td>\n", + " <td id=\"T_326c3_row27_col6\" class=\"data row27 col6\" >{}</td>\n", + " <td id=\"T_326c3_row27_col7\" class=\"data row27 col7\" >['model_performance', 
'sklearn']</td>\n", + " <td id=\"T_326c3_row27_col8\" class=\"data row27 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row28_col0\" class=\"data row28 col0\" >validmind.model_validation.sklearn.RegressionPerformance</td>\n", + " <td id=\"T_326c3_row28_col1\" class=\"data row28 col1\" >Regression Performance</td>\n", + " <td id=\"T_326c3_row28_col2\" class=\"data row28 col2\" >Evaluates the performance of a regression model using five different metrics: MAE, MSE, RMSE, MAPE, and MBD....</td>\n", + " <td id=\"T_326c3_row28_col3\" class=\"data row28 col3\" >False</td>\n", + " <td id=\"T_326c3_row28_col4\" class=\"data row28 col4\" >True</td>\n", + " <td id=\"T_326c3_row28_col5\" class=\"data row28 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row28_col6\" class=\"data row28 col6\" >{}</td>\n", + " <td id=\"T_326c3_row28_col7\" class=\"data row28 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row28_col8\" class=\"data row28 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row29_col0\" class=\"data row29 col0\" >validmind.model_validation.sklearn.RegressionR2Square</td>\n", + " <td id=\"T_326c3_row29_col1\" class=\"data row29 col1\" >Regression R2 Square</td>\n", + " <td id=\"T_326c3_row29_col2\" class=\"data row29 col2\" >Assesses the overall goodness-of-fit of a regression model by evaluating R-squared (R2) and Adjusted R-squared (Adj...</td>\n", + " <td id=\"T_326c3_row29_col3\" class=\"data row29 col3\" >False</td>\n", + " <td id=\"T_326c3_row29_col4\" class=\"data row29 col4\" >True</td>\n", + " <td id=\"T_326c3_row29_col5\" class=\"data row29 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row29_col6\" class=\"data row29 col6\" >{}</td>\n", + " <td id=\"T_326c3_row29_col7\" class=\"data row29 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row29_col8\" class=\"data row29 col8\" >['regression']</td>\n", + 
" </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row30_col0\" class=\"data row30 col0\" >validmind.model_validation.sklearn.RegressionR2SquareComparison</td>\n", + " <td id=\"T_326c3_row30_col1\" class=\"data row30 col1\" >Regression R2 Square Comparison</td>\n", + " <td id=\"T_326c3_row30_col2\" class=\"data row30 col2\" >Compares R-Squared and Adjusted R-Squared values for different regression models across multiple datasets to assess...</td>\n", + " <td id=\"T_326c3_row30_col3\" class=\"data row30 col3\" >False</td>\n", + " <td id=\"T_326c3_row30_col4\" class=\"data row30 col4\" >True</td>\n", + " <td id=\"T_326c3_row30_col5\" class=\"data row30 col5\" >['datasets', 'models']</td>\n", + " <td id=\"T_326c3_row30_col6\" class=\"data row30 col6\" >{}</td>\n", + " <td id=\"T_326c3_row30_col7\" class=\"data row30 col7\" >['model_performance', 'sklearn']</td>\n", + " <td id=\"T_326c3_row30_col8\" class=\"data row30 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row31_col0\" class=\"data row31 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " <td id=\"T_326c3_row31_col1\" class=\"data row31 col1\" >Robustness Diagnosis</td>\n", + " <td id=\"T_326c3_row31_col2\" class=\"data row31 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", + " <td id=\"T_326c3_row31_col3\" class=\"data row31 col3\" >True</td>\n", + " <td id=\"T_326c3_row31_col4\" class=\"data row31 col4\" >True</td>\n", + " <td id=\"T_326c3_row31_col5\" class=\"data row31 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row31_col6\" class=\"data row31 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_326c3_row31_col7\" class=\"data row31 col7\" >['sklearn', 
'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_326c3_row31_col8\" class=\"data row31 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row32_col0\" class=\"data row32 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", + " <td id=\"T_326c3_row32_col1\" class=\"data row32 col1\" >SHAP Global Importance</td>\n", + " <td id=\"T_326c3_row32_col2\" class=\"data row32 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", + " <td id=\"T_326c3_row32_col3\" class=\"data row32 col3\" >False</td>\n", + " <td id=\"T_326c3_row32_col4\" class=\"data row32 col4\" >True</td>\n", + " <td id=\"T_326c3_row32_col5\" class=\"data row32 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row32_col6\" class=\"data row32 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row32_col7\" class=\"data row32 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_326c3_row32_col8\" class=\"data row32 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row33_col0\" class=\"data row33 col0\" >validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", + " <td id=\"T_326c3_row33_col1\" class=\"data row33 col1\" >Score Probability Alignment</td>\n", + " <td id=\"T_326c3_row33_col2\" class=\"data row33 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", + " <td id=\"T_326c3_row33_col3\" class=\"data row33 col3\" >True</td>\n", + " <td id=\"T_326c3_row33_col4\" class=\"data row33 col4\" >True</td>\n", + " <td id=\"T_326c3_row33_col5\" class=\"data row33 col5\" >['model', 
'dataset']</td>\n", + " <td id=\"T_326c3_row33_col6\" class=\"data row33 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_326c3_row33_col7\" class=\"data row33 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", + " <td id=\"T_326c3_row33_col8\" class=\"data row33 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row34_col0\" class=\"data row34 col0\" >validmind.model_validation.sklearn.SilhouettePlot</td>\n", + " <td id=\"T_326c3_row34_col1\" class=\"data row34 col1\" >Silhouette Plot</td>\n", + " <td id=\"T_326c3_row34_col2\" class=\"data row34 col2\" >Calculates and visualizes Silhouette Score, assessing the degree of data point suitability to its cluster in ML...</td>\n", + " <td id=\"T_326c3_row34_col3\" class=\"data row34 col3\" >True</td>\n", + " <td id=\"T_326c3_row34_col4\" class=\"data row34 col4\" >True</td>\n", + " <td id=\"T_326c3_row34_col5\" class=\"data row34 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row34_col6\" class=\"data row34 col6\" >{}</td>\n", + " <td id=\"T_326c3_row34_col7\" class=\"data row34 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row34_col8\" class=\"data row34 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row35_col0\" class=\"data row35 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_326c3_row35_col1\" class=\"data row35 col1\" >Training Test Degradation</td>\n", + " <td id=\"T_326c3_row35_col2\" class=\"data row35 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", + " <td id=\"T_326c3_row35_col3\" class=\"data row35 col3\" >False</td>\n", + " <td id=\"T_326c3_row35_col4\" class=\"data row35 col4\" >True</td>\n", + " <td id=\"T_326c3_row35_col5\" class=\"data row35 col5\" >['datasets', 'model']</td>\n", + " <td 
id=\"T_326c3_row35_col6\" class=\"data row35 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_326c3_row35_col7\" class=\"data row35 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row35_col8\" class=\"data row35 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row36_col0\" class=\"data row36 col0\" >validmind.model_validation.sklearn.VMeasure</td>\n", + " <td id=\"T_326c3_row36_col1\" class=\"data row36 col1\" >V Measure</td>\n", + " <td id=\"T_326c3_row36_col2\" class=\"data row36 col2\" >Evaluates homogeneity and completeness of a clustering model using the V Measure Score....</td>\n", + " <td id=\"T_326c3_row36_col3\" class=\"data row36 col3\" >False</td>\n", + " <td id=\"T_326c3_row36_col4\" class=\"data row36 col4\" >True</td>\n", + " <td id=\"T_326c3_row36_col5\" class=\"data row36 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row36_col6\" class=\"data row36 col6\" >{}</td>\n", + " <td id=\"T_326c3_row36_col7\" class=\"data row36 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row36_col8\" class=\"data row36 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row37_col0\" class=\"data row37 col0\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", + " <td id=\"T_326c3_row37_col1\" class=\"data row37 col1\" >Weakspots Diagnosis</td>\n", + " <td id=\"T_326c3_row37_col2\" class=\"data row37 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", + " <td id=\"T_326c3_row37_col3\" class=\"data row37 col3\" >True</td>\n", + " <td id=\"T_326c3_row37_col4\" class=\"data row37 col4\" >True</td>\n", + " <td id=\"T_326c3_row37_col5\" class=\"data row37 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row37_col6\" class=\"data 
row37 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row37_col7\" class=\"data row37 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_326c3_row37_col8\" class=\"data row37 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row38_col0\" class=\"data row38 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", + " <td id=\"T_326c3_row38_col1\" class=\"data row38 col1\" >Calibration Curve Drift</td>\n", + " <td id=\"T_326c3_row38_col2\" class=\"data row38 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", + " <td id=\"T_326c3_row38_col3\" class=\"data row38 col3\" >True</td>\n", + " <td id=\"T_326c3_row38_col4\" class=\"data row38 col4\" >True</td>\n", + " <td id=\"T_326c3_row38_col5\" class=\"data row38 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row38_col6\" class=\"data row38 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_326c3_row38_col7\" class=\"data row38 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row38_col8\" class=\"data row38 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row39_col0\" class=\"data row39 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", + " <td id=\"T_326c3_row39_col1\" class=\"data row39 col1\" >Class Discrimination Drift</td>\n", + " <td id=\"T_326c3_row39_col2\" class=\"data row39 col2\" >Compares classification discrimination metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_326c3_row39_col3\" class=\"data row39 col3\" 
>False</td>\n", + " <td id=\"T_326c3_row39_col4\" class=\"data row39 col4\" >True</td>\n", + " <td id=\"T_326c3_row39_col5\" class=\"data row39 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row39_col6\" class=\"data row39 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_326c3_row39_col7\" class=\"data row39 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row39_col8\" class=\"data row39 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row40_col0\" class=\"data row40 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", + " <td id=\"T_326c3_row40_col1\" class=\"data row40 col1\" >Classification Accuracy Drift</td>\n", + " <td id=\"T_326c3_row40_col2\" class=\"data row40 col2\" >Compares classification accuracy metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_326c3_row40_col3\" class=\"data row40 col3\" >False</td>\n", + " <td id=\"T_326c3_row40_col4\" class=\"data row40 col4\" >True</td>\n", + " <td id=\"T_326c3_row40_col5\" class=\"data row40 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row40_col6\" class=\"data row40 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_326c3_row40_col7\" class=\"data row40 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row40_col8\" class=\"data row40 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row41_col0\" class=\"data row41 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", + " <td id=\"T_326c3_row41_col1\" class=\"data row41 col1\" >Confusion Matrix Drift</td>\n", + " <td id=\"T_326c3_row41_col2\" class=\"data row41 col2\" >Compares confusion matrix metrics between reference and 
monitoring datasets....</td>\n", + " <td id=\"T_326c3_row41_col3\" class=\"data row41 col3\" >False</td>\n", + " <td id=\"T_326c3_row41_col4\" class=\"data row41 col4\" >True</td>\n", + " <td id=\"T_326c3_row41_col5\" class=\"data row41 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row41_col6\" class=\"data row41 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_326c3_row41_col7\" class=\"data row41 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row41_col8\" class=\"data row41 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row42_col0\" class=\"data row42 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", + " <td id=\"T_326c3_row42_col1\" class=\"data row42 col1\" >ROC Curve Drift</td>\n", + " <td id=\"T_326c3_row42_col2\" class=\"data row42 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", + " <td id=\"T_326c3_row42_col3\" class=\"data row42 col3\" >True</td>\n", + " <td id=\"T_326c3_row42_col4\" class=\"data row42 col4\" >False</td>\n", + " <td id=\"T_326c3_row42_col5\" class=\"data row42 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row42_col6\" class=\"data row42 col6\" >{}</td>\n", + " <td id=\"T_326c3_row42_col7\" class=\"data row42 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row42_col8\" class=\"data row42 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x1052e6790>" + ] + } + } ] - }, - "execution_count": 9, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tests(filter=\"sklearn\",\n", - " tags=[\"model_performance\", \"visualization\"], task=\"classification\"\n", - ")" - ] - }, - { - 
"cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Store test sets for use\n", - "\n", - "Once you've identified specific sets of tests you'd like to run, you can store the tests in variables, enabling you to easily reuse those tests in later steps.\n", - "\n", - "For example, if you're validating a summarization model, use [`list_tests()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to retrieve all tests tagged for text summarization and save them to `text_summarization_tests` for later use:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use the `task` parameter to find tests that match a specific task type, such as `classification`:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "list_tests(task=\"classification\")" + ], + "execution_count": 7, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_56dd5 th {\n", + " text-align: left;\n", + "}\n", + "#T_56dd5_row0_col0, #T_56dd5_row0_col1, #T_56dd5_row0_col2, #T_56dd5_row0_col3, #T_56dd5_row0_col4, #T_56dd5_row0_col5, #T_56dd5_row0_col6, #T_56dd5_row0_col7, #T_56dd5_row0_col8, #T_56dd5_row1_col0, #T_56dd5_row1_col1, #T_56dd5_row1_col2, #T_56dd5_row1_col3, #T_56dd5_row1_col4, #T_56dd5_row1_col5, #T_56dd5_row1_col6, #T_56dd5_row1_col7, #T_56dd5_row1_col8, #T_56dd5_row2_col0, #T_56dd5_row2_col1, #T_56dd5_row2_col2, #T_56dd5_row2_col3, #T_56dd5_row2_col4, #T_56dd5_row2_col5, #T_56dd5_row2_col6, #T_56dd5_row2_col7, #T_56dd5_row2_col8, #T_56dd5_row3_col0, #T_56dd5_row3_col1, #T_56dd5_row3_col2, #T_56dd5_row3_col3, #T_56dd5_row3_col4, #T_56dd5_row3_col5, #T_56dd5_row3_col6, #T_56dd5_row3_col7, #T_56dd5_row3_col8, #T_56dd5_row4_col0, #T_56dd5_row4_col1, #T_56dd5_row4_col2, #T_56dd5_row4_col3, #T_56dd5_row4_col4, #T_56dd5_row4_col5, 
#T_56dd5_row4_col6, #T_56dd5_row4_col7, #T_56dd5_row4_col8, #T_56dd5_row5_col0, #T_56dd5_row5_col1, #T_56dd5_row5_col2, #T_56dd5_row5_col3, #T_56dd5_row5_col4, #T_56dd5_row5_col5, #T_56dd5_row5_col6, #T_56dd5_row5_col7, #T_56dd5_row5_col8, #T_56dd5_row6_col0, #T_56dd5_row6_col1, #T_56dd5_row6_col2, #T_56dd5_row6_col3, #T_56dd5_row6_col4, #T_56dd5_row6_col5, #T_56dd5_row6_col6, #T_56dd5_row6_col7, #T_56dd5_row6_col8, #T_56dd5_row7_col0, #T_56dd5_row7_col1, #T_56dd5_row7_col2, #T_56dd5_row7_col3, #T_56dd5_row7_col4, #T_56dd5_row7_col5, #T_56dd5_row7_col6, #T_56dd5_row7_col7, #T_56dd5_row7_col8, #T_56dd5_row8_col0, #T_56dd5_row8_col1, #T_56dd5_row8_col2, #T_56dd5_row8_col3, #T_56dd5_row8_col4, #T_56dd5_row8_col5, #T_56dd5_row8_col6, #T_56dd5_row8_col7, #T_56dd5_row8_col8, #T_56dd5_row9_col0, #T_56dd5_row9_col1, #T_56dd5_row9_col2, #T_56dd5_row9_col3, #T_56dd5_row9_col4, #T_56dd5_row9_col5, #T_56dd5_row9_col6, #T_56dd5_row9_col7, #T_56dd5_row9_col8, #T_56dd5_row10_col0, #T_56dd5_row10_col1, #T_56dd5_row10_col2, #T_56dd5_row10_col3, #T_56dd5_row10_col4, #T_56dd5_row10_col5, #T_56dd5_row10_col6, #T_56dd5_row10_col7, #T_56dd5_row10_col8, #T_56dd5_row11_col0, #T_56dd5_row11_col1, #T_56dd5_row11_col2, #T_56dd5_row11_col3, #T_56dd5_row11_col4, #T_56dd5_row11_col5, #T_56dd5_row11_col6, #T_56dd5_row11_col7, #T_56dd5_row11_col8, #T_56dd5_row12_col0, #T_56dd5_row12_col1, #T_56dd5_row12_col2, #T_56dd5_row12_col3, #T_56dd5_row12_col4, #T_56dd5_row12_col5, #T_56dd5_row12_col6, #T_56dd5_row12_col7, #T_56dd5_row12_col8, #T_56dd5_row13_col0, #T_56dd5_row13_col1, #T_56dd5_row13_col2, #T_56dd5_row13_col3, #T_56dd5_row13_col4, #T_56dd5_row13_col5, #T_56dd5_row13_col6, #T_56dd5_row13_col7, #T_56dd5_row13_col8, #T_56dd5_row14_col0, #T_56dd5_row14_col1, #T_56dd5_row14_col2, #T_56dd5_row14_col3, #T_56dd5_row14_col4, #T_56dd5_row14_col5, #T_56dd5_row14_col6, #T_56dd5_row14_col7, #T_56dd5_row14_col8, #T_56dd5_row15_col0, #T_56dd5_row15_col1, #T_56dd5_row15_col2, #T_56dd5_row15_col3, 
#T_56dd5_row15_col4, #T_56dd5_row15_col5, #T_56dd5_row15_col6, #T_56dd5_row15_col7, #T_56dd5_row15_col8, #T_56dd5_row16_col0, #T_56dd5_row16_col1, #T_56dd5_row16_col2, #T_56dd5_row16_col3, #T_56dd5_row16_col4, #T_56dd5_row16_col5, #T_56dd5_row16_col6, #T_56dd5_row16_col7, #T_56dd5_row16_col8, #T_56dd5_row17_col0, #T_56dd5_row17_col1, #T_56dd5_row17_col2, #T_56dd5_row17_col3, #T_56dd5_row17_col4, #T_56dd5_row17_col5, #T_56dd5_row17_col6, #T_56dd5_row17_col7, #T_56dd5_row17_col8, #T_56dd5_row18_col0, #T_56dd5_row18_col1, #T_56dd5_row18_col2, #T_56dd5_row18_col3, #T_56dd5_row18_col4, #T_56dd5_row18_col5, #T_56dd5_row18_col6, #T_56dd5_row18_col7, #T_56dd5_row18_col8, #T_56dd5_row19_col0, #T_56dd5_row19_col1, #T_56dd5_row19_col2, #T_56dd5_row19_col3, #T_56dd5_row19_col4, #T_56dd5_row19_col5, #T_56dd5_row19_col6, #T_56dd5_row19_col7, #T_56dd5_row19_col8, #T_56dd5_row20_col0, #T_56dd5_row20_col1, #T_56dd5_row20_col2, #T_56dd5_row20_col3, #T_56dd5_row20_col4, #T_56dd5_row20_col5, #T_56dd5_row20_col6, #T_56dd5_row20_col7, #T_56dd5_row20_col8, #T_56dd5_row21_col0, #T_56dd5_row21_col1, #T_56dd5_row21_col2, #T_56dd5_row21_col3, #T_56dd5_row21_col4, #T_56dd5_row21_col5, #T_56dd5_row21_col6, #T_56dd5_row21_col7, #T_56dd5_row21_col8, #T_56dd5_row22_col0, #T_56dd5_row22_col1, #T_56dd5_row22_col2, #T_56dd5_row22_col3, #T_56dd5_row22_col4, #T_56dd5_row22_col5, #T_56dd5_row22_col6, #T_56dd5_row22_col7, #T_56dd5_row22_col8, #T_56dd5_row23_col0, #T_56dd5_row23_col1, #T_56dd5_row23_col2, #T_56dd5_row23_col3, #T_56dd5_row23_col4, #T_56dd5_row23_col5, #T_56dd5_row23_col6, #T_56dd5_row23_col7, #T_56dd5_row23_col8, #T_56dd5_row24_col0, #T_56dd5_row24_col1, #T_56dd5_row24_col2, #T_56dd5_row24_col3, #T_56dd5_row24_col4, #T_56dd5_row24_col5, #T_56dd5_row24_col6, #T_56dd5_row24_col7, #T_56dd5_row24_col8, #T_56dd5_row25_col0, #T_56dd5_row25_col1, #T_56dd5_row25_col2, #T_56dd5_row25_col3, #T_56dd5_row25_col4, #T_56dd5_row25_col5, #T_56dd5_row25_col6, #T_56dd5_row25_col7, #T_56dd5_row25_col8, 
#T_56dd5_row26_col0, #T_56dd5_row26_col1, #T_56dd5_row26_col2, #T_56dd5_row26_col3, #T_56dd5_row26_col4, #T_56dd5_row26_col5, #T_56dd5_row26_col6, #T_56dd5_row26_col7, #T_56dd5_row26_col8, #T_56dd5_row27_col0, #T_56dd5_row27_col1, #T_56dd5_row27_col2, #T_56dd5_row27_col3, #T_56dd5_row27_col4, #T_56dd5_row27_col5, #T_56dd5_row27_col6, #T_56dd5_row27_col7, #T_56dd5_row27_col8, #T_56dd5_row28_col0, #T_56dd5_row28_col1, #T_56dd5_row28_col2, #T_56dd5_row28_col3, #T_56dd5_row28_col4, #T_56dd5_row28_col5, #T_56dd5_row28_col6, #T_56dd5_row28_col7, #T_56dd5_row28_col8, #T_56dd5_row29_col0, #T_56dd5_row29_col1, #T_56dd5_row29_col2, #T_56dd5_row29_col3, #T_56dd5_row29_col4, #T_56dd5_row29_col5, #T_56dd5_row29_col6, #T_56dd5_row29_col7, #T_56dd5_row29_col8, #T_56dd5_row30_col0, #T_56dd5_row30_col1, #T_56dd5_row30_col2, #T_56dd5_row30_col3, #T_56dd5_row30_col4, #T_56dd5_row30_col5, #T_56dd5_row30_col6, #T_56dd5_row30_col7, #T_56dd5_row30_col8, #T_56dd5_row31_col0, #T_56dd5_row31_col1, #T_56dd5_row31_col2, #T_56dd5_row31_col3, #T_56dd5_row31_col4, #T_56dd5_row31_col5, #T_56dd5_row31_col6, #T_56dd5_row31_col7, #T_56dd5_row31_col8, #T_56dd5_row32_col0, #T_56dd5_row32_col1, #T_56dd5_row32_col2, #T_56dd5_row32_col3, #T_56dd5_row32_col4, #T_56dd5_row32_col5, #T_56dd5_row32_col6, #T_56dd5_row32_col7, #T_56dd5_row32_col8, #T_56dd5_row33_col0, #T_56dd5_row33_col1, #T_56dd5_row33_col2, #T_56dd5_row33_col3, #T_56dd5_row33_col4, #T_56dd5_row33_col5, #T_56dd5_row33_col6, #T_56dd5_row33_col7, #T_56dd5_row33_col8, #T_56dd5_row34_col0, #T_56dd5_row34_col1, #T_56dd5_row34_col2, #T_56dd5_row34_col3, #T_56dd5_row34_col4, #T_56dd5_row34_col5, #T_56dd5_row34_col6, #T_56dd5_row34_col7, #T_56dd5_row34_col8, #T_56dd5_row35_col0, #T_56dd5_row35_col1, #T_56dd5_row35_col2, #T_56dd5_row35_col3, #T_56dd5_row35_col4, #T_56dd5_row35_col5, #T_56dd5_row35_col6, #T_56dd5_row35_col7, #T_56dd5_row35_col8, #T_56dd5_row36_col0, #T_56dd5_row36_col1, #T_56dd5_row36_col2, #T_56dd5_row36_col3, #T_56dd5_row36_col4, 
#T_56dd5_row36_col5, #T_56dd5_row36_col6, #T_56dd5_row36_col7, #T_56dd5_row36_col8, #T_56dd5_row37_col0, #T_56dd5_row37_col1, #T_56dd5_row37_col2, #T_56dd5_row37_col3, #T_56dd5_row37_col4, #T_56dd5_row37_col5, #T_56dd5_row37_col6, #T_56dd5_row37_col7, #T_56dd5_row37_col8, #T_56dd5_row38_col0, #T_56dd5_row38_col1, #T_56dd5_row38_col2, #T_56dd5_row38_col3, #T_56dd5_row38_col4, #T_56dd5_row38_col5, #T_56dd5_row38_col6, #T_56dd5_row38_col7, #T_56dd5_row38_col8, #T_56dd5_row39_col0, #T_56dd5_row39_col1, #T_56dd5_row39_col2, #T_56dd5_row39_col3, #T_56dd5_row39_col4, #T_56dd5_row39_col5, #T_56dd5_row39_col6, #T_56dd5_row39_col7, #T_56dd5_row39_col8, #T_56dd5_row40_col0, #T_56dd5_row40_col1, #T_56dd5_row40_col2, #T_56dd5_row40_col3, #T_56dd5_row40_col4, #T_56dd5_row40_col5, #T_56dd5_row40_col6, #T_56dd5_row40_col7, #T_56dd5_row40_col8, #T_56dd5_row41_col0, #T_56dd5_row41_col1, #T_56dd5_row41_col2, #T_56dd5_row41_col3, #T_56dd5_row41_col4, #T_56dd5_row41_col5, #T_56dd5_row41_col6, #T_56dd5_row41_col7, #T_56dd5_row41_col8, #T_56dd5_row42_col0, #T_56dd5_row42_col1, #T_56dd5_row42_col2, #T_56dd5_row42_col3, #T_56dd5_row42_col4, #T_56dd5_row42_col5, #T_56dd5_row42_col6, #T_56dd5_row42_col7, #T_56dd5_row42_col8, #T_56dd5_row43_col0, #T_56dd5_row43_col1, #T_56dd5_row43_col2, #T_56dd5_row43_col3, #T_56dd5_row43_col4, #T_56dd5_row43_col5, #T_56dd5_row43_col6, #T_56dd5_row43_col7, #T_56dd5_row43_col8, #T_56dd5_row44_col0, #T_56dd5_row44_col1, #T_56dd5_row44_col2, #T_56dd5_row44_col3, #T_56dd5_row44_col4, #T_56dd5_row44_col5, #T_56dd5_row44_col6, #T_56dd5_row44_col7, #T_56dd5_row44_col8, #T_56dd5_row45_col0, #T_56dd5_row45_col1, #T_56dd5_row45_col2, #T_56dd5_row45_col3, #T_56dd5_row45_col4, #T_56dd5_row45_col5, #T_56dd5_row45_col6, #T_56dd5_row45_col7, #T_56dd5_row45_col8, #T_56dd5_row46_col0, #T_56dd5_row46_col1, #T_56dd5_row46_col2, #T_56dd5_row46_col3, #T_56dd5_row46_col4, #T_56dd5_row46_col5, #T_56dd5_row46_col6, #T_56dd5_row46_col7, #T_56dd5_row46_col8, #T_56dd5_row47_col0, 
#T_56dd5_row47_col1, #T_56dd5_row47_col2, #T_56dd5_row47_col3, #T_56dd5_row47_col4, #T_56dd5_row47_col5, #T_56dd5_row47_col6, #T_56dd5_row47_col7, #T_56dd5_row47_col8, #T_56dd5_row48_col0, #T_56dd5_row48_col1, #T_56dd5_row48_col2, #T_56dd5_row48_col3, #T_56dd5_row48_col4, #T_56dd5_row48_col5, #T_56dd5_row48_col6, #T_56dd5_row48_col7, #T_56dd5_row48_col8, #T_56dd5_row49_col0, #T_56dd5_row49_col1, #T_56dd5_row49_col2, #T_56dd5_row49_col3, #T_56dd5_row49_col4, #T_56dd5_row49_col5, #T_56dd5_row49_col6, #T_56dd5_row49_col7, #T_56dd5_row49_col8, #T_56dd5_row50_col0, #T_56dd5_row50_col1, #T_56dd5_row50_col2, #T_56dd5_row50_col3, #T_56dd5_row50_col4, #T_56dd5_row50_col5, #T_56dd5_row50_col6, #T_56dd5_row50_col7, #T_56dd5_row50_col8, #T_56dd5_row51_col0, #T_56dd5_row51_col1, #T_56dd5_row51_col2, #T_56dd5_row51_col3, #T_56dd5_row51_col4, #T_56dd5_row51_col5, #T_56dd5_row51_col6, #T_56dd5_row51_col7, #T_56dd5_row51_col8, #T_56dd5_row52_col0, #T_56dd5_row52_col1, #T_56dd5_row52_col2, #T_56dd5_row52_col3, #T_56dd5_row52_col4, #T_56dd5_row52_col5, #T_56dd5_row52_col6, #T_56dd5_row52_col7, #T_56dd5_row52_col8, #T_56dd5_row53_col0, #T_56dd5_row53_col1, #T_56dd5_row53_col2, #T_56dd5_row53_col3, #T_56dd5_row53_col4, #T_56dd5_row53_col5, #T_56dd5_row53_col6, #T_56dd5_row53_col7, #T_56dd5_row53_col8, #T_56dd5_row54_col0, #T_56dd5_row54_col1, #T_56dd5_row54_col2, #T_56dd5_row54_col3, #T_56dd5_row54_col4, #T_56dd5_row54_col5, #T_56dd5_row54_col6, #T_56dd5_row54_col7, #T_56dd5_row54_col8, #T_56dd5_row55_col0, #T_56dd5_row55_col1, #T_56dd5_row55_col2, #T_56dd5_row55_col3, #T_56dd5_row55_col4, #T_56dd5_row55_col5, #T_56dd5_row55_col6, #T_56dd5_row55_col7, #T_56dd5_row55_col8, #T_56dd5_row56_col0, #T_56dd5_row56_col1, #T_56dd5_row56_col2, #T_56dd5_row56_col3, #T_56dd5_row56_col4, #T_56dd5_row56_col5, #T_56dd5_row56_col6, #T_56dd5_row56_col7, #T_56dd5_row56_col8, #T_56dd5_row57_col0, #T_56dd5_row57_col1, #T_56dd5_row57_col2, #T_56dd5_row57_col3, #T_56dd5_row57_col4, #T_56dd5_row57_col5, 
#T_56dd5_row57_col6, #T_56dd5_row57_col7, #T_56dd5_row57_col8, #T_56dd5_row58_col0, #T_56dd5_row58_col1, #T_56dd5_row58_col2, #T_56dd5_row58_col3, #T_56dd5_row58_col4, #T_56dd5_row58_col5, #T_56dd5_row58_col6, #T_56dd5_row58_col7, #T_56dd5_row58_col8, #T_56dd5_row59_col0, #T_56dd5_row59_col1, #T_56dd5_row59_col2, #T_56dd5_row59_col3, #T_56dd5_row59_col4, #T_56dd5_row59_col5, #T_56dd5_row59_col6, #T_56dd5_row59_col7, #T_56dd5_row59_col8, #T_56dd5_row60_col0, #T_56dd5_row60_col1, #T_56dd5_row60_col2, #T_56dd5_row60_col3, #T_56dd5_row60_col4, #T_56dd5_row60_col5, #T_56dd5_row60_col6, #T_56dd5_row60_col7, #T_56dd5_row60_col8, #T_56dd5_row61_col0, #T_56dd5_row61_col1, #T_56dd5_row61_col2, #T_56dd5_row61_col3, #T_56dd5_row61_col4, #T_56dd5_row61_col5, #T_56dd5_row61_col6, #T_56dd5_row61_col7, #T_56dd5_row61_col8, #T_56dd5_row62_col0, #T_56dd5_row62_col1, #T_56dd5_row62_col2, #T_56dd5_row62_col3, #T_56dd5_row62_col4, #T_56dd5_row62_col5, #T_56dd5_row62_col6, #T_56dd5_row62_col7, #T_56dd5_row62_col8, #T_56dd5_row63_col0, #T_56dd5_row63_col1, #T_56dd5_row63_col2, #T_56dd5_row63_col3, #T_56dd5_row63_col4, #T_56dd5_row63_col5, #T_56dd5_row63_col6, #T_56dd5_row63_col7, #T_56dd5_row63_col8, #T_56dd5_row64_col0, #T_56dd5_row64_col1, #T_56dd5_row64_col2, #T_56dd5_row64_col3, #T_56dd5_row64_col4, #T_56dd5_row64_col5, #T_56dd5_row64_col6, #T_56dd5_row64_col7, #T_56dd5_row64_col8, #T_56dd5_row65_col0, #T_56dd5_row65_col1, #T_56dd5_row65_col2, #T_56dd5_row65_col3, #T_56dd5_row65_col4, #T_56dd5_row65_col5, #T_56dd5_row65_col6, #T_56dd5_row65_col7, #T_56dd5_row65_col8, #T_56dd5_row66_col0, #T_56dd5_row66_col1, #T_56dd5_row66_col2, #T_56dd5_row66_col3, #T_56dd5_row66_col4, #T_56dd5_row66_col5, #T_56dd5_row66_col6, #T_56dd5_row66_col7, #T_56dd5_row66_col8, #T_56dd5_row67_col0, #T_56dd5_row67_col1, #T_56dd5_row67_col2, #T_56dd5_row67_col3, #T_56dd5_row67_col4, #T_56dd5_row67_col5, #T_56dd5_row67_col6, #T_56dd5_row67_col7, #T_56dd5_row67_col8, #T_56dd5_row68_col0, #T_56dd5_row68_col1, 
#T_56dd5_row68_col2, #T_56dd5_row68_col3, #T_56dd5_row68_col4, #T_56dd5_row68_col5, #T_56dd5_row68_col6, #T_56dd5_row68_col7, #T_56dd5_row68_col8, #T_56dd5_row69_col0, #T_56dd5_row69_col1, #T_56dd5_row69_col2, #T_56dd5_row69_col3, #T_56dd5_row69_col4, #T_56dd5_row69_col5, #T_56dd5_row69_col6, #T_56dd5_row69_col7, #T_56dd5_row69_col8, #T_56dd5_row70_col0, #T_56dd5_row70_col1, #T_56dd5_row70_col2, #T_56dd5_row70_col3, #T_56dd5_row70_col4, #T_56dd5_row70_col5, #T_56dd5_row70_col6, #T_56dd5_row70_col7, #T_56dd5_row70_col8, #T_56dd5_row71_col0, #T_56dd5_row71_col1, #T_56dd5_row71_col2, #T_56dd5_row71_col3, #T_56dd5_row71_col4, #T_56dd5_row71_col5, #T_56dd5_row71_col6, #T_56dd5_row71_col7, #T_56dd5_row71_col8, #T_56dd5_row72_col0, #T_56dd5_row72_col1, #T_56dd5_row72_col2, #T_56dd5_row72_col3, #T_56dd5_row72_col4, #T_56dd5_row72_col5, #T_56dd5_row72_col6, #T_56dd5_row72_col7, #T_56dd5_row72_col8, #T_56dd5_row73_col0, #T_56dd5_row73_col1, #T_56dd5_row73_col2, #T_56dd5_row73_col3, #T_56dd5_row73_col4, #T_56dd5_row73_col5, #T_56dd5_row73_col6, #T_56dd5_row73_col7, #T_56dd5_row73_col8, #T_56dd5_row74_col0, #T_56dd5_row74_col1, #T_56dd5_row74_col2, #T_56dd5_row74_col3, #T_56dd5_row74_col4, #T_56dd5_row74_col5, #T_56dd5_row74_col6, #T_56dd5_row74_col7, #T_56dd5_row74_col8, #T_56dd5_row75_col0, #T_56dd5_row75_col1, #T_56dd5_row75_col2, #T_56dd5_row75_col3, #T_56dd5_row75_col4, #T_56dd5_row75_col5, #T_56dd5_row75_col6, #T_56dd5_row75_col7, #T_56dd5_row75_col8 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_56dd5\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_56dd5_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_56dd5_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th id=\"T_56dd5_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_56dd5_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", + " <th id=\"T_56dd5_level0_col4\" class=\"col_heading level0 
col4\" >Has Table</th>\n", + " <th id=\"T_56dd5_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", + " <th id=\"T_56dd5_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", + " <th id=\"T_56dd5_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", + " <th id=\"T_56dd5_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_56dd5_row0_col0\" class=\"data row0 col0\" >validmind.data_validation.BivariateScatterPlots</td>\n", + " <td id=\"T_56dd5_row0_col1\" class=\"data row0 col1\" >Bivariate Scatter Plots</td>\n", + " <td id=\"T_56dd5_row0_col2\" class=\"data row0 col2\" >Generates bivariate scatterplots to visually inspect relationships between pairs of numerical predictor variables...</td>\n", + " <td id=\"T_56dd5_row0_col3\" class=\"data row0 col3\" >True</td>\n", + " <td id=\"T_56dd5_row0_col4\" class=\"data row0 col4\" >False</td>\n", + " <td id=\"T_56dd5_row0_col5\" class=\"data row0 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row0_col6\" class=\"data row0 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row0_col7\" class=\"data row0 col7\" >['tabular_data', 'numerical_data', 'visualization']</td>\n", + " <td id=\"T_56dd5_row0_col8\" class=\"data row0 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row1_col0\" class=\"data row1 col0\" >validmind.data_validation.ChiSquaredFeaturesTable</td>\n", + " <td id=\"T_56dd5_row1_col1\" class=\"data row1 col1\" >Chi Squared Features Table</td>\n", + " <td id=\"T_56dd5_row1_col2\" class=\"data row1 col2\" >Assesses the statistical association between categorical features and a target variable using the Chi-Squared test....</td>\n", + " <td id=\"T_56dd5_row1_col3\" class=\"data row1 col3\" >False</td>\n", + " <td id=\"T_56dd5_row1_col4\" class=\"data row1 col4\" >True</td>\n", + " <td id=\"T_56dd5_row1_col5\" class=\"data row1 col5\" >['dataset']</td>\n", + " <td 
id=\"T_56dd5_row1_col6\" class=\"data row1 col6\" >{'p_threshold': {'type': '_empty', 'default': 0.05}}</td>\n", + " <td id=\"T_56dd5_row1_col7\" class=\"data row1 col7\" >['tabular_data', 'categorical_data', 'statistical_test']</td>\n", + " <td id=\"T_56dd5_row1_col8\" class=\"data row1 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row2_col0\" class=\"data row2 col0\" >validmind.data_validation.ClassImbalance</td>\n", + " <td id=\"T_56dd5_row2_col1\" class=\"data row2 col1\" >Class Imbalance</td>\n", + " <td id=\"T_56dd5_row2_col2\" class=\"data row2 col2\" >Evaluates and quantifies class distribution imbalance in a dataset used by a machine learning model....</td>\n", + " <td id=\"T_56dd5_row2_col3\" class=\"data row2 col3\" >True</td>\n", + " <td id=\"T_56dd5_row2_col4\" class=\"data row2 col4\" >True</td>\n", + " <td id=\"T_56dd5_row2_col5\" class=\"data row2 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row2_col6\" class=\"data row2 col6\" >{'min_percent_threshold': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_56dd5_row2_col7\" class=\"data row2 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification', 'data_quality']</td>\n", + " <td id=\"T_56dd5_row2_col8\" class=\"data row2 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row3_col0\" class=\"data row3 col0\" >validmind.data_validation.DatasetDescription</td>\n", + " <td id=\"T_56dd5_row3_col1\" class=\"data row3 col1\" >Dataset Description</td>\n", + " <td id=\"T_56dd5_row3_col2\" class=\"data row3 col2\" >Provides comprehensive analysis and statistical summaries of each column in a machine learning model's dataset....</td>\n", + " <td id=\"T_56dd5_row3_col3\" class=\"data row3 col3\" >False</td>\n", + " <td id=\"T_56dd5_row3_col4\" class=\"data row3 col4\" >True</td>\n", + " <td id=\"T_56dd5_row3_col5\" class=\"data row3 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row3_col6\" class=\"data row3 
col6\" >{}</td>\n", + " <td id=\"T_56dd5_row3_col7\" class=\"data row3 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", + " <td id=\"T_56dd5_row3_col8\" class=\"data row3 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row4_col0\" class=\"data row4 col0\" >validmind.data_validation.DatasetSplit</td>\n", + " <td id=\"T_56dd5_row4_col1\" class=\"data row4 col1\" >Dataset Split</td>\n", + " <td id=\"T_56dd5_row4_col2\" class=\"data row4 col2\" >Evaluates and visualizes the distribution proportions among training, testing, and validation datasets of an ML...</td>\n", + " <td id=\"T_56dd5_row4_col3\" class=\"data row4 col3\" >False</td>\n", + " <td id=\"T_56dd5_row4_col4\" class=\"data row4 col4\" >True</td>\n", + " <td id=\"T_56dd5_row4_col5\" class=\"data row4 col5\" >['datasets']</td>\n", + " <td id=\"T_56dd5_row4_col6\" class=\"data row4 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row4_col7\" class=\"data row4 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", + " <td id=\"T_56dd5_row4_col8\" class=\"data row4 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row5_col0\" class=\"data row5 col0\" >validmind.data_validation.DescriptiveStatistics</td>\n", + " <td id=\"T_56dd5_row5_col1\" class=\"data row5 col1\" >Descriptive Statistics</td>\n", + " <td id=\"T_56dd5_row5_col2\" class=\"data row5 col2\" >Performs a detailed descriptive statistical analysis of both numerical and categorical data within a model's...</td>\n", + " <td id=\"T_56dd5_row5_col3\" class=\"data row5 col3\" >False</td>\n", + " <td id=\"T_56dd5_row5_col4\" class=\"data row5 col4\" >True</td>\n", + " <td id=\"T_56dd5_row5_col5\" class=\"data row5 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row5_col6\" class=\"data row5 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row5_col7\" 
class=\"data row5 col7\" >['tabular_data', 'time_series_data', 'data_quality']</td>\n", + " <td id=\"T_56dd5_row5_col8\" class=\"data row5 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row6_col0\" class=\"data row6 col0\" >validmind.data_validation.Duplicates</td>\n", + " <td id=\"T_56dd5_row6_col1\" class=\"data row6 col1\" >Duplicates</td>\n", + " <td id=\"T_56dd5_row6_col2\" class=\"data row6 col2\" >Tests dataset for duplicate entries, ensuring model reliability via data quality verification....</td>\n", + " <td id=\"T_56dd5_row6_col3\" class=\"data row6 col3\" >False</td>\n", + " <td id=\"T_56dd5_row6_col4\" class=\"data row6 col4\" >True</td>\n", + " <td id=\"T_56dd5_row6_col5\" class=\"data row6 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row6_col6\" class=\"data row6 col6\" >{'min_threshold': {'type': '_empty', 'default': 1}}</td>\n", + " <td id=\"T_56dd5_row6_col7\" class=\"data row6 col7\" >['tabular_data', 'data_quality', 'text_data']</td>\n", + " <td id=\"T_56dd5_row6_col8\" class=\"data row6 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row7_col0\" class=\"data row7 col0\" >validmind.data_validation.FeatureTargetCorrelationPlot</td>\n", + " <td id=\"T_56dd5_row7_col1\" class=\"data row7 col1\" >Feature Target Correlation Plot</td>\n", + " <td id=\"T_56dd5_row7_col2\" class=\"data row7 col2\" >Visualizes the correlation between input features and the model's target output in a color-coded horizontal bar...</td>\n", + " <td id=\"T_56dd5_row7_col3\" class=\"data row7 col3\" >True</td>\n", + " <td id=\"T_56dd5_row7_col4\" class=\"data row7 col4\" >False</td>\n", + " <td id=\"T_56dd5_row7_col5\" class=\"data row7 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row7_col6\" class=\"data row7 col6\" >{'fig_height': {'type': '_empty', 'default': 600}}</td>\n", + " <td id=\"T_56dd5_row7_col7\" class=\"data row7 col7\" >['tabular_data', 'visualization', 
'correlation']</td>\n", + " <td id=\"T_56dd5_row7_col8\" class=\"data row7 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row8_col0\" class=\"data row8 col0\" >validmind.data_validation.HighCardinality</td>\n", + " <td id=\"T_56dd5_row8_col1\" class=\"data row8 col1\" >High Cardinality</td>\n", + " <td id=\"T_56dd5_row8_col2\" class=\"data row8 col2\" >Assesses the number of unique values in categorical columns to detect high cardinality and potential overfitting....</td>\n", + " <td id=\"T_56dd5_row8_col3\" class=\"data row8 col3\" >False</td>\n", + " <td id=\"T_56dd5_row8_col4\" class=\"data row8 col4\" >True</td>\n", + " <td id=\"T_56dd5_row8_col5\" class=\"data row8 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row8_col6\" class=\"data row8 col6\" >{'num_threshold': {'type': 'int', 'default': 100}, 'percent_threshold': {'type': 'float', 'default': 0.1}, 'threshold_type': {'type': 'str', 'default': 'percent'}}</td>\n", + " <td id=\"T_56dd5_row8_col7\" class=\"data row8 col7\" >['tabular_data', 'data_quality', 'categorical_data']</td>\n", + " <td id=\"T_56dd5_row8_col8\" class=\"data row8 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row9_col0\" class=\"data row9 col0\" >validmind.data_validation.HighPearsonCorrelation</td>\n", + " <td id=\"T_56dd5_row9_col1\" class=\"data row9 col1\" >High Pearson Correlation</td>\n", + " <td id=\"T_56dd5_row9_col2\" class=\"data row9 col2\" >Identifies highly correlated feature pairs in a dataset suggesting feature redundancy or multicollinearity....</td>\n", + " <td id=\"T_56dd5_row9_col3\" class=\"data row9 col3\" >False</td>\n", + " <td id=\"T_56dd5_row9_col4\" class=\"data row9 col4\" >True</td>\n", + " <td id=\"T_56dd5_row9_col5\" class=\"data row9 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row9_col6\" class=\"data row9 col6\" >{'max_threshold': {'type': 'float', 'default': 0.3}, 'top_n_correlations': {'type': 
'int', 'default': 10}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_56dd5_row9_col7\" class=\"data row9 col7\" >['tabular_data', 'data_quality', 'correlation']</td>\n", + " <td id=\"T_56dd5_row9_col8\" class=\"data row9 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row10_col0\" class=\"data row10 col0\" >validmind.data_validation.IQROutliersBarPlot</td>\n", + " <td id=\"T_56dd5_row10_col1\" class=\"data row10 col1\" >IQR Outliers Bar Plot</td>\n", + " <td id=\"T_56dd5_row10_col2\" class=\"data row10 col2\" >Visualizes outlier distribution across percentiles in numerical data using the Interquartile Range (IQR) method....</td>\n", + " <td id=\"T_56dd5_row10_col3\" class=\"data row10 col3\" >True</td>\n", + " <td id=\"T_56dd5_row10_col4\" class=\"data row10 col4\" >False</td>\n", + " <td id=\"T_56dd5_row10_col5\" class=\"data row10 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row10_col6\" class=\"data row10 col6\" >{'threshold': {'type': 'float', 'default': 1.5}, 'fig_width': {'type': 'int', 'default': 800}}</td>\n", + " <td id=\"T_56dd5_row10_col7\" class=\"data row10 col7\" >['tabular_data', 'visualization', 'numerical_data']</td>\n", + " <td id=\"T_56dd5_row10_col8\" class=\"data row10 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row11_col0\" class=\"data row11 col0\" >validmind.data_validation.IQROutliersTable</td>\n", + " <td id=\"T_56dd5_row11_col1\" class=\"data row11 col1\" >IQR Outliers Table</td>\n", + " <td id=\"T_56dd5_row11_col2\" class=\"data row11 col2\" >Determines and summarizes outliers in numerical features using the Interquartile Range method....</td>\n", + " <td id=\"T_56dd5_row11_col3\" class=\"data row11 col3\" >False</td>\n", + " <td id=\"T_56dd5_row11_col4\" class=\"data row11 col4\" >True</td>\n", + " <td id=\"T_56dd5_row11_col5\" class=\"data row11 col5\" >['dataset']</td>\n", + " <td 
id=\"T_56dd5_row11_col6\" class=\"data row11 col6\" >{'threshold': {'type': 'float', 'default': 1.5}}</td>\n", + " <td id=\"T_56dd5_row11_col7\" class=\"data row11 col7\" >['tabular_data', 'numerical_data']</td>\n", + " <td id=\"T_56dd5_row11_col8\" class=\"data row11 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row12_col0\" class=\"data row12 col0\" >validmind.data_validation.IsolationForestOutliers</td>\n", + " <td id=\"T_56dd5_row12_col1\" class=\"data row12 col1\" >Isolation Forest Outliers</td>\n", + " <td id=\"T_56dd5_row12_col2\" class=\"data row12 col2\" >Detects outliers in a dataset using the Isolation Forest algorithm and visualizes results through scatter plots....</td>\n", + " <td id=\"T_56dd5_row12_col3\" class=\"data row12 col3\" >True</td>\n", + " <td id=\"T_56dd5_row12_col4\" class=\"data row12 col4\" >False</td>\n", + " <td id=\"T_56dd5_row12_col5\" class=\"data row12 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row12_col6\" class=\"data row12 col6\" >{'random_state': {'type': 'int', 'default': 0}, 'contamination': {'type': 'float', 'default': 0.1}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_56dd5_row12_col7\" class=\"data row12 col7\" >['tabular_data', 'anomaly_detection']</td>\n", + " <td id=\"T_56dd5_row12_col8\" class=\"data row12 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row13_col0\" class=\"data row13 col0\" >validmind.data_validation.JarqueBera</td>\n", + " <td id=\"T_56dd5_row13_col1\" class=\"data row13 col1\" >Jarque Bera</td>\n", + " <td id=\"T_56dd5_row13_col2\" class=\"data row13 col2\" >Assesses normality of dataset features in an ML model using the Jarque-Bera test....</td>\n", + " <td id=\"T_56dd5_row13_col3\" class=\"data row13 col3\" >False</td>\n", + " <td id=\"T_56dd5_row13_col4\" class=\"data row13 col4\" >True</td>\n", + " <td id=\"T_56dd5_row13_col5\" class=\"data row13 col5\" 
>['dataset']</td>\n", + " <td id=\"T_56dd5_row13_col6\" class=\"data row13 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row13_col7\" class=\"data row13 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_56dd5_row13_col8\" class=\"data row13 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row14_col0\" class=\"data row14 col0\" >validmind.data_validation.MissingValues</td>\n", + " <td id=\"T_56dd5_row14_col1\" class=\"data row14 col1\" >Missing Values</td>\n", + " <td id=\"T_56dd5_row14_col2\" class=\"data row14 col2\" >Evaluates dataset quality by ensuring missing value ratio across all features does not exceed a set threshold....</td>\n", + " <td id=\"T_56dd5_row14_col3\" class=\"data row14 col3\" >False</td>\n", + " <td id=\"T_56dd5_row14_col4\" class=\"data row14 col4\" >True</td>\n", + " <td id=\"T_56dd5_row14_col5\" class=\"data row14 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row14_col6\" class=\"data row14 col6\" >{'min_threshold': {'type': 'int', 'default': 1}}</td>\n", + " <td id=\"T_56dd5_row14_col7\" class=\"data row14 col7\" >['tabular_data', 'data_quality']</td>\n", + " <td id=\"T_56dd5_row14_col8\" class=\"data row14 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row15_col0\" class=\"data row15 col0\" >validmind.data_validation.MissingValuesBarPlot</td>\n", + " <td id=\"T_56dd5_row15_col1\" class=\"data row15 col1\" >Missing Values Bar Plot</td>\n", + " <td id=\"T_56dd5_row15_col2\" class=\"data row15 col2\" >Assesses the percentage and distribution of missing values in the dataset via a bar plot, with emphasis on...</td>\n", + " <td id=\"T_56dd5_row15_col3\" class=\"data row15 col3\" >True</td>\n", + " <td id=\"T_56dd5_row15_col4\" class=\"data row15 col4\" >False</td>\n", + " <td id=\"T_56dd5_row15_col5\" class=\"data row15 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row15_col6\" 
class=\"data row15 col6\" >{'threshold': {'type': 'int', 'default': 80}, 'fig_height': {'type': 'int', 'default': 600}}</td>\n", + " <td id=\"T_56dd5_row15_col7\" class=\"data row15 col7\" >['tabular_data', 'data_quality', 'visualization']</td>\n", + " <td id=\"T_56dd5_row15_col8\" class=\"data row15 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row16_col0\" class=\"data row16 col0\" >validmind.data_validation.MutualInformation</td>\n", + " <td id=\"T_56dd5_row16_col1\" class=\"data row16 col1\" >Mutual Information</td>\n", + " <td id=\"T_56dd5_row16_col2\" class=\"data row16 col2\" >Calculates mutual information scores between features and target variable to evaluate feature relevance....</td>\n", + " <td id=\"T_56dd5_row16_col3\" class=\"data row16 col3\" >True</td>\n", + " <td id=\"T_56dd5_row16_col4\" class=\"data row16 col4\" >False</td>\n", + " <td id=\"T_56dd5_row16_col5\" class=\"data row16 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row16_col6\" class=\"data row16 col6\" >{'min_threshold': {'type': 'float', 'default': 0.01}, 'task': {'type': 'str', 'default': 'classification'}}</td>\n", + " <td id=\"T_56dd5_row16_col7\" class=\"data row16 col7\" >['feature_selection', 'data_analysis']</td>\n", + " <td id=\"T_56dd5_row16_col8\" class=\"data row16 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row17_col0\" class=\"data row17 col0\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", + " <td id=\"T_56dd5_row17_col1\" class=\"data row17 col1\" >Pearson Correlation Matrix</td>\n", + " <td id=\"T_56dd5_row17_col2\" class=\"data row17 col2\" >Evaluates linear dependency between numerical variables in a dataset via a Pearson Correlation coefficient heat map....</td>\n", + " <td id=\"T_56dd5_row17_col3\" class=\"data row17 col3\" >True</td>\n", + " <td id=\"T_56dd5_row17_col4\" class=\"data row17 col4\" >False</td>\n", + " <td 
id=\"T_56dd5_row17_col5\" class=\"data row17 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row17_col6\" class=\"data row17 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row17_col7\" class=\"data row17 col7\" >['tabular_data', 'numerical_data', 'correlation']</td>\n", + " <td id=\"T_56dd5_row17_col8\" class=\"data row17 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row18_col0\" class=\"data row18 col0\" >validmind.data_validation.ProtectedClassesDescription</td>\n", + " <td id=\"T_56dd5_row18_col1\" class=\"data row18 col1\" >Protected Classes Description</td>\n", + " <td id=\"T_56dd5_row18_col2\" class=\"data row18 col2\" >Visualizes the distribution of protected classes in the dataset relative to the target variable...</td>\n", + " <td id=\"T_56dd5_row18_col3\" class=\"data row18 col3\" >True</td>\n", + " <td id=\"T_56dd5_row18_col4\" class=\"data row18 col4\" >True</td>\n", + " <td id=\"T_56dd5_row18_col5\" class=\"data row18 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row18_col6\" class=\"data row18 col6\" >{'protected_classes': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_56dd5_row18_col7\" class=\"data row18 col7\" >['bias_and_fairness', 'descriptive_statistics']</td>\n", + " <td id=\"T_56dd5_row18_col8\" class=\"data row18 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row19_col0\" class=\"data row19 col0\" >validmind.data_validation.RunsTest</td>\n", + " <td id=\"T_56dd5_row19_col1\" class=\"data row19 col1\" >Runs Test</td>\n", + " <td id=\"T_56dd5_row19_col2\" class=\"data row19 col2\" >Executes Runs Test on ML model to detect non-random patterns in output data sequence....</td>\n", + " <td id=\"T_56dd5_row19_col3\" class=\"data row19 col3\" >False</td>\n", + " <td id=\"T_56dd5_row19_col4\" class=\"data row19 col4\" >True</td>\n", + " <td id=\"T_56dd5_row19_col5\" class=\"data row19 col5\" >['dataset']</td>\n", + " <td 
id=\"T_56dd5_row19_col6\" class=\"data row19 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row19_col7\" class=\"data row19 col7\" >['tabular_data', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_56dd5_row19_col8\" class=\"data row19 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row20_col0\" class=\"data row20 col0\" >validmind.data_validation.ScatterPlot</td>\n", + " <td id=\"T_56dd5_row20_col1\" class=\"data row20 col1\" >Scatter Plot</td>\n", + " <td id=\"T_56dd5_row20_col2\" class=\"data row20 col2\" >Assesses visual relationships, patterns, and outliers among features in a dataset through scatter plot matrices....</td>\n", + " <td id=\"T_56dd5_row20_col3\" class=\"data row20 col3\" >True</td>\n", + " <td id=\"T_56dd5_row20_col4\" class=\"data row20 col4\" >False</td>\n", + " <td id=\"T_56dd5_row20_col5\" class=\"data row20 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row20_col6\" class=\"data row20 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row20_col7\" class=\"data row20 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_56dd5_row20_col8\" class=\"data row20 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row21_col0\" class=\"data row21 col0\" >validmind.data_validation.ScoreBandDefaultRates</td>\n", + " <td id=\"T_56dd5_row21_col1\" class=\"data row21 col1\" >Score Band Default Rates</td>\n", + " <td id=\"T_56dd5_row21_col2\" class=\"data row21 col2\" >Analyzes default rates and population distribution across credit score bands....</td>\n", + " <td id=\"T_56dd5_row21_col3\" class=\"data row21 col3\" >False</td>\n", + " <td id=\"T_56dd5_row21_col4\" class=\"data row21 col4\" >True</td>\n", + " <td id=\"T_56dd5_row21_col5\" class=\"data row21 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row21_col6\" class=\"data row21 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': 
None}}</td>\n", + " <td id=\"T_56dd5_row21_col7\" class=\"data row21 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", + " <td id=\"T_56dd5_row21_col8\" class=\"data row21 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row22_col0\" class=\"data row22 col0\" >validmind.data_validation.ShapiroWilk</td>\n", + " <td id=\"T_56dd5_row22_col1\" class=\"data row22 col1\" >Shapiro Wilk</td>\n", + " <td id=\"T_56dd5_row22_col2\" class=\"data row22 col2\" >Evaluates feature-wise normality of training data using the Shapiro-Wilk test....</td>\n", + " <td id=\"T_56dd5_row22_col3\" class=\"data row22 col3\" >False</td>\n", + " <td id=\"T_56dd5_row22_col4\" class=\"data row22 col4\" >True</td>\n", + " <td id=\"T_56dd5_row22_col5\" class=\"data row22 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row22_col6\" class=\"data row22 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row22_col7\" class=\"data row22 col7\" >['tabular_data', 'data_distribution', 'statistical_test']</td>\n", + " <td id=\"T_56dd5_row22_col8\" class=\"data row22 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row23_col0\" class=\"data row23 col0\" >validmind.data_validation.Skewness</td>\n", + " <td id=\"T_56dd5_row23_col1\" class=\"data row23 col1\" >Skewness</td>\n", + " <td id=\"T_56dd5_row23_col2\" class=\"data row23 col2\" >Evaluates the skewness of numerical data in a dataset to check against a defined threshold, aiming to ensure data...</td>\n", + " <td id=\"T_56dd5_row23_col3\" class=\"data row23 col3\" >False</td>\n", + " <td id=\"T_56dd5_row23_col4\" class=\"data row23 col4\" >True</td>\n", + " <td id=\"T_56dd5_row23_col5\" class=\"data row23 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row23_col6\" class=\"data row23 col6\" >{'max_threshold': {'type': '_empty', 'default': 1}}</td>\n", + " <td id=\"T_56dd5_row23_col7\" class=\"data row23 col7\" >['data_quality', 'tabular_data']</td>\n", + " <td 
id=\"T_56dd5_row23_col8\" class=\"data row23 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row24_col0\" class=\"data row24 col0\" >validmind.data_validation.TabularCategoricalBarPlots</td>\n", + " <td id=\"T_56dd5_row24_col1\" class=\"data row24 col1\" >Tabular Categorical Bar Plots</td>\n", + " <td id=\"T_56dd5_row24_col2\" class=\"data row24 col2\" >Generates and visualizes bar plots for each category in categorical features to evaluate the dataset's composition....</td>\n", + " <td id=\"T_56dd5_row24_col3\" class=\"data row24 col3\" >True</td>\n", + " <td id=\"T_56dd5_row24_col4\" class=\"data row24 col4\" >False</td>\n", + " <td id=\"T_56dd5_row24_col5\" class=\"data row24 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row24_col6\" class=\"data row24 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row24_col7\" class=\"data row24 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_56dd5_row24_col8\" class=\"data row24 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row25_col0\" class=\"data row25 col0\" >validmind.data_validation.TabularDateTimeHistograms</td>\n", + " <td id=\"T_56dd5_row25_col1\" class=\"data row25 col1\" >Tabular Date Time Histograms</td>\n", + " <td id=\"T_56dd5_row25_col2\" class=\"data row25 col2\" >Generates histograms to provide graphical insight into the distribution of time intervals in a model's datetime...</td>\n", + " <td id=\"T_56dd5_row25_col3\" class=\"data row25 col3\" >True</td>\n", + " <td id=\"T_56dd5_row25_col4\" class=\"data row25 col4\" >False</td>\n", + " <td id=\"T_56dd5_row25_col5\" class=\"data row25 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row25_col6\" class=\"data row25 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row25_col7\" class=\"data row25 col7\" >['time_series_data', 'visualization']</td>\n", + " <td id=\"T_56dd5_row25_col8\" class=\"data row25 col8\" >['classification', 'regression']</td>\n", + 
" </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row26_col0\" class=\"data row26 col0\" >validmind.data_validation.TabularDescriptionTables</td>\n", + " <td id=\"T_56dd5_row26_col1\" class=\"data row26 col1\" >Tabular Description Tables</td>\n", + " <td id=\"T_56dd5_row26_col2\" class=\"data row26 col2\" >Summarizes key descriptive statistics for numerical, categorical, and datetime variables in a dataset....</td>\n", + " <td id=\"T_56dd5_row26_col3\" class=\"data row26 col3\" >False</td>\n", + " <td id=\"T_56dd5_row26_col4\" class=\"data row26 col4\" >True</td>\n", + " <td id=\"T_56dd5_row26_col5\" class=\"data row26 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row26_col6\" class=\"data row26 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row26_col7\" class=\"data row26 col7\" >['tabular_data']</td>\n", + " <td id=\"T_56dd5_row26_col8\" class=\"data row26 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row27_col0\" class=\"data row27 col0\" >validmind.data_validation.TabularNumericalHistograms</td>\n", + " <td id=\"T_56dd5_row27_col1\" class=\"data row27 col1\" >Tabular Numerical Histograms</td>\n", + " <td id=\"T_56dd5_row27_col2\" class=\"data row27 col2\" >Generates histograms for each numerical feature in a dataset to provide visual insights into data distribution and...</td>\n", + " <td id=\"T_56dd5_row27_col3\" class=\"data row27 col3\" >True</td>\n", + " <td id=\"T_56dd5_row27_col4\" class=\"data row27 col4\" >False</td>\n", + " <td id=\"T_56dd5_row27_col5\" class=\"data row27 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row27_col6\" class=\"data row27 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row27_col7\" class=\"data row27 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_56dd5_row27_col8\" class=\"data row27 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row28_col0\" class=\"data row28 col0\" 
>validmind.data_validation.TargetRateBarPlots</td>\n", + " <td id=\"T_56dd5_row28_col1\" class=\"data row28 col1\" >Target Rate Bar Plots</td>\n", + " <td id=\"T_56dd5_row28_col2\" class=\"data row28 col2\" >Generates bar plots visualizing the default rates of categorical features for a classification machine learning...</td>\n", + " <td id=\"T_56dd5_row28_col3\" class=\"data row28 col3\" >True</td>\n", + " <td id=\"T_56dd5_row28_col4\" class=\"data row28 col4\" >False</td>\n", + " <td id=\"T_56dd5_row28_col5\" class=\"data row28 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row28_col6\" class=\"data row28 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row28_col7\" class=\"data row28 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", + " <td id=\"T_56dd5_row28_col8\" class=\"data row28 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row29_col0\" class=\"data row29 col0\" >validmind.data_validation.TooManyZeroValues</td>\n", + " <td id=\"T_56dd5_row29_col1\" class=\"data row29 col1\" >Too Many Zero Values</td>\n", + " <td id=\"T_56dd5_row29_col2\" class=\"data row29 col2\" >Identifies numerical columns in a dataset that contain an excessive number of zero values, defined by a threshold...</td>\n", + " <td id=\"T_56dd5_row29_col3\" class=\"data row29 col3\" >False</td>\n", + " <td id=\"T_56dd5_row29_col4\" class=\"data row29 col4\" >True</td>\n", + " <td id=\"T_56dd5_row29_col5\" class=\"data row29 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row29_col6\" class=\"data row29 col6\" >{'max_percent_threshold': {'type': 'float', 'default': 0.03}}</td>\n", + " <td id=\"T_56dd5_row29_col7\" class=\"data row29 col7\" >['tabular_data']</td>\n", + " <td id=\"T_56dd5_row29_col8\" class=\"data row29 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row30_col0\" class=\"data row30 col0\" >validmind.data_validation.UniqueRows</td>\n", + " <td id=\"T_56dd5_row30_col1\" 
class=\"data row30 col1\" >Unique Rows</td>\n", + " <td id=\"T_56dd5_row30_col2\" class=\"data row30 col2\" >Verifies the diversity of the dataset by ensuring that the count of unique rows exceeds a prescribed threshold....</td>\n", + " <td id=\"T_56dd5_row30_col3\" class=\"data row30 col3\" >False</td>\n", + " <td id=\"T_56dd5_row30_col4\" class=\"data row30 col4\" >True</td>\n", + " <td id=\"T_56dd5_row30_col5\" class=\"data row30 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row30_col6\" class=\"data row30 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 1}}</td>\n", + " <td id=\"T_56dd5_row30_col7\" class=\"data row30 col7\" >['tabular_data']</td>\n", + " <td id=\"T_56dd5_row30_col8\" class=\"data row30 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row31_col0\" class=\"data row31 col0\" >validmind.data_validation.WOEBinPlots</td>\n", + " <td id=\"T_56dd5_row31_col1\" class=\"data row31 col1\" >WOE Bin Plots</td>\n", + " <td id=\"T_56dd5_row31_col2\" class=\"data row31 col2\" >Generates visualizations of Weight of Evidence (WoE) and Information Value (IV) for understanding predictive power...</td>\n", + " <td id=\"T_56dd5_row31_col3\" class=\"data row31 col3\" >True</td>\n", + " <td id=\"T_56dd5_row31_col4\" class=\"data row31 col4\" >False</td>\n", + " <td id=\"T_56dd5_row31_col5\" class=\"data row31 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row31_col6\" class=\"data row31 col6\" >{'breaks_adj': {'type': 'list', 'default': None}, 'fig_height': {'type': 'int', 'default': 600}, 'fig_width': {'type': 'int', 'default': 500}}</td>\n", + " <td id=\"T_56dd5_row31_col7\" class=\"data row31 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", + " <td id=\"T_56dd5_row31_col8\" class=\"data row31 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row32_col0\" class=\"data row32 col0\" >validmind.data_validation.WOEBinTable</td>\n", + " <td 
id=\"T_56dd5_row32_col1\" class=\"data row32 col1\" >WOE Bin Table</td>\n", + " <td id=\"T_56dd5_row32_col2\" class=\"data row32 col2\" >Assesses the Weight of Evidence (WoE) and Information Value (IV) of each feature to evaluate its predictive power...</td>\n", + " <td id=\"T_56dd5_row32_col3\" class=\"data row32 col3\" >False</td>\n", + " <td id=\"T_56dd5_row32_col4\" class=\"data row32 col4\" >True</td>\n", + " <td id=\"T_56dd5_row32_col5\" class=\"data row32 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row32_col6\" class=\"data row32 col6\" >{'breaks_adj': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_56dd5_row32_col7\" class=\"data row32 col7\" >['tabular_data', 'categorical_data']</td>\n", + " <td id=\"T_56dd5_row32_col8\" class=\"data row32 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row33_col0\" class=\"data row33 col0\" >validmind.model_validation.FeaturesAUC</td>\n", + " <td id=\"T_56dd5_row33_col1\" class=\"data row33 col1\" >Features AUC</td>\n", + " <td id=\"T_56dd5_row33_col2\" class=\"data row33 col2\" >Evaluates the discriminatory power of each individual feature within a binary classification model by calculating...</td>\n", + " <td id=\"T_56dd5_row33_col3\" class=\"data row33 col3\" >True</td>\n", + " <td id=\"T_56dd5_row33_col4\" class=\"data row33 col4\" >False</td>\n", + " <td id=\"T_56dd5_row33_col5\" class=\"data row33 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row33_col6\" class=\"data row33 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", + " <td id=\"T_56dd5_row33_col7\" class=\"data row33 col7\" >['feature_importance', 'AUC', 'visualization']</td>\n", + " <td id=\"T_56dd5_row33_col8\" class=\"data row33 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row34_col0\" class=\"data row34 col0\" >validmind.model_validation.sklearn.CalibrationCurve</td>\n", + " <td 
id=\"T_56dd5_row34_col1\" class=\"data row34 col1\" >Calibration Curve</td>\n", + " <td id=\"T_56dd5_row34_col2\" class=\"data row34 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", + " <td id=\"T_56dd5_row34_col3\" class=\"data row34 col3\" >True</td>\n", + " <td id=\"T_56dd5_row34_col4\" class=\"data row34 col4\" >False</td>\n", + " <td id=\"T_56dd5_row34_col5\" class=\"data row34 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row34_col6\" class=\"data row34 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_56dd5_row34_col7\" class=\"data row34 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", + " <td id=\"T_56dd5_row34_col8\" class=\"data row34 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row35_col0\" class=\"data row35 col0\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", + " <td id=\"T_56dd5_row35_col1\" class=\"data row35 col1\" >Classifier Performance</td>\n", + " <td id=\"T_56dd5_row35_col2\" class=\"data row35 col2\" >Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", + " <td id=\"T_56dd5_row35_col3\" class=\"data row35 col3\" >False</td>\n", + " <td id=\"T_56dd5_row35_col4\" class=\"data row35 col4\" >True</td>\n", + " <td id=\"T_56dd5_row35_col5\" class=\"data row35 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row35_col6\" class=\"data row35 col6\" >{'average': {'type': 'str', 'default': 'macro'}}</td>\n", + " <td id=\"T_56dd5_row35_col7\" class=\"data row35 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row35_col8\" class=\"data row35 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row36_col0\" class=\"data row36 col0\" 
>validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", + " <td id=\"T_56dd5_row36_col1\" class=\"data row36 col1\" >Classifier Threshold Optimization</td>\n", + " <td id=\"T_56dd5_row36_col2\" class=\"data row36 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification models....</td>\n", + " <td id=\"T_56dd5_row36_col3\" class=\"data row36 col3\" >False</td>\n", + " <td id=\"T_56dd5_row36_col4\" class=\"data row36 col4\" >True</td>\n", + " <td id=\"T_56dd5_row36_col5\" class=\"data row36 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row36_col6\" class=\"data row36 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_56dd5_row36_col7\" class=\"data row36 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", + " <td id=\"T_56dd5_row36_col8\" class=\"data row36 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row37_col0\" class=\"data row37 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_56dd5_row37_col1\" class=\"data row37 col1\" >Confusion Matrix</td>\n", + " <td id=\"T_56dd5_row37_col2\" class=\"data row37 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", + " <td id=\"T_56dd5_row37_col3\" class=\"data row37 col3\" >True</td>\n", + " <td id=\"T_56dd5_row37_col4\" class=\"data row37 col4\" >False</td>\n", + " <td id=\"T_56dd5_row37_col5\" class=\"data row37 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row37_col6\" class=\"data row37 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_56dd5_row37_col7\" class=\"data row37 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row37_col8\" class=\"data row37 
col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row38_col0\" class=\"data row38 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", + " <td id=\"T_56dd5_row38_col1\" class=\"data row38 col1\" >Hyper Parameters Tuning</td>\n", + " <td id=\"T_56dd5_row38_col2\" class=\"data row38 col2\" >Performs exhaustive grid search over specified parameter ranges to find optimal model configurations...</td>\n", + " <td id=\"T_56dd5_row38_col3\" class=\"data row38 col3\" >False</td>\n", + " <td id=\"T_56dd5_row38_col4\" class=\"data row38 col4\" >True</td>\n", + " <td id=\"T_56dd5_row38_col5\" class=\"data row38 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row38_col6\" class=\"data row38 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", + " <td id=\"T_56dd5_row38_col7\" class=\"data row38 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row38_col8\" class=\"data row38 col8\" >['clustering', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row39_col0\" class=\"data row39 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", + " <td id=\"T_56dd5_row39_col1\" class=\"data row39 col1\" >Minimum Accuracy</td>\n", + " <td id=\"T_56dd5_row39_col2\" class=\"data row39 col2\" >Checks if the model's prediction accuracy meets or surpasses a specified threshold....</td>\n", + " <td id=\"T_56dd5_row39_col3\" class=\"data row39 col3\" >False</td>\n", + " <td id=\"T_56dd5_row39_col4\" class=\"data row39 col4\" >True</td>\n", + " <td id=\"T_56dd5_row39_col5\" class=\"data row39 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row39_col6\" class=\"data row39 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_56dd5_row39_col7\" class=\"data row39 
col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row39_col8\" class=\"data row39 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row40_col0\" class=\"data row40 col0\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", + " <td id=\"T_56dd5_row40_col1\" class=\"data row40 col1\" >Minimum F1 Score</td>\n", + " <td id=\"T_56dd5_row40_col2\" class=\"data row40 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", + " <td id=\"T_56dd5_row40_col3\" class=\"data row40 col3\" >False</td>\n", + " <td id=\"T_56dd5_row40_col4\" class=\"data row40 col4\" >True</td>\n", + " <td id=\"T_56dd5_row40_col5\" class=\"data row40 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row40_col6\" class=\"data row40 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_56dd5_row40_col7\" class=\"data row40 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row40_col8\" class=\"data row40 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row41_col0\" class=\"data row41 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", + " <td id=\"T_56dd5_row41_col1\" class=\"data row41 col1\" >Minimum ROCAUC Score</td>\n", + " <td id=\"T_56dd5_row41_col2\" class=\"data row41 col2\" >Validates model by checking if the ROC AUC score meets or surpasses a specified threshold....</td>\n", + " <td id=\"T_56dd5_row41_col3\" class=\"data row41 col3\" >False</td>\n", + " <td id=\"T_56dd5_row41_col4\" class=\"data row41 col4\" >True</td>\n", + " <td id=\"T_56dd5_row41_col5\" class=\"data row41 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row41_col6\" class=\"data row41 col6\" >{'min_threshold': 
{'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_56dd5_row41_col7\" class=\"data row41 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row41_col8\" class=\"data row41 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row42_col0\" class=\"data row42 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", + " <td id=\"T_56dd5_row42_col1\" class=\"data row42 col1\" >Model Parameters</td>\n", + " <td id=\"T_56dd5_row42_col2\" class=\"data row42 col2\" >Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", + " <td id=\"T_56dd5_row42_col3\" class=\"data row42 col3\" >False</td>\n", + " <td id=\"T_56dd5_row42_col4\" class=\"data row42 col4\" >True</td>\n", + " <td id=\"T_56dd5_row42_col5\" class=\"data row42 col5\" >['model']</td>\n", + " <td id=\"T_56dd5_row42_col6\" class=\"data row42 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_56dd5_row42_col7\" class=\"data row42 col7\" >['model_training', 'metadata']</td>\n", + " <td id=\"T_56dd5_row42_col8\" class=\"data row42 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row43_col0\" class=\"data row43 col0\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", + " <td id=\"T_56dd5_row43_col1\" class=\"data row43 col1\" >Models Performance Comparison</td>\n", + " <td id=\"T_56dd5_row43_col2\" class=\"data row43 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", + " <td id=\"T_56dd5_row43_col3\" class=\"data row43 col3\" >False</td>\n", + " <td id=\"T_56dd5_row43_col4\" class=\"data row43 col4\" >True</td>\n", + " <td id=\"T_56dd5_row43_col5\" class=\"data row43 col5\" >['dataset', 'models']</td>\n", + " <td id=\"T_56dd5_row43_col6\" 
class=\"data row43 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row43_col7\" class=\"data row43 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", + " <td id=\"T_56dd5_row43_col8\" class=\"data row43 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row44_col0\" class=\"data row44 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", + " <td id=\"T_56dd5_row44_col1\" class=\"data row44 col1\" >Overfit Diagnosis</td>\n", + " <td id=\"T_56dd5_row44_col2\" class=\"data row44 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", + " <td id=\"T_56dd5_row44_col3\" class=\"data row44 col3\" >True</td>\n", + " <td id=\"T_56dd5_row44_col4\" class=\"data row44 col4\" >True</td>\n", + " <td id=\"T_56dd5_row44_col5\" class=\"data row44 col5\" >['model', 'datasets']</td>\n", + " <td id=\"T_56dd5_row44_col6\" class=\"data row44 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': {'type': 'float', 'default': 0.04}}</td>\n", + " <td id=\"T_56dd5_row44_col7\" class=\"data row44 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", + " <td id=\"T_56dd5_row44_col8\" class=\"data row44 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row45_col0\" class=\"data row45 col0\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", + " <td id=\"T_56dd5_row45_col1\" class=\"data row45 col1\" >Permutation Feature Importance</td>\n", + " <td id=\"T_56dd5_row45_col2\" class=\"data row45 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", + " <td id=\"T_56dd5_row45_col3\" class=\"data row45 col3\" >True</td>\n", + " <td 
id=\"T_56dd5_row45_col4\" class=\"data row45 col4\" >False</td>\n", + " <td id=\"T_56dd5_row45_col5\" class=\"data row45 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row45_col6\" class=\"data row45 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_56dd5_row45_col7\" class=\"data row45 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row45_col8\" class=\"data row45 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row46_col0\" class=\"data row46 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", + " <td id=\"T_56dd5_row46_col1\" class=\"data row46 col1\" >Population Stability Index</td>\n", + " <td id=\"T_56dd5_row46_col2\" class=\"data row46 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's predictions across...</td>\n", + " <td id=\"T_56dd5_row46_col3\" class=\"data row46 col3\" >True</td>\n", + " <td id=\"T_56dd5_row46_col4\" class=\"data row46 col4\" >True</td>\n", + " <td id=\"T_56dd5_row46_col5\" class=\"data row46 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row46_col6\" class=\"data row46 col6\" >{'num_bins': {'type': 'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", + " <td id=\"T_56dd5_row46_col7\" class=\"data row46 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row46_col8\" class=\"data row46 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row47_col0\" class=\"data row47 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_56dd5_row47_col1\" class=\"data row47 col1\" >Precision Recall Curve</td>\n", + " <td id=\"T_56dd5_row47_col2\" 
class=\"data row47 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", + " <td id=\"T_56dd5_row47_col3\" class=\"data row47 col3\" >True</td>\n", + " <td id=\"T_56dd5_row47_col4\" class=\"data row47 col4\" >False</td>\n", + " <td id=\"T_56dd5_row47_col5\" class=\"data row47 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row47_col6\" class=\"data row47 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row47_col7\" class=\"data row47 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row47_col8\" class=\"data row47 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row48_col0\" class=\"data row48 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", + " <td id=\"T_56dd5_row48_col1\" class=\"data row48 col1\" >ROC Curve</td>\n", + " <td id=\"T_56dd5_row48_col2\" class=\"data row48 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", + " <td id=\"T_56dd5_row48_col3\" class=\"data row48 col3\" >True</td>\n", + " <td id=\"T_56dd5_row48_col4\" class=\"data row48 col4\" >False</td>\n", + " <td id=\"T_56dd5_row48_col5\" class=\"data row48 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row48_col6\" class=\"data row48 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row48_col7\" class=\"data row48 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row48_col8\" class=\"data row48 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row49_col0\" class=\"data row49 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", + " <td id=\"T_56dd5_row49_col1\" class=\"data row49 col1\" >Regression Errors</td>\n", + " <td 
id=\"T_56dd5_row49_col2\" class=\"data row49 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", + " <td id=\"T_56dd5_row49_col3\" class=\"data row49 col3\" >False</td>\n", + " <td id=\"T_56dd5_row49_col4\" class=\"data row49 col4\" >True</td>\n", + " <td id=\"T_56dd5_row49_col5\" class=\"data row49 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row49_col6\" class=\"data row49 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row49_col7\" class=\"data row49 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row49_col8\" class=\"data row49 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row50_col0\" class=\"data row50 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " <td id=\"T_56dd5_row50_col1\" class=\"data row50 col1\" >Robustness Diagnosis</td>\n", + " <td id=\"T_56dd5_row50_col2\" class=\"data row50 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", + " <td id=\"T_56dd5_row50_col3\" class=\"data row50 col3\" >True</td>\n", + " <td id=\"T_56dd5_row50_col4\" class=\"data row50 col4\" >True</td>\n", + " <td id=\"T_56dd5_row50_col5\" class=\"data row50 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row50_col6\" class=\"data row50 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_56dd5_row50_col7\" class=\"data row50 col7\" >['sklearn', 'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_56dd5_row50_col8\" class=\"data row50 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row51_col0\" class=\"data row51 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", + " <td 
id=\"T_56dd5_row51_col1\" class=\"data row51 col1\" >SHAP Global Importance</td>\n", + " <td id=\"T_56dd5_row51_col2\" class=\"data row51 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", + " <td id=\"T_56dd5_row51_col3\" class=\"data row51 col3\" >False</td>\n", + " <td id=\"T_56dd5_row51_col4\" class=\"data row51 col4\" >True</td>\n", + " <td id=\"T_56dd5_row51_col5\" class=\"data row51 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row51_col6\" class=\"data row51 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_56dd5_row51_col7\" class=\"data row51 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row51_col8\" class=\"data row51 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row52_col0\" class=\"data row52 col0\" >validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", + " <td id=\"T_56dd5_row52_col1\" class=\"data row52 col1\" >Score Probability Alignment</td>\n", + " <td id=\"T_56dd5_row52_col2\" class=\"data row52 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", + " <td id=\"T_56dd5_row52_col3\" class=\"data row52 col3\" >True</td>\n", + " <td id=\"T_56dd5_row52_col4\" class=\"data row52 col4\" >True</td>\n", + " <td id=\"T_56dd5_row52_col5\" class=\"data row52 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row52_col6\" class=\"data row52 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_56dd5_row52_col7\" class=\"data row52 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", + " <td 
id=\"T_56dd5_row52_col8\" class=\"data row52 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row53_col0\" class=\"data row53 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_56dd5_row53_col1\" class=\"data row53 col1\" >Training Test Degradation</td>\n", + " <td id=\"T_56dd5_row53_col2\" class=\"data row53 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", + " <td id=\"T_56dd5_row53_col3\" class=\"data row53 col3\" >False</td>\n", + " <td id=\"T_56dd5_row53_col4\" class=\"data row53 col4\" >True</td>\n", + " <td id=\"T_56dd5_row53_col5\" class=\"data row53 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row53_col6\" class=\"data row53 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_56dd5_row53_col7\" class=\"data row53 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row53_col8\" class=\"data row53 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row54_col0\" class=\"data row54 col0\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", + " <td id=\"T_56dd5_row54_col1\" class=\"data row54 col1\" >Weakspots Diagnosis</td>\n", + " <td id=\"T_56dd5_row54_col2\" class=\"data row54 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", + " <td id=\"T_56dd5_row54_col3\" class=\"data row54 col3\" >True</td>\n", + " <td id=\"T_56dd5_row54_col4\" class=\"data row54 col4\" >True</td>\n", + " <td id=\"T_56dd5_row54_col5\" class=\"data row54 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row54_col6\" class=\"data row54 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 
'thresholds': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_56dd5_row54_col7\" class=\"data row54 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_56dd5_row54_col8\" class=\"data row54 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row55_col0\" class=\"data row55 col0\" >validmind.model_validation.statsmodels.CumulativePredictionProbabilities</td>\n", + " <td id=\"T_56dd5_row55_col1\" class=\"data row55 col1\" >Cumulative Prediction Probabilities</td>\n", + " <td id=\"T_56dd5_row55_col2\" class=\"data row55 col2\" >Visualizes cumulative probabilities of positive and negative classes for both training and testing in classification models....</td>\n", + " <td id=\"T_56dd5_row55_col3\" class=\"data row55 col3\" >True</td>\n", + " <td id=\"T_56dd5_row55_col4\" class=\"data row55 col4\" >False</td>\n", + " <td id=\"T_56dd5_row55_col5\" class=\"data row55 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row55_col6\" class=\"data row55 col6\" >{'title': {'type': 'str', 'default': 'Cumulative Probabilities'}}</td>\n", + " <td id=\"T_56dd5_row55_col7\" class=\"data row55 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_56dd5_row55_col8\" class=\"data row55 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row56_col0\" class=\"data row56 col0\" >validmind.model_validation.statsmodels.GINITable</td>\n", + " <td id=\"T_56dd5_row56_col1\" class=\"data row56 col1\" >GINI Table</td>\n", + " <td id=\"T_56dd5_row56_col2\" class=\"data row56 col2\" >Evaluates classification model performance using AUC, GINI, and KS metrics for training and test datasets....</td>\n", + " <td id=\"T_56dd5_row56_col3\" class=\"data row56 col3\" >False</td>\n", + " <td id=\"T_56dd5_row56_col4\" class=\"data row56 col4\" >True</td>\n", + " <td id=\"T_56dd5_row56_col5\" class=\"data row56 
col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row56_col6\" class=\"data row56 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row56_col7\" class=\"data row56 col7\" >['model_performance']</td>\n", + " <td id=\"T_56dd5_row56_col8\" class=\"data row56 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row57_col0\" class=\"data row57 col0\" >validmind.model_validation.statsmodels.KolmogorovSmirnov</td>\n", + " <td id=\"T_56dd5_row57_col1\" class=\"data row57 col1\" >Kolmogorov Smirnov</td>\n", + " <td id=\"T_56dd5_row57_col2\" class=\"data row57 col2\" >Assesses whether each feature in the dataset aligns with a normal distribution using the Kolmogorov-Smirnov test....</td>\n", + " <td id=\"T_56dd5_row57_col3\" class=\"data row57 col3\" >False</td>\n", + " <td id=\"T_56dd5_row57_col4\" class=\"data row57 col4\" >True</td>\n", + " <td id=\"T_56dd5_row57_col5\" class=\"data row57 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row57_col6\" class=\"data row57 col6\" >{'dist': {'type': 'str', 'default': 'norm'}}</td>\n", + " <td id=\"T_56dd5_row57_col7\" class=\"data row57 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_56dd5_row57_col8\" class=\"data row57 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row58_col0\" class=\"data row58 col0\" >validmind.model_validation.statsmodels.Lilliefors</td>\n", + " <td id=\"T_56dd5_row58_col1\" class=\"data row58 col1\" >Lilliefors</td>\n", + " <td id=\"T_56dd5_row58_col2\" class=\"data row58 col2\" >Assesses the normality of feature distributions in an ML model's training dataset using the Lilliefors test....</td>\n", + " <td id=\"T_56dd5_row58_col3\" class=\"data row58 col3\" >False</td>\n", + " <td id=\"T_56dd5_row58_col4\" class=\"data row58 col4\" >True</td>\n", + " <td id=\"T_56dd5_row58_col5\" class=\"data row58 col5\" >['dataset']</td>\n", + " <td 
id=\"T_56dd5_row58_col6\" class=\"data row58 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row58_col7\" class=\"data row58 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_56dd5_row58_col8\" class=\"data row58 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row59_col0\" class=\"data row59 col0\" >validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram</td>\n", + " <td id=\"T_56dd5_row59_col1\" class=\"data row59 col1\" >Prediction Probabilities Histogram</td>\n", + " <td id=\"T_56dd5_row59_col2\" class=\"data row59 col2\" >Assesses the predictive probability distribution for binary classification to evaluate model performance and...</td>\n", + " <td id=\"T_56dd5_row59_col3\" class=\"data row59 col3\" >True</td>\n", + " <td id=\"T_56dd5_row59_col4\" class=\"data row59 col4\" >False</td>\n", + " <td id=\"T_56dd5_row59_col5\" class=\"data row59 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row59_col6\" class=\"data row59 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Predictive Probabilities'}}</td>\n", + " <td id=\"T_56dd5_row59_col7\" class=\"data row59 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_56dd5_row59_col8\" class=\"data row59 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row60_col0\" class=\"data row60 col0\" >validmind.model_validation.statsmodels.ScorecardHistogram</td>\n", + " <td id=\"T_56dd5_row60_col1\" class=\"data row60 col1\" >Scorecard Histogram</td>\n", + " <td id=\"T_56dd5_row60_col2\" class=\"data row60 col2\" >The Scorecard Histogram test evaluates the distribution of credit scores between default and non-default instances,...</td>\n", + " <td id=\"T_56dd5_row60_col3\" class=\"data row60 col3\" >True</td>\n", + " <td id=\"T_56dd5_row60_col4\" class=\"data row60 col4\" >False</td>\n", + " <td id=\"T_56dd5_row60_col5\" class=\"data row60 col5\" 
>['dataset']</td>\n", + " <td id=\"T_56dd5_row60_col6\" class=\"data row60 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Scores'}, 'score_column': {'type': 'str', 'default': 'score'}}</td>\n", + " <td id=\"T_56dd5_row60_col7\" class=\"data row60 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", + " <td id=\"T_56dd5_row60_col8\" class=\"data row60 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row61_col0\" class=\"data row61 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", + " <td id=\"T_56dd5_row61_col1\" class=\"data row61 col1\" >Calibration Curve Drift</td>\n", + " <td id=\"T_56dd5_row61_col2\" class=\"data row61 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row61_col3\" class=\"data row61 col3\" >True</td>\n", + " <td id=\"T_56dd5_row61_col4\" class=\"data row61 col4\" >True</td>\n", + " <td id=\"T_56dd5_row61_col5\" class=\"data row61 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row61_col6\" class=\"data row61 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_56dd5_row61_col7\" class=\"data row61 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row61_col8\" class=\"data row61 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row62_col0\" class=\"data row62 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", + " <td id=\"T_56dd5_row62_col1\" class=\"data row62 col1\" >Class Discrimination Drift</td>\n", + " <td id=\"T_56dd5_row62_col2\" class=\"data row62 col2\" >Compares classification discrimination metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row62_col3\" class=\"data row62 col3\" >False</td>\n", + " <td 
id=\"T_56dd5_row62_col4\" class=\"data row62 col4\" >True</td>\n", + " <td id=\"T_56dd5_row62_col5\" class=\"data row62 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row62_col6\" class=\"data row62 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_56dd5_row62_col7\" class=\"data row62 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row62_col8\" class=\"data row62 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row63_col0\" class=\"data row63 col0\" >validmind.ongoing_monitoring.ClassImbalanceDrift</td>\n", + " <td id=\"T_56dd5_row63_col1\" class=\"data row63 col1\" >Class Imbalance Drift</td>\n", + " <td id=\"T_56dd5_row63_col2\" class=\"data row63 col2\" >Evaluates drift in class distribution between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row63_col3\" class=\"data row63 col3\" >True</td>\n", + " <td id=\"T_56dd5_row63_col4\" class=\"data row63 col4\" >True</td>\n", + " <td id=\"T_56dd5_row63_col5\" class=\"data row63 col5\" >['datasets']</td>\n", + " <td id=\"T_56dd5_row63_col6\" class=\"data row63 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 5.0}, 'title': {'type': 'str', 'default': 'Class Distribution Drift'}}</td>\n", + " <td id=\"T_56dd5_row63_col7\" class=\"data row63 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification']</td>\n", + " <td id=\"T_56dd5_row63_col8\" class=\"data row63 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row64_col0\" class=\"data row64 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", + " <td id=\"T_56dd5_row64_col1\" class=\"data row64 col1\" >Classification Accuracy Drift</td>\n", + " <td id=\"T_56dd5_row64_col2\" class=\"data row64 col2\" >Compares classification accuracy metrics between reference and monitoring 
datasets....</td>\n", + " <td id=\"T_56dd5_row64_col3\" class=\"data row64 col3\" >False</td>\n", + " <td id=\"T_56dd5_row64_col4\" class=\"data row64 col4\" >True</td>\n", + " <td id=\"T_56dd5_row64_col5\" class=\"data row64 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row64_col6\" class=\"data row64 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_56dd5_row64_col7\" class=\"data row64 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row64_col8\" class=\"data row64 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row65_col0\" class=\"data row65 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", + " <td id=\"T_56dd5_row65_col1\" class=\"data row65 col1\" >Confusion Matrix Drift</td>\n", + " <td id=\"T_56dd5_row65_col2\" class=\"data row65 col2\" >Compares confusion matrix metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row65_col3\" class=\"data row65 col3\" >False</td>\n", + " <td id=\"T_56dd5_row65_col4\" class=\"data row65 col4\" >True</td>\n", + " <td id=\"T_56dd5_row65_col5\" class=\"data row65 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row65_col6\" class=\"data row65 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_56dd5_row65_col7\" class=\"data row65 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row65_col8\" class=\"data row65 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row66_col0\" class=\"data row66 col0\" >validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift</td>\n", + " <td id=\"T_56dd5_row66_col1\" class=\"data row66 col1\" >Cumulative Prediction Probabilities Drift</td>\n", + " <td 
id=\"T_56dd5_row66_col2\" class=\"data row66 col2\" >Compares cumulative prediction probability distributions between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row66_col3\" class=\"data row66 col3\" >True</td>\n", + " <td id=\"T_56dd5_row66_col4\" class=\"data row66 col4\" >False</td>\n", + " <td id=\"T_56dd5_row66_col5\" class=\"data row66 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row66_col6\" class=\"data row66 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row66_col7\" class=\"data row66 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_56dd5_row66_col8\" class=\"data row66 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row67_col0\" class=\"data row67 col0\" >validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift</td>\n", + " <td id=\"T_56dd5_row67_col1\" class=\"data row67 col1\" >Prediction Probabilities Histogram Drift</td>\n", + " <td id=\"T_56dd5_row67_col2\" class=\"data row67 col2\" >Compares prediction probability distributions between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row67_col3\" class=\"data row67 col3\" >True</td>\n", + " <td id=\"T_56dd5_row67_col4\" class=\"data row67 col4\" >True</td>\n", + " <td id=\"T_56dd5_row67_col5\" class=\"data row67 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row67_col6\" class=\"data row67 col6\" >{'title': {'type': '_empty', 'default': 'Prediction Probabilities Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_56dd5_row67_col7\" class=\"data row67 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_56dd5_row67_col8\" class=\"data row67 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row68_col0\" class=\"data row68 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", + " <td id=\"T_56dd5_row68_col1\" class=\"data row68 col1\" >ROC Curve Drift</td>\n", + " <td 
id=\"T_56dd5_row68_col2\" class=\"data row68 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row68_col3\" class=\"data row68 col3\" >True</td>\n", + " <td id=\"T_56dd5_row68_col4\" class=\"data row68 col4\" >False</td>\n", + " <td id=\"T_56dd5_row68_col5\" class=\"data row68 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row68_col6\" class=\"data row68 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row68_col7\" class=\"data row68 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row68_col8\" class=\"data row68 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row69_col0\" class=\"data row69 col0\" >validmind.ongoing_monitoring.ScoreBandsDrift</td>\n", + " <td id=\"T_56dd5_row69_col1\" class=\"data row69 col1\" >Score Bands Drift</td>\n", + " <td id=\"T_56dd5_row69_col2\" class=\"data row69 col2\" >Analyzes drift in population distribution and default rates across score bands....</td>\n", + " <td id=\"T_56dd5_row69_col3\" class=\"data row69 col3\" >False</td>\n", + " <td id=\"T_56dd5_row69_col4\" class=\"data row69 col4\" >True</td>\n", + " <td id=\"T_56dd5_row69_col5\" class=\"data row69 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row69_col6\" class=\"data row69 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}, 'drift_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_56dd5_row69_col7\" class=\"data row69 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", + " <td id=\"T_56dd5_row69_col8\" class=\"data row69 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row70_col0\" class=\"data row70 col0\" >validmind.ongoing_monitoring.ScorecardHistogramDrift</td>\n", + " <td id=\"T_56dd5_row70_col1\" class=\"data row70 col1\" >Scorecard Histogram 
Drift</td>\n", + " <td id=\"T_56dd5_row70_col2\" class=\"data row70 col2\" >Compares score distributions between reference and monitoring datasets for each class....</td>\n", + " <td id=\"T_56dd5_row70_col3\" class=\"data row70 col3\" >True</td>\n", + " <td id=\"T_56dd5_row70_col4\" class=\"data row70 col4\" >True</td>\n", + " <td id=\"T_56dd5_row70_col5\" class=\"data row70 col5\" >['datasets']</td>\n", + " <td id=\"T_56dd5_row70_col6\" class=\"data row70 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'title': {'type': 'str', 'default': 'Scorecard Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_56dd5_row70_col7\" class=\"data row70 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", + " <td id=\"T_56dd5_row70_col8\" class=\"data row70 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row71_col0\" class=\"data row71 col0\" >validmind.unit_metrics.classification.Accuracy</td>\n", + " <td id=\"T_56dd5_row71_col1\" class=\"data row71 col1\" >Accuracy</td>\n", + " <td id=\"T_56dd5_row71_col2\" class=\"data row71 col2\" >Calculates the accuracy of a model</td>\n", + " <td id=\"T_56dd5_row71_col3\" class=\"data row71 col3\" >False</td>\n", + " <td id=\"T_56dd5_row71_col4\" class=\"data row71 col4\" >False</td>\n", + " <td id=\"T_56dd5_row71_col5\" class=\"data row71 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row71_col6\" class=\"data row71 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row71_col7\" class=\"data row71 col7\" >['classification']</td>\n", + " <td id=\"T_56dd5_row71_col8\" class=\"data row71 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row72_col0\" class=\"data row72 col0\" >validmind.unit_metrics.classification.F1</td>\n", + " <td id=\"T_56dd5_row72_col1\" class=\"data row72 col1\" >F1</td>\n", + " <td id=\"T_56dd5_row72_col2\" class=\"data row72 col2\" >Calculates the F1 score for a 
classification model.</td>\n", + " <td id=\"T_56dd5_row72_col3\" class=\"data row72 col3\" >False</td>\n", + " <td id=\"T_56dd5_row72_col4\" class=\"data row72 col4\" >False</td>\n", + " <td id=\"T_56dd5_row72_col5\" class=\"data row72 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row72_col6\" class=\"data row72 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row72_col7\" class=\"data row72 col7\" >['classification']</td>\n", + " <td id=\"T_56dd5_row72_col8\" class=\"data row72 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row73_col0\" class=\"data row73 col0\" >validmind.unit_metrics.classification.Precision</td>\n", + " <td id=\"T_56dd5_row73_col1\" class=\"data row73 col1\" >Precision</td>\n", + " <td id=\"T_56dd5_row73_col2\" class=\"data row73 col2\" >Calculates the precision for a classification model.</td>\n", + " <td id=\"T_56dd5_row73_col3\" class=\"data row73 col3\" >False</td>\n", + " <td id=\"T_56dd5_row73_col4\" class=\"data row73 col4\" >False</td>\n", + " <td id=\"T_56dd5_row73_col5\" class=\"data row73 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row73_col6\" class=\"data row73 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row73_col7\" class=\"data row73 col7\" >['classification']</td>\n", + " <td id=\"T_56dd5_row73_col8\" class=\"data row73 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row74_col0\" class=\"data row74 col0\" >validmind.unit_metrics.classification.ROC_AUC</td>\n", + " <td id=\"T_56dd5_row74_col1\" class=\"data row74 col1\" >ROC AUC</td>\n", + " <td id=\"T_56dd5_row74_col2\" class=\"data row74 col2\" >Calculates the ROC AUC for a classification model.</td>\n", + " <td id=\"T_56dd5_row74_col3\" class=\"data row74 col3\" >False</td>\n", + " <td id=\"T_56dd5_row74_col4\" class=\"data row74 col4\" >False</td>\n", + " <td id=\"T_56dd5_row74_col5\" class=\"data row74 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row74_col6\" class=\"data 
row74 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row74_col7\" class=\"data row74 col7\" >['classification']</td>\n", + " <td id=\"T_56dd5_row74_col8\" class=\"data row74 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row75_col0\" class=\"data row75 col0\" >validmind.unit_metrics.classification.Recall</td>\n", + " <td id=\"T_56dd5_row75_col1\" class=\"data row75 col1\" >Recall</td>\n", + " <td id=\"T_56dd5_row75_col2\" class=\"data row75 col2\" >Calculates the recall for a classification model.</td>\n", + " <td id=\"T_56dd5_row75_col3\" class=\"data row75 col3\" >False</td>\n", + " <td id=\"T_56dd5_row75_col4\" class=\"data row75 col4\" >False</td>\n", + " <td id=\"T_56dd5_row75_col5\" class=\"data row75 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row75_col6\" class=\"data row75 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row75_col7\" class=\"data row75 col7\" >['classification']</td>\n", + " <td id=\"T_56dd5_row75_col8\" class=\"data row75 col8\" >['classification']</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x10516c880>" + ] + } + } + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use the `tags` parameter to find tests based on their tags, such as `model_performance` or `visualization`:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "list_tests(tags=[\"model_performance\", \"visualization\"])" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_4d8bf th {\n", + " text-align: left;\n", + "}\n", + "#T_4d8bf_row0_col0, #T_4d8bf_row0_col1, #T_4d8bf_row0_col2, #T_4d8bf_row0_col3, #T_4d8bf_row0_col4, #T_4d8bf_row0_col5, #T_4d8bf_row0_col6, #T_4d8bf_row0_col7, #T_4d8bf_row0_col8, #T_4d8bf_row1_col0, #T_4d8bf_row1_col1, #T_4d8bf_row1_col2, #T_4d8bf_row1_col3, #T_4d8bf_row1_col4, #T_4d8bf_row1_col5, 
#T_4d8bf_row1_col6, #T_4d8bf_row1_col7, #T_4d8bf_row1_col8, #T_4d8bf_row2_col0, #T_4d8bf_row2_col1, #T_4d8bf_row2_col2, #T_4d8bf_row2_col3, #T_4d8bf_row2_col4, #T_4d8bf_row2_col5, #T_4d8bf_row2_col6, #T_4d8bf_row2_col7, #T_4d8bf_row2_col8, #T_4d8bf_row3_col0, #T_4d8bf_row3_col1, #T_4d8bf_row3_col2, #T_4d8bf_row3_col3, #T_4d8bf_row3_col4, #T_4d8bf_row3_col5, #T_4d8bf_row3_col6, #T_4d8bf_row3_col7, #T_4d8bf_row3_col8, #T_4d8bf_row4_col0, #T_4d8bf_row4_col1, #T_4d8bf_row4_col2, #T_4d8bf_row4_col3, #T_4d8bf_row4_col4, #T_4d8bf_row4_col5, #T_4d8bf_row4_col6, #T_4d8bf_row4_col7, #T_4d8bf_row4_col8, #T_4d8bf_row5_col0, #T_4d8bf_row5_col1, #T_4d8bf_row5_col2, #T_4d8bf_row5_col3, #T_4d8bf_row5_col4, #T_4d8bf_row5_col5, #T_4d8bf_row5_col6, #T_4d8bf_row5_col7, #T_4d8bf_row5_col8, #T_4d8bf_row6_col0, #T_4d8bf_row6_col1, #T_4d8bf_row6_col2, #T_4d8bf_row6_col3, #T_4d8bf_row6_col4, #T_4d8bf_row6_col5, #T_4d8bf_row6_col6, #T_4d8bf_row6_col7, #T_4d8bf_row6_col8 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_4d8bf\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_4d8bf_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_4d8bf_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th id=\"T_4d8bf_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_4d8bf_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", + " <th id=\"T_4d8bf_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", + " <th id=\"T_4d8bf_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", + " <th id=\"T_4d8bf_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", + " <th id=\"T_4d8bf_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", + " <th id=\"T_4d8bf_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row0_col0\" class=\"data row0 col0\" 
>validmind.model_validation.RegressionResidualsPlot</td>\n", + " <td id=\"T_4d8bf_row0_col1\" class=\"data row0 col1\" >Regression Residuals Plot</td>\n", + " <td id=\"T_4d8bf_row0_col2\" class=\"data row0 col2\" >Evaluates regression model performance using residual distribution and actual vs. predicted plots....</td>\n", + " <td id=\"T_4d8bf_row0_col3\" class=\"data row0 col3\" >True</td>\n", + " <td id=\"T_4d8bf_row0_col4\" class=\"data row0 col4\" >False</td>\n", + " <td id=\"T_4d8bf_row0_col5\" class=\"data row0 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_4d8bf_row0_col6\" class=\"data row0 col6\" >{'bin_size': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_4d8bf_row0_col7\" class=\"data row0 col7\" >['model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row0_col8\" class=\"data row0 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_4d8bf_row1_col1\" class=\"data row1 col1\" >Confusion Matrix</td>\n", + " <td id=\"T_4d8bf_row1_col2\" class=\"data row1 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", + " <td id=\"T_4d8bf_row1_col3\" class=\"data row1 col3\" >True</td>\n", + " <td id=\"T_4d8bf_row1_col4\" class=\"data row1 col4\" >False</td>\n", + " <td id=\"T_4d8bf_row1_col5\" class=\"data row1 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_4d8bf_row1_col6\" class=\"data row1 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_4d8bf_row1_col7\" class=\"data row1 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row1_col8\" class=\"data row1 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row2_col0\" class=\"data row2 
col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_4d8bf_row2_col1\" class=\"data row2 col1\" >Precision Recall Curve</td>\n", + " <td id=\"T_4d8bf_row2_col2\" class=\"data row2 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", + " <td id=\"T_4d8bf_row2_col3\" class=\"data row2 col3\" >True</td>\n", + " <td id=\"T_4d8bf_row2_col4\" class=\"data row2 col4\" >False</td>\n", + " <td id=\"T_4d8bf_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_4d8bf_row2_col6\" class=\"data row2 col6\" >{}</td>\n", + " <td id=\"T_4d8bf_row2_col7\" class=\"data row2 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row2_col8\" class=\"data row2 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", + " <td id=\"T_4d8bf_row3_col1\" class=\"data row3 col1\" >ROC Curve</td>\n", + " <td id=\"T_4d8bf_row3_col2\" class=\"data row3 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", + " <td id=\"T_4d8bf_row3_col3\" class=\"data row3 col3\" >True</td>\n", + " <td id=\"T_4d8bf_row3_col4\" class=\"data row3 col4\" >False</td>\n", + " <td id=\"T_4d8bf_row3_col5\" class=\"data row3 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_4d8bf_row3_col6\" class=\"data row3 col6\" >{}</td>\n", + " <td id=\"T_4d8bf_row3_col7\" class=\"data row3 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row3_col8\" class=\"data row3 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row4_col0\" class=\"data row4 col0\" 
>validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_4d8bf_row4_col1\" class=\"data row4 col1\" >Training Test Degradation</td>\n", + " <td id=\"T_4d8bf_row4_col2\" class=\"data row4 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", + " <td id=\"T_4d8bf_row4_col3\" class=\"data row4 col3\" >False</td>\n", + " <td id=\"T_4d8bf_row4_col4\" class=\"data row4 col4\" >True</td>\n", + " <td id=\"T_4d8bf_row4_col5\" class=\"data row4 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_4d8bf_row4_col6\" class=\"data row4 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_4d8bf_row4_col7\" class=\"data row4 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row4_col8\" class=\"data row4 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row5_col0\" class=\"data row5 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", + " <td id=\"T_4d8bf_row5_col1\" class=\"data row5 col1\" >Calibration Curve Drift</td>\n", + " <td id=\"T_4d8bf_row5_col2\" class=\"data row5 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", + " <td id=\"T_4d8bf_row5_col3\" class=\"data row5 col3\" >True</td>\n", + " <td id=\"T_4d8bf_row5_col4\" class=\"data row5 col4\" >True</td>\n", + " <td id=\"T_4d8bf_row5_col5\" class=\"data row5 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_4d8bf_row5_col6\" class=\"data row5 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_4d8bf_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row5_col8\" class=\"data row5 col8\" 
>['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row6_col0\" class=\"data row6 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", + " <td id=\"T_4d8bf_row6_col1\" class=\"data row6 col1\" >ROC Curve Drift</td>\n", + " <td id=\"T_4d8bf_row6_col2\" class=\"data row6 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", + " <td id=\"T_4d8bf_row6_col3\" class=\"data row6 col3\" >True</td>\n", + " <td id=\"T_4d8bf_row6_col4\" class=\"data row6 col4\" >False</td>\n", + " <td id=\"T_4d8bf_row6_col5\" class=\"data row6 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_4d8bf_row6_col6\" class=\"data row6 col6\" >{}</td>\n", + " <td id=\"T_4d8bf_row6_col7\" class=\"data row6 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row6_col8\" class=\"data row6 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x36a280f40>" + ] + } + } + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use `filter`, `task`, and `tags` together to create more specific queries.\n", + "\n", + "For example, apply all three to find tests compatible with `sklearn` models, designed for `classification` tasks:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "list_tests(filter=\"sklearn\",\n", + " tags=[\"model_performance\", \"visualization\"], task=\"classification\"\n", + ")" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_36394 th {\n", + " text-align: left;\n", + "}\n", + "#T_36394_row0_col0, #T_36394_row0_col1, #T_36394_row0_col2, #T_36394_row0_col3, #T_36394_row0_col4, #T_36394_row0_col5, #T_36394_row0_col6, #T_36394_row0_col7, #T_36394_row0_col8, #T_36394_row1_col0, 
#T_36394_row1_col1, #T_36394_row1_col2, #T_36394_row1_col3, #T_36394_row1_col4, #T_36394_row1_col5, #T_36394_row1_col6, #T_36394_row1_col7, #T_36394_row1_col8, #T_36394_row2_col0, #T_36394_row2_col1, #T_36394_row2_col2, #T_36394_row2_col3, #T_36394_row2_col4, #T_36394_row2_col5, #T_36394_row2_col6, #T_36394_row2_col7, #T_36394_row2_col8, #T_36394_row3_col0, #T_36394_row3_col1, #T_36394_row3_col2, #T_36394_row3_col3, #T_36394_row3_col4, #T_36394_row3_col5, #T_36394_row3_col6, #T_36394_row3_col7, #T_36394_row3_col8, #T_36394_row4_col0, #T_36394_row4_col1, #T_36394_row4_col2, #T_36394_row4_col3, #T_36394_row4_col4, #T_36394_row4_col5, #T_36394_row4_col6, #T_36394_row4_col7, #T_36394_row4_col8, #T_36394_row5_col0, #T_36394_row5_col1, #T_36394_row5_col2, #T_36394_row5_col3, #T_36394_row5_col4, #T_36394_row5_col5, #T_36394_row5_col6, #T_36394_row5_col7, #T_36394_row5_col8 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_36394\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_36394_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_36394_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th id=\"T_36394_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_36394_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", + " <th id=\"T_36394_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", + " <th id=\"T_36394_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", + " <th id=\"T_36394_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", + " <th id=\"T_36394_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", + " <th id=\"T_36394_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_36394_row0_col0\" class=\"data row0 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_36394_row0_col1\" 
class=\"data row0 col1\" >Confusion Matrix</td>\n", + " <td id=\"T_36394_row0_col2\" class=\"data row0 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", + " <td id=\"T_36394_row0_col3\" class=\"data row0 col3\" >True</td>\n", + " <td id=\"T_36394_row0_col4\" class=\"data row0 col4\" >False</td>\n", + " <td id=\"T_36394_row0_col5\" class=\"data row0 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_36394_row0_col6\" class=\"data row0 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_36394_row0_col7\" class=\"data row0 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row0_col8\" class=\"data row0 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_36394_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_36394_row1_col1\" class=\"data row1 col1\" >Precision Recall Curve</td>\n", + " <td id=\"T_36394_row1_col2\" class=\"data row1 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", + " <td id=\"T_36394_row1_col3\" class=\"data row1 col3\" >True</td>\n", + " <td id=\"T_36394_row1_col4\" class=\"data row1 col4\" >False</td>\n", + " <td id=\"T_36394_row1_col5\" class=\"data row1 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_36394_row1_col6\" class=\"data row1 col6\" >{}</td>\n", + " <td id=\"T_36394_row1_col7\" class=\"data row1 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row1_col8\" class=\"data row1 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_36394_row2_col0\" class=\"data row2 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", 
+ " <td id=\"T_36394_row2_col1\" class=\"data row2 col1\" >ROC Curve</td>\n", + " <td id=\"T_36394_row2_col2\" class=\"data row2 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", + " <td id=\"T_36394_row2_col3\" class=\"data row2 col3\" >True</td>\n", + " <td id=\"T_36394_row2_col4\" class=\"data row2 col4\" >False</td>\n", + " <td id=\"T_36394_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_36394_row2_col6\" class=\"data row2 col6\" >{}</td>\n", + " <td id=\"T_36394_row2_col7\" class=\"data row2 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row2_col8\" class=\"data row2 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_36394_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_36394_row3_col1\" class=\"data row3 col1\" >Training Test Degradation</td>\n", + " <td id=\"T_36394_row3_col2\" class=\"data row3 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", + " <td id=\"T_36394_row3_col3\" class=\"data row3 col3\" >False</td>\n", + " <td id=\"T_36394_row3_col4\" class=\"data row3 col4\" >True</td>\n", + " <td id=\"T_36394_row3_col5\" class=\"data row3 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_36394_row3_col6\" class=\"data row3 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_36394_row3_col7\" class=\"data row3 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row3_col8\" class=\"data row3 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_36394_row4_col0\" class=\"data row4 
col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", + " <td id=\"T_36394_row4_col1\" class=\"data row4 col1\" >Calibration Curve Drift</td>\n", + " <td id=\"T_36394_row4_col2\" class=\"data row4 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", + " <td id=\"T_36394_row4_col3\" class=\"data row4 col3\" >True</td>\n", + " <td id=\"T_36394_row4_col4\" class=\"data row4 col4\" >True</td>\n", + " <td id=\"T_36394_row4_col5\" class=\"data row4 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_36394_row4_col6\" class=\"data row4 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_36394_row4_col7\" class=\"data row4 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row4_col8\" class=\"data row4 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_36394_row5_col0\" class=\"data row5 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", + " <td id=\"T_36394_row5_col1\" class=\"data row5 col1\" >ROC Curve Drift</td>\n", + " <td id=\"T_36394_row5_col2\" class=\"data row5 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", + " <td id=\"T_36394_row5_col3\" class=\"data row5 col3\" >True</td>\n", + " <td id=\"T_36394_row5_col4\" class=\"data row5 col4\" >False</td>\n", + " <td id=\"T_36394_row5_col5\" class=\"data row5 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_36394_row5_col6\" class=\"data row5 col6\" >{}</td>\n", + " <td id=\"T_36394_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row5_col8\" class=\"data row5 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 
0x380009c40>" + ] + } + } + ] + }, { - "data": { - "text/plain": [ - "['validmind.data_validation.DatasetDescription',\n", - " 'validmind.data_validation.DatasetSplit',\n", - " 'validmind.data_validation.nlp.CommonWords',\n", - " 'validmind.data_validation.nlp.Hashtags',\n", - " 'validmind.data_validation.nlp.LanguageDetection',\n", - " 'validmind.data_validation.nlp.Mentions',\n", - " 'validmind.data_validation.nlp.Punctuations',\n", - " 'validmind.data_validation.nlp.StopWords',\n", - " 'validmind.data_validation.nlp.TextDescription',\n", - " 'validmind.model_validation.BertScore',\n", - " 'validmind.model_validation.BleuScore',\n", - " 'validmind.model_validation.ContextualRecall',\n", - " 'validmind.model_validation.MeteorScore',\n", - " 'validmind.model_validation.RegardScore',\n", - " 'validmind.model_validation.RougeScore',\n", - " 'validmind.model_validation.TokenDisparity',\n", - " 'validmind.model_validation.ToxicityScore',\n", - " 'validmind.model_validation.embeddings.CosineSimilarityComparison',\n", - " 'validmind.model_validation.embeddings.CosineSimilarityHeatmap',\n", - " 'validmind.model_validation.embeddings.EuclideanDistanceComparison',\n", - " 'validmind.model_validation.embeddings.EuclideanDistanceHeatmap',\n", - " 'validmind.model_validation.embeddings.PCAComponentsPairwisePlots',\n", - " 'validmind.model_validation.embeddings.TSNEComponentsPairwisePlots',\n", - " 'validmind.model_validation.ragas.AnswerCorrectness',\n", - " 'validmind.model_validation.ragas.AspectCritic',\n", - " 'validmind.model_validation.ragas.ContextEntityRecall',\n", - " 'validmind.model_validation.ragas.ContextPrecision',\n", - " 'validmind.model_validation.ragas.ContextPrecisionWithoutReference',\n", - " 'validmind.model_validation.ragas.ContextRecall',\n", - " 'validmind.model_validation.ragas.Faithfulness',\n", - " 'validmind.model_validation.ragas.NoiseSensitivity',\n", - " 'validmind.model_validation.ragas.ResponseRelevancy',\n", - " 
'validmind.model_validation.ragas.SemanticSimilarity',\n", - " 'validmind.prompt_validation.Bias',\n", - " 'validmind.prompt_validation.Clarity',\n", - " 'validmind.prompt_validation.Conciseness',\n", - " 'validmind.prompt_validation.Delimitation',\n", - " 'validmind.prompt_validation.NegativeInstruction',\n", - " 'validmind.prompt_validation.Robustness',\n", - " 'validmind.prompt_validation.Specificity']" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Store test sets for use\n", + "\n", + "Once you've identified specific sets of tests you'd like to run, you can store the tests in variables, enabling you to easily reuse those tests in later steps.\n", + "\n", + "For example, if you're validating a summarization model, use [`list_tests()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to retrieve all tests tagged for text summarization and save them to `text_summarization_tests` for later use:" ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "text_summarization_tests = list_tests(task=\"text_summarization\", pretty=False)\n", + "text_summarization_tests" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "['validmind.data_validation.DatasetDescription',\n", + " 'validmind.data_validation.DatasetSplit',\n", + " 'validmind.data_validation.nlp.CommonWords',\n", + " 'validmind.data_validation.nlp.Hashtags',\n", + " 'validmind.data_validation.nlp.LanguageDetection',\n", + " 'validmind.data_validation.nlp.Mentions',\n", + " 'validmind.data_validation.nlp.Punctuations',\n", + " 'validmind.data_validation.nlp.StopWords',\n", + " 'validmind.data_validation.nlp.TextDescription',\n", + " 'validmind.model_validation.BertScore',\n", + " 'validmind.model_validation.BleuScore',\n", + " 'validmind.model_validation.ContextualRecall',\n", + " 
'validmind.model_validation.MeteorScore',\n", + " 'validmind.model_validation.RegardScore',\n", + " 'validmind.model_validation.RougeScore',\n", + " 'validmind.model_validation.TokenDisparity',\n", + " 'validmind.model_validation.ToxicityScore',\n", + " 'validmind.model_validation.embeddings.CosineSimilarityComparison',\n", + " 'validmind.model_validation.embeddings.CosineSimilarityHeatmap',\n", + " 'validmind.model_validation.embeddings.EuclideanDistanceComparison',\n", + " 'validmind.model_validation.embeddings.EuclideanDistanceHeatmap',\n", + " 'validmind.model_validation.embeddings.PCAComponentsPairwisePlots',\n", + " 'validmind.model_validation.embeddings.TSNEComponentsPairwisePlots',\n", + " 'validmind.model_validation.ragas.AnswerCorrectness',\n", + " 'validmind.model_validation.ragas.AspectCritic',\n", + " 'validmind.model_validation.ragas.ContextEntityRecall',\n", + " 'validmind.model_validation.ragas.ContextPrecision',\n", + " 'validmind.model_validation.ragas.ContextPrecisionWithoutReference',\n", + " 'validmind.model_validation.ragas.ContextRecall',\n", + " 'validmind.model_validation.ragas.Faithfulness',\n", + " 'validmind.model_validation.ragas.NoiseSensitivity',\n", + " 'validmind.model_validation.ragas.ResponseRelevancy',\n", + " 'validmind.model_validation.ragas.SemanticSimilarity',\n", + " 'validmind.prompt_validation.Bias',\n", + " 'validmind.prompt_validation.Clarity',\n", + " 'validmind.prompt_validation.Conciseness',\n", + " 'validmind.prompt_validation.Delimitation',\n", + " 'validmind.prompt_validation.NegativeInstruction',\n", + " 'validmind.prompt_validation.Robustness',\n", + " 'validmind.prompt_validation.Specificity']" + ] + } + } + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "Now that you know how to browse and filter tests in the ValidMind Library, you’re ready to take the next step. 
Use the test IDs you’ve selected to either run individual tests or batch run them with custom test suites.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn about the test suites available in the ValidMind Library.</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_test_suites.html\" style=\"color: #DE257E;\"><b>Explore test suites</b></a> notebook for more code examples and usage of key functions.</div>\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-fb6994d364c54669b356f7a2278d6480" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" } - ], - "source": [ - "text_summarization_tests = list_tests(task=\"text_summarization\", pretty=False)\n", - "text_summarization_tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "Now that you know how to browse and filter tests in the ValidMind Library, you’re ready to take the next step. 
Use the test IDs you’ve selected to either run individual tests or batch run them with custom test suites.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn about the tests suites available in the ValidMind Library.</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_test_suites.html\" style=\"color: #DE257E;\"><b>Explore test suites</b></a> notebook for more code examples and usage of key functions.</div>\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-fb6994d364c54669b356f7a2278d6480", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} + "nbformat": 4, + "nbformat_minor": 4 +} \ No newline at end of file diff --git a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb index d4f7812c0..2baaa881d 100644 --- a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb @@ -1,781 +1,785 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "976bb3d9", - "metadata": {}, - "source": [ - "# Run dataset-based tests\n", - "\n", - "Learn how to use the ValidMind Library to run tests that take any dataset or record (model) as input. Identify specific tests to run, initialize ValidMind dataset objects in preparation for passing them to your tests, and then run the chosen tests — generating outputs that can be automatically logged to your documentation in the ValidMind Platform." 
- ] - }, - { - "cell_type": "markdown", - "id": "8c4d9b9c", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Explore a ValidMind test](#toc3__) \n", - "- [Working with ValidMind datasets](#toc4__) \n", - " - [Create a sample dataset](#toc4_1__) \n", - " - [Initialize the ValidMind dataset](#toc4_2__) \n", - "- [Running ValidMind tests](#toc5__) \n", - " - [Run test using ValidMind dataset](#toc5_1__) \n", - " - [Run and log test requiring parameters](#toc5_2__) \n", - " - [Log ClassImbalance test with default parameters](#toc5_2_1__) \n", - " - [Log ClassImbalance test with custom paramaters](#toc5_2_2__) \n", - "- [Work with test results](#toc6__) \n", - "- [Next steps](#toc7__) \n", - " - [Discover more learning resources](#toc7_1__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "f49237b3", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. 
\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "907737bd", - "metadata": {}, - "source": [ - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "115cdfa7", - "metadata": {}, - "source": [ - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "c3051ca8", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "656db165", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "30fa24d7", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "524602cc", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "b38fc5f6", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "451c5a1b", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook.\n", - "\n", - "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "0e55ac40", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "3545620d", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0ed9e84d", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "8fea9380", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e44a2345", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "43ee2f43", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Explore a ValidMind test\n", - "\n", - "Before we run a test, use [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to return information on out-of-the-box tests available in the ValidMind Library.\n", - "\n", - "Let's assume you want to generate the *pearson correlation matrix* for a dataset. A Pearson correlation matrix is a table that shows the [Pearson correlation coefficients](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient) between several variables.\n", - "\n", - "We'll pass in a `filter` to the `list_tests` function to find the test ID for the pearson correlation matrix:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a63e7a43", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(filter=\"PearsonCorrelationMatrix\")" - ] - }, - { - "cell_type": "markdown", - "id": "011de751", - "metadata": {}, - "source": [ - "We've identified from the output that the test ID for the pearson correlation matrix test is `validmind.data_validation.PearsonCorrelationMatrix`.\n", - "\n", - "Use this ID combined with [the `describe_test()` function](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to retrieve more information about the test, including its **Required Inputs**:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9886cd27", - "metadata": {}, - "outputs": [], - "source": [ - 
"test_id = \"validmind.data_validation.PearsonCorrelationMatrix\"\n", - "vm.tests.describe_test(test_id)" - ] - }, - { - "cell_type": "markdown", - "id": "f1f7a84a", - "metadata": {}, - "source": [ - "Since this test requires a dataset, you can expect it to throw an error when we run it without passing in a `dataset` as input:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ee38704a", - "metadata": {}, - "outputs": [], - "source": [ - "try:\n", - " vm.tests.run_test(test_id)\n", - "except Exception as e:\n", - " print(e)" - ] - }, - { - "cell_type": "markdown", - "id": "60ede8e0", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "6bcd01d2", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Working with ValidMind datasets" - ] - }, - { - "cell_type": "markdown", - "id": "35331764", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Create a sample dataset\n", - "\n", - "Since we need a dataset to run tests, let's use the [sklearn `make_classification` function](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_classification.html) to generate a random sample dataset for testing.\n", - "\n", - "In the code example below, note that:\n", - "\n", - "- The `make_classification` function generates a synthetic binary classification dataset with `10,000` samples and `10` 
features, where the `weights=[0.1]` parameter creates a class imbalance (roughly 10% positive class).\n", - "- The `random_state=42` parameter ensures reproducibility so you get the same dataset each time you run the code.\n", - "- The generated feature matrix `X` and target array `y` are combined into a single [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) with columns named `feature_0` through `feature_9`, plus a `target` column that has a value of `1` for the positive class and `0` otherwise." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "25774f44", - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "from sklearn.datasets import make_classification\n", - "\n", - "X, y = make_classification(\n", - " n_samples=10000,\n", - " n_features=10,\n", - " weights=[0.1],\n", - " random_state=42,\n", - ")\n", - "X.shape\n", - "y.shape\n", - "\n", - "df = pd.DataFrame(X, columns=[f\"feature_{i}\" for i in range(X.shape[1])])\n", - "df[\"target\"] = y\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "3b3032fc", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Initialize the ValidMind dataset\n", - "\n", - "The next step is to connect your data with a ValidMind `Dataset` object. **This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "ValidMind dataset objects provide a wrapper to any type of dataset (NumPy, Pandas, Polars, etc.) so that tests can run transparently regardless of the underlying library.\n", - "\n", - "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. 
For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "70c52c03", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the ValidMind dataset for the previously created sample `df`\n", - "vm_dataset = vm.init_dataset(\n", - " df,\n", - " input_id=\"my_demo_dataset\",\n", - " target_column=\"target\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "ec65df1b", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Running ValidMind tests\n", - "\n", - "Now that we know how to initialize a ValidMind `dataset` object, we're ready to run some tests!\n", - "\n", - "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. For the examples below, we'll pass in the following arguments:\n", - "\n", - "- **`test_id`** — The ID of the test to run, as seen in the `ID` column when you run `list_tests`.\n", - "- **`inputs`** — A dictionary of test inputs, such as `dataset`, `model`, `datasets`, or `models`. These are ValidMind objects initialized with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) or [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model)." 
- ] - }, - { - "cell_type": "markdown", - "id": "c46789a4", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Run test using ValidMind dataset\n", - "\n", - "Given that our `test_id` is currently set to `validmind.data_validation.PearsonCorrelationMatrix`, we'll get the results of the Pearson Correlation Matrix test as output when we call `run_test()`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0c636915", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " test_id,\n", - " inputs={\"dataset\": vm_dataset},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "12694f87", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Run and log test requiring parameters\n", - "\n", - "Our `vm_dataset` can also be used for any other test that requires a dataset input, including tests that take additional parameters.\n", - "\n", - "Let's find a *class imbalance* test to understand the distribution of the target column in the dataset to demonstrate. Class imbalance is a common problem in machine learning, particularly in classification tasks, where the number of instances (or data points) in each class isn't evenly distributed across the available categories.\n", - "\n", - "`Tags` describe what a test applies to and help you filter tests for your use case. 
Use [list_tags()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tags) to view all unique tags used to describe tests in the ValidMind Library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "99eaf2da", - "metadata": {}, - "outputs": [], - "source": [ - "# Sort the tags in ABC order\n", - "sorted(vm.tests.list_tags())" - ] - }, - { - "cell_type": "markdown", - "id": "561b225a", - "metadata": {}, - "source": [ - "Use `list_tests()`, this time filtering tests by tags for `binary_classification` relating to `tabular_data`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "97a45b6b", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(tags=[\"binary_classification\", \"tabular_data\"])" - ] - }, - { - "cell_type": "markdown", - "id": "4ba2ec07", - "metadata": {}, - "source": [ - "Let's use `describe_test()` again to retrieve more information about the test, including confirmation that it accepts some additional parameters, such as `min_percent_threshold` which allows you configure the threshold for an acceptable class imbalance:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ec456cd2", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.describe_test(\"validmind.data_validation.ClassImbalance\")" - ] - }, - { - "cell_type": "markdown", - "id": "e419dd51", - "metadata": {}, - "source": [ - "<a id='toc5_2_1__'></a>\n", - "\n", - "#### Log ClassImbalance test with default parameters\n", - "\n", - "Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", - "\n", - "Let's first run the class imbalance test without any parameters to see its output using a default value for the threshold and log the results to the ValidMind Platform for later comparison:" - ] - }, - { - "cell_type": "code", 
- "execution_count": null, - "id": "1c137483", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"validmind.data_validation.ClassImbalance\",\n", - " inputs={\"dataset\": vm_dataset},\n", - ")\n", - "\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "6cc499de", - "metadata": {}, - "source": [ - "<a id='toc5_2_2__'></a>\n", - "\n", - "#### Log ClassImbalance test with custom paramaters\n", - "\n", - "From the output, we've confirmed that the class imbalance test passes the pass-fail criteria with the default threshold of 10%. Let's try to run the test with a threshold of 20% to see if it fails.\n", - "\n", - "When running individual tests, **you can use a custom `result_id` to tag the individual result with a unique identifier**, allowing you to submit individual results for the same test to the ValidMind Platform:\n", - "\n", - "- This `result_id` can be appended to `test_id` with a `:` separator.\n", - "- The `custom_threshold` identifier will correspond with the results of our adjusted `min_percent_threshold` parameter." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2c6f19ad", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"validmind.data_validation.ClassImbalance:custom_threshold\",\n", - " inputs={\"dataset\": vm_dataset},\n", - " params={\"min_percent_threshold\": 20},\n", - ")\n", - "\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "30e82fc3", - "metadata": {}, - "source": [ - "When the threshold is set to 20%, the results show that the class imbalance test fails." - ] - }, - { - "cell_type": "markdown", - "id": "faa09935", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Work with test results\n", - "\n", - "You can look at the output of tests produced by the ValidMind Library right in this notebook where you ran the tests, as you would expect. 
But there is a better way — use the ValidMind Platform to attach the logged test results your documentation (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)):\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Locate the Data Preparation section and click on **2.1. Data Description** to expand that section.\n", - "\n", - "4. Hover under the logged test block for the default Class Imbalance test until a horizontal dashed line with a **+** button appears, indicating that you can insert a new block.\n", - "\n", - "5. Click **+** and then select **Test-Driven Block** under FROM LIBRARY:\n", - "\n", - " - Click on **VM Library** under TEST-DRIVEN in the left sidebar.\n", - " - Select `ClassImbalance:custom_threshold` as the test.\n", - "\n", - "6. Finally, click **Insert 1 Test Result to Document** to add the test result to the documentation.\n", - "\n", - " Confirm that the individual results for the adjusted threshold class imbalance test has been correctly inserted into section **2.1. Data Description** of the documentation.\n", - "\n", - "You just worked with a draft of your model's documentation, in an easily consumable format matching the structure of the template you previewed in the beginning of this notebook. When you connect to a model with the ValidMind Library, logged test results automatically populate for easy insertion into your documentation.\n", - "\n", - "In the ValidMind Platform, you can make qualitative edits to model documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" - ] - }, - { - "cell_type": "markdown", - "id": "cbe20d76", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "Now that you know the basics of how to run out-of-the-box tests in the ValidMind Library, you’re ready to take the next step. Use `run_test()` with any combination of datasets or records (models) as inputs to run comparison tests, and log your consolidated test results to the ValidMind Platform.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn how to run comparison tests with the ValidMind Library.</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/2-run_comparison_tests.html\" style=\"color: #DE257E;\"><b>Run comparison tests</b></a> notebook for code examples and usage of key functions.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "ec08c9bc", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "id": "bff625a1", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b5f64e27", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "da29fb9d", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "82837a85", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-38501808b29c456ab97562eebdd497d4", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.12" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Run dataset-based tests\n", + "\n", + "Learn how to use the ValidMind Library to run tests that take any dataset or record (model) as input. Identify specific tests to run, initialize ValidMind dataset objects in preparation for passing them to your tests, and then run the chosen tests — generating outputs that can be automatically logged to your documentation in the ValidMind Platform." 
+ ], + "id": "976bb3d9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Explore a ValidMind test](#toc3__) \n", + "- [Working with ValidMind datasets](#toc4__) \n", + " - [Create a sample dataset](#toc4_1__) \n", + " - [Initialize the ValidMind dataset](#toc4_2__) \n", + "- [Running ValidMind tests](#toc5__) \n", + " - [Run test using ValidMind dataset](#toc5_1__) \n", + " - [Run and log test requiring parameters](#toc5_2__) \n", + " - [Log ClassImbalance test with default parameters](#toc5_2_1__) \n", + " - [Log ClassImbalance test with custom paramaters](#toc5_2_2__) \n", + "- [Work with test results](#toc6__) \n", + "- [Next steps](#toc7__) \n", + " - [Discover more learning resources](#toc7_1__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "8c4d9b9c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. 
\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "f49237b3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "907737bd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "115cdfa7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "c3051ca8" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ], + "id": "656db165" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ], + "id": "30fa24d7" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "524602cc" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "b38fc5f6" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook.\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." 
+ ], + "id": "451c5a1b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "0e55ac40" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "3545620d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "0ed9e84d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ], + "id": "8fea9380" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "e44a2345" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Explore a ValidMind test\n", + "\n", + "Before we run a test, use [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to return information on out-of-the-box tests available in the ValidMind Library.\n", + "\n", + "Let's assume you want to generate the *Pearson correlation matrix* for a dataset. A Pearson correlation matrix is a table that shows the [Pearson correlation coefficients](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient) between several variables.\n", + "\n", + "We'll pass in a `filter` to the `list_tests` function to find the test ID for the Pearson correlation matrix:" + ], + "id": "43ee2f43" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(filter=\"PearsonCorrelationMatrix\")" + ], + "execution_count": null, + "outputs": [], + "id": "a63e7a43" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We've identified from the output that the test ID for the Pearson correlation matrix test is `validmind.data_validation.PearsonCorrelationMatrix`.\n", + "\n", + "Use this ID combined with [the `describe_test()` function](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to retrieve more information about the test, including its **Required Inputs**:" + ], + "id": "011de751" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_id = 
\"validmind.data_validation.PearsonCorrelationMatrix\"\n", + "vm.tests.describe_test(test_id)" + ], + "execution_count": null, + "outputs": [], + "id": "9886cd27" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since this test requires a dataset, you can expect it to throw an error when we run it without passing in a `dataset` as input:" + ], + "id": "f1f7a84a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "try:\n", + " vm.tests.run_test(test_id)\n", + "except Exception as e:\n", + " print(e)" + ], + "execution_count": null, + "outputs": [], + "id": "ee38704a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>" + ], + "id": "60ede8e0" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Working with ValidMind datasets" + ], + "id": "6bcd01d2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Create a sample dataset\n", + "\n", + "Since we need a dataset to run tests, let's use the [sklearn `make_classification` function](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_classification.html) to generate a random sample dataset for testing.\n", + "\n", + "In the code example below, note that:\n", + "\n", + "- The `make_classification` function generates a synthetic binary classification dataset with 
`10,000` samples and `10` features, where the `weights=[0.1]` parameter creates a class imbalance (roughly 10% of samples in class `0` and the remaining 90% in class `1`).\n", + "- The `random_state=42` parameter ensures reproducibility so you get the same dataset each time you run the code.\n", + "- The generated feature matrix `X` and target array `y` are combined into a single [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) with columns named `feature_0` through `feature_9`, plus a `target` column that has a value of `1` for the positive class and `0` otherwise." + ], + "id": "35331764" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "from sklearn.datasets import make_classification\n", + "\n", + "X, y = make_classification(\n", + " n_samples=10000,\n", + " n_features=10,\n", + " weights=[0.1],\n", + " random_state=42,\n", + ")\n", + "X.shape\n", + "y.shape\n", + "\n", + "df = pd.DataFrame(X, columns=[f\"feature_{i}\" for i in range(X.shape[1])])\n", + "df[\"target\"] = y\n", + "df.head()" + ], + "execution_count": null, + "outputs": [], + "id": "25774f44" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Initialize the ValidMind dataset\n", + "\n", + "The next step is to connect your data with a ValidMind `Dataset` object. **This step is necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "ValidMind dataset objects provide a wrapper to any type of dataset (NumPy, Pandas, Polars, etc.) so that tests can run transparently regardless of the underlying library.\n", + "\n", + "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. 
For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." + ], + "id": "3b3032fc" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the ValidMind dataset for the previously created sample `df`\n", + "vm_dataset = vm.init_dataset(\n", + " df,\n", + " input_id=\"my_demo_dataset\",\n", + " target_column=\"target\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "70c52c03" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Running ValidMind tests\n", + "\n", + "Now that we know how to initialize a ValidMind `dataset` object, we're ready to run some tests!\n", + "\n", + "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. For the examples below, we'll pass in the following arguments:\n", + "\n", + "- **`test_id`** — The ID of the test to run, as seen in the `ID` column when you run `list_tests`.\n", + "- **`inputs`** — A dictionary of test inputs, such as `dataset`, `model`, `datasets`, or `models`. These are ValidMind objects initialized with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) or [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model)." 
+ ], + "id": "ec65df1b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Run test using ValidMind dataset\n", + "\n", + "Given that our `test_id` is currently set to `validmind.data_validation.PearsonCorrelationMatrix`, we'll get the results of the Pearson Correlation Matrix test as output when we call `run_test()`:" + ], + "id": "c46789a4" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " test_id,\n", + " inputs={\"dataset\": vm_dataset},\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "0c636915" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Run and log test requiring parameters\n", + "\n", + "Our `vm_dataset` can also be used for any other test that requires a dataset input, including tests that take additional parameters.\n", + "\n", + "To demonstrate, let's find a *class imbalance* test to understand the distribution of the target column in the dataset. Class imbalance is a common problem in machine learning, particularly in classification tasks, where the number of instances (or data points) in each class isn't evenly distributed across the available categories.\n", + "\n", + "`Tags` describe what a test applies to and help you filter tests for your use case. 
Use [`list_tags()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tags) to view all unique tags used to describe tests in the ValidMind Library:" + ], + "id": "12694f87" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Sort the tags alphabetically\n", + "sorted(vm.tests.list_tags())" + ], + "execution_count": null, + "outputs": [], + "id": "99eaf2da" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use `list_tests()`, this time filtering tests by tags for `binary_classification` relating to `tabular_data`:" + ], + "id": "561b225a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(tags=[\"binary_classification\", \"tabular_data\"])" + ], + "execution_count": null, + "outputs": [], + "id": "97a45b6b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's use `describe_test()` again to retrieve more information about the `ClassImbalance` test, including confirmation that it accepts some additional parameters, such as `min_percent_threshold`, which allows you to configure the threshold for an acceptable class imbalance:" + ], + "id": "4ba2ec07" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.describe_test(\"validmind.data_validation.ClassImbalance\")" + ], + "execution_count": null, + "outputs": [], + "id": "ec456cd2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2_1__'></a>\n", + "\n", + "#### Log ClassImbalance test with default parameters\n", + "\n", + "Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", + "\n", + "Let's first run the class imbalance test without any parameters to see its output using a default value for the threshold and log the results to the ValidMind Platform for later comparison:" + ], + "id": "e419dd51" + }, + { + 
"cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"validmind.data_validation.ClassImbalance\",\n", + " inputs={\"dataset\": vm_dataset},\n", + ")\n", + "\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "1c137483" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2_2__'></a>\n", + "\n", + "#### Log ClassImbalance test with custom parameters\n", + "\n", + "From the output, we've confirmed that the class imbalance test meets the pass-fail criteria with the default threshold of 10%. Let's try to run the test with a threshold of 20% to see if it fails.\n", + "\n", + "When running individual tests, **you can use a custom `result_id` to tag the individual result with a unique identifier**, allowing you to submit multiple distinct results for the same test to the ValidMind Platform:\n", + "\n", + "- This `result_id` can be appended to `test_id` with a `:` separator.\n", + "- The `custom_threshold` identifier will correspond with the results of our adjusted `min_percent_threshold` parameter." + ], + "id": "6cc499de" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"validmind.data_validation.ClassImbalance:custom_threshold\",\n", + " inputs={\"dataset\": vm_dataset},\n", + " params={\"min_percent_threshold\": 20},\n", + ")\n", + "\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "2c6f19ad" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When the threshold is set to 20%, the results show that the class imbalance test fails." + ], + "id": "30e82fc3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Work with test results\n", + "\n", + "You can look at the output of tests produced by the ValidMind Library right in this notebook where you ran the tests, as you would expect. 
But there is a better way: use the ValidMind Platform to attach the logged test results to your documentation (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)):\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Locate the Data Preparation section and click on **2.1. Data Description** to expand that section.\n", + "\n", + "4. Hover under the logged test block for the default Class Imbalance test until a horizontal dashed line with a **+** button appears, indicating that you can insert a new block.\n", + "\n", + "5. Click **+** and then select **Test-Driven Block** under FROM LIBRARY:\n", + "\n", + " - Click on **VM Library** under TEST-DRIVEN in the left sidebar.\n", + " - Select `ClassImbalance:custom_threshold` as the test.\n", + "\n", + "6. Finally, click **Insert 1 Test Result to Document** to add the test result to the documentation.\n", + "\n", + " Confirm that the individual results for the adjusted-threshold class imbalance test have been correctly inserted into section **2.1. Data Description** of the documentation.\n", + "\n", + "You just worked with a draft of your model's documentation, in an easily consumable format matching the structure of the template you previewed at the beginning of this notebook. When you connect to a model with the ValidMind Library, logged test results automatically populate for easy insertion into your documentation.\n", + "\n", + "In the ValidMind Platform, you can make qualitative edits to model documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" + ], + "id": "faa09935" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "Now that you know the basics of how to run out-of-the-box tests in the ValidMind Library, you’re ready to take the next step. Use `run_test()` with any combination of datasets or records (models) as inputs to run comparison tests, and log your consolidated test results to the ValidMind Platform.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn how to run comparison tests with the ValidMind Library.</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/2-run_comparison_tests.html\" style=\"color: #DE257E;\"><b>Run comparison tests</b></a> notebook for code examples and usage of key functions.</div>" + ], + "id": "cbe20d76" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ], + "id": "ec08c9bc" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "bff625a1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "b5f64e27" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "da29fb9d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ], + "id": "82837a85" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-38501808b29c456ab97562eebdd497d4" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} \ No newline at end of file diff --git a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb index b0f48c167..180d7141a 100644 --- a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb @@ -1,1115 +1,1119 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "ed8282aa", - "metadata": {}, - "source": [ - "# Run comparison tests\n", - "\n", - "Learn how to use the ValidMind Library to run comparison tests that take any datasets or records (models) as inputs. 
Identify comparison tests to run, initialize ValidMind dataset and model objects in preparation for passing them to tests, and then run tests — generating outputs automatically logged to your documentation in the ValidMind Platform.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>We recommend that you first complete our introductory notebook on running tests.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.html\" style=\"color: #DE257E;\"><b>Run dataset-based tests</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "90ab1b8a", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - " - [Initialize the Python environment](#toc2_4__) \n", - "- [Explore a ValidMind test](#toc3__) \n", - "- [Working with ValidMind datasets](#toc4__) \n", - " - [Prepare the sample dataset](#toc4_1__) \n", - " - [Import the sample dataset](#toc4_1_1__) \n", - " - [Split the dataset](#toc4_1_2__) \n", - " - [Initialize the ValidMind dataset](#toc4_2__) \n", - "- [Working with ValidMind models](#toc5__) \n", - " - [Train a sample model](#toc5_1__) \n", - " - [Initialize the ValidMind model](#toc5_2__) \n", - " - [Assign 
predictions](#toc5_3__) \n", - "- [Running ValidMind tests](#toc6__) \n", - " - [Run classifier performance test with one model](#toc6_1__) \n", - " - [Run comparison tests](#toc6_2__) \n", - " - [Run classifier performance test with multiple models](#toc6_2_1__) \n", - " - [Run classifier performance test with multiple parameter values](#toc6_2_2__) \n", - " - [Run comparison test with multiple datasets](#toc6_2_3__) \n", - "- [Work with test results](#toc7__) \n", - "- [Next steps](#toc8__) \n", - " - [Discover more learning resources](#toc8_1__) \n", - "- [Upgrade ValidMind](#toc9__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "60aa37b6", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "6dfa3d15", - "metadata": {}, - "source": [ - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. 
\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "8e87dd4d", - "metadata": {}, - "source": [ - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "64971d85", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "id": "69a40ac3", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "ec35c724", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fc97888f", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "b3c0c2f5", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "d3e3302f", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook.\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." 
- ] - }, - { - "cell_type": "markdown", - "id": "679d46b2", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "2b6e1fb1", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c51ae01c", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "52b68564", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fd332a9d", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "184b8c97", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8e2127cd", - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "id": "c3098355", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Explore a ValidMind test\n", - "\n", - "Before we run a test, use [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to return information on out-of-the-box tests available in the ValidMind Library.\n", - "\n", - "Let's assume you want to evaluate *classifier performance* for a model. 
Classifier performance measures how well a classification model correctly predicts outcomes, using metrics like [precision, recall, and F1 score](https://en.wikipedia.org/wiki/Precision_and_recall).\n", - "\n", - "We'll pass in a `filter` to the `list_tests` function to find the test ID for classifier performance:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a6a6f715", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(filter=\"ClassifierPerformance\")" - ] - }, - { - "cell_type": "markdown", - "id": "d1f08b64", - "metadata": {}, - "source": [ - "We've identified from the output that the test ID for the classifier performance test is `validmind.model_validation.ClassifierPerformance`.\n", - "\n", - "Use this ID combined with [the `describe_test()` function](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to retrieve more information about the test, including its **Required Inputs**:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f8a46c7d", - "metadata": {}, - "outputs": [], - "source": [ - "test_id = \"validmind.model_validation.sklearn.ClassifierPerformance\"\n", - "vm.tests.describe_test(test_id)" - ] - }, - { - "cell_type": "markdown", - "id": "10a49439", - "metadata": {}, - "source": [ - "Since this test requires both a dataset object and a model object, you can expect it to throw an error when we run it without passing in either as input:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f853c272", - "metadata": {}, - "outputs": [], - "source": [ - "try:\n", - " vm.tests.run_test(test_id)\n", - "except Exception as e:\n", - " print(e)" - ] - }, - { - "cell_type": "markdown", - "id": "da36ba6b", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 
5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "40324c13", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Working with ValidMind datasets" - ] - }, - { - "cell_type": "markdown", - "id": "3f28ffe2", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Prepare the sample dataset" - ] - }, - { - "cell_type": "markdown", - "id": "4c45a55c", - "metadata": {}, - "source": [ - "<a id='toc4_1_1__'></a>\n", - "\n", - "#### Import the sample dataset\n", - "\n", - "Since we need a dataset to run tests, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", - "\n", - "In our below example, note that:\n", - "\n", - "- The target column, `Exited` has a value of `1` when a customer has churned and `0` otherwise.\n", - "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas Dataframe is a two-dimensional tabular data structure that makes use of rows and columns." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3ef2dfbb", - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", - ")\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "2fc43d28", - "metadata": {}, - "source": [ - "<a id='toc4_1_2__'></a>\n", - "\n", - "#### Split the dataset\n", - "\n", - "Let's first split our dataset to help assess how well the model generalizes to unseen data.\n", - "\n", - "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", - "\n", - "1. **train_df** — Used to train the model.\n", - "2. **validation_df** — Used to evaluate the model's performance during training.\n", - "3. **test_df** — Used later on to asses the model's performance on new, unseen data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "88c87d4a", - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "id": "a5d77885", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Initialize the ValidMind dataset\n", - "\n", - "The next step is to connect your data with a ValidMind `Dataset` object. **This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "ValidMind dataset objects provide a wrapper to any type of dataset (NumPy, Pandas, Polars, etc.) 
so that tests can run transparently regardless of the underlying library.\n", - "\n", - "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bf0ec747", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "cbb1a68f", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Working with ValidMind models" - ] - }, - { - "cell_type": "markdown", - "id": "68089f0a", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Train a sample model\n", - "\n", - "To train the model, we need to provide it with:\n", - "\n", - "1. **Inputs** — Features such as customer age, usage, etc.\n", - "2. 
**Outputs (Expected answers/labels)** — in our case, we would like to know whether the customer churned or not.\n", - "\n", - "Here, we'll use `x_train` and `x_val` to hold the input data (features), and `y_train` and `y_val` to hold the answers (the target we want to predict):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "39e8c7ea", - "metadata": {}, - "outputs": [], - "source": [ - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]\n", - "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", - "y_val = validation_df[customer_churn.target_column]" - ] - }, - { - "cell_type": "markdown", - "id": "6d93642b", - "metadata": {}, - "source": [ - "Next, let's create an *XGBoost classifier model* that will automatically stop training if it doesn't improve after 10 tries. XGBoost is a gradient-boosted tree ensemble that builds trees sequentially, with each tree correcting the errors of the previous ones — typically known for strong predictive performance and built-in regularization to reduce overfitting.\n", - "\n", - "Setting an explicit threshold avoids wasting time and helps prevent further overfitting by stopping training when further improvement isn't happening. We'll also set three evaluation metrics to get a more complete picture of model performance:\n", - "\n", - "1. **error** — Measures how often the model makes incorrect predictions.\n", - "2. **logloss** — Indicates how confident the predictions are.\n", - "3. **auc** — Evaluates how well the model distinguishes between churn and not churn." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "255e3583", - "metadata": {}, - "outputs": [], - "source": [ - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "a021582a", - "metadata": {}, - "source": [ - "Finally, our actual training step — where the model learns patterns from the data, so it can make predictions later:\n", - "\n", - "- The model is trained on `x_train` and `y_train`, and evaluates its performance using `x_val` and `y_val` to check if it’s learning well.\n", - "- To turn off printed output while training, we'll set `verbose` to `False`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e3aa3657", - "metadata": {}, - "outputs": [], - "source": [ - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "ed11ea0b", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": 
null, - "id": "4b2be11f", - "metadata": {}, - "outputs": [], - "source": [ - "vm_model_xgb = vm.init_model(\n", - " model,\n", - " input_id=\"xgboost\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "53f12da6", - "metadata": {}, - "source": [ - "<a id='toc5_3__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", - "\n", - "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", - "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", - "\n", - "If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "229185fd", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", - "vm_test_ds.assign_predictions(model=vm_model_xgb)" - ] - }, - { - "cell_type": "markdown", - "id": "18c1cb2e", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Running ValidMind tests\n", - "\n", - "Now that we know how to initialize ValidMind `dataset` and `model` objects, we're ready to run some tests!\n", - "\n", - "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. For the examples below, we'll pass in the following arguments:\n", - "\n", - "- **`test_id`** — The ID of the test to run, as seen in the `ID` column when you run `list_tests`.\n", - "- **`inputs`** — A dictionary of test inputs, such as `dataset`, `model`, `datasets`, or `models`. 
These are ValidMind objects initialized with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) or [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model)." - ] - }, - { - "cell_type": "markdown", - "id": "6f7e7779", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Run classifier performance test with one model\n", - "\n", - "Run `validmind.data_validation.ClassifierPerformance` test with the testing dataset (`vm_test_ds`) and model (`vm_model_xgb`) as inputs:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "85189af9", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " \"model\": vm_model_xgb,\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "5e8be8d5", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Run comparison tests\n", - "\n", - "To evaluate which models might be a better fit for a use case based on their performance on selected criteria, we can run the same test with multiple models. 
We'll train three additional models and run the classifier performance test with for all four models using a single `run_test()` call.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>ValidMind helps streamline your documentation and testing.</b></span>\n", - "<br></br>\n", - "You could call <code>run_test()</code> multiple times passing in different inputs, but you can also pass an <code>input_grid</code> object — a dictionary of test input keys and values that allow you to run a single test for a combination of models and datasets.\n", - "<br></br>\n", - "With <code>input_grid</code>, run comparison tests for multiple datasets, or even multiple datasets and models simultaneously — <code>input_grid</code> can be used with <code>run_test()</code> for all possible combinations of inputs, generating a cohesive and comprehensive single output.\n", - "</div>" - ] - }, - { - "cell_type": "markdown", - "id": "e33c7a82", - "metadata": {}, - "source": [ - "*Random forest classifier* models use an ensemble method that builds multiple decision trees and averages their predictions. 
Random forest is robust to overfitting and handles non-linear relations well, but is typically less interpretable than simpler models:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1976b7e8", - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.ensemble import RandomForestClassifier\n", - "\n", - "# Train the random forest classifer model\n", - "model_rf = RandomForestClassifier()\n", - "model_rf.fit(x_train, y_train)\n", - "\n", - "# Initialize the ValidMind model object for the random forest classifer model\n", - "vm_model_rf = vm.init_model(\n", - " model_rf,\n", - " input_id=\"random_forest\",\n", - ")\n", - "\n", - "# Assign predictions to the test dataset for the random forest classifer model\n", - "vm_test_ds.assign_predictions(model=vm_model_rf)" - ] - }, - { - "cell_type": "markdown", - "id": "f8e167cf", - "metadata": {}, - "source": [ - "*Logistic regression* models are linear models that estimate class probabilities via a logistic (sigmoid) function. 
Logistic regression is highly interpretable with fast training, establishing a strong baseline — however, they struggle when relationships are non-linear as real-world relationships often are:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "90bbf148", - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.linear_model import LogisticRegression\n", - "from sklearn.preprocessing import StandardScaler\n", - "from sklearn.pipeline import Pipeline\n", - "\n", - "# Scaling features ensures the lbfgs solver converges reliably\n", - "model_lr = Pipeline([\n", - " (\"scaler\", StandardScaler()),\n", - " (\"lr\", LogisticRegression()),\n", - "])\n", - "model_lr.fit(x_train, y_train)\n", - "\n", - "# Initialize the ValidMind model object for the logistic regression model\n", - "vm_model_lr = vm.init_model(\n", - " model_lr,\n", - " input_id=\"logistic_regression\",\n", - ")\n", - "\n", - "# Assign predictions to the test dataset for the logistic regression model\n", - "vm_test_ds.assign_predictions(model=vm_model_lr)" - ] - }, - { - "cell_type": "markdown", - "id": "d3478f86", - "metadata": {}, - "source": [ - "*Decision tree classifier* models are a single tree with data split on feature thresholds. 
Useful as an explanability benchmark, decision trees are easy to visualize and interpret — but are prone to overfitting without pruning or ensemble techniques:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bfa1e17d", - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.tree import DecisionTreeClassifier\n", - "\n", - "# Train the decision tree classifer model\n", - "model_dt = DecisionTreeClassifier()\n", - "model_dt.fit(x_train, y_train)\n", - "\n", - "# Initialize the ValidMind model object for the decision tree classifier model\n", - "vm_model_dt = vm.init_model(\n", - " model_dt,\n", - " input_id=\"decision_tree\",\n", - ")\n", - "\n", - "# Assign predictions to the test dataset for the decision tree classifiermodel\n", - "vm_test_ds.assign_predictions(model=vm_model_dt)" - ] - }, - { - "cell_type": "markdown", - "id": "59428da9", - "metadata": {}, - "source": [ - "<a id='toc6_2_1__'></a>\n", - "\n", - "#### Run classifier performance test with multiple models\n", - "\n", - "Now, we'll use the `input_grid` to run the [`ClassifierPerformance` test](https://docs.validmind.ai/tests/model_validation/sklearn/ClassifierPerformance.html) on all four models using the testing dataset (`vm_test_ds`).\n", - "\n", - "When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
We'll append an identifier to signify that this test was run on `all_models` to differentiate this test run from other runs:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2e48ce1e", - "metadata": {}, - "outputs": [], - "source": [ - "perf_comparison_result = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:all_models\",\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds],\n", - " \"model\": [vm_model_xgb, vm_model_rf, vm_model_lr, vm_model_dt],\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "1b76eae0", - "metadata": {}, - "source": [ - "Our output indicates that the XGBoost and random forest classification models provide the strongest overall classification performance, so we'll continue our testing with those two models as input only." - ] - }, - { - "cell_type": "markdown", - "id": "9fcc67b9", - "metadata": {}, - "source": [ - "<a id='toc6_2_2__'></a>\n", - "\n", - "#### Run classifier performance test with multiple parameter values\n", - "\n", - "Next, let's run the classifier performance test with the `param_grid` object, which runs the same test multiple times with different parameter values. 
We'll append an identifier to signify that this test was run with our `parameter_grid` configuration:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d0ad94c9", - "metadata": {}, - "outputs": [], - "source": [ - "parameter_comparison_result = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:parameter_grid\",\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds],\n", - " \"model\": [vm_model_xgb,vm_model_rf]\n", - " },\n", - " param_grid={\n", - " \"average\": [\"macro\", \"micro\"]\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "19e8251b", - "metadata": {}, - "source": [ - "<a id='toc6_2_3__'></a>\n", - "\n", - "#### Run comparison test with multiple datasets\n", - "\n", - "Let's also run the [ROCCurve test](https://docs.validmind.ai/tests/model_validation/sklearn/ROCCurve.html) using `input_grid` to iterate through multiple datasets, which plots the ROC curves for the training (`vm_train_ds`) and test (`vm_test_ds`) datasets side by side — a common scenario when you want to compare the performance of a model on the training and test datasets and visually assess how much performance is lost in the test dataset.\n", - "\n", - "We'll also need to assign predictions to the training dataset for the random forest classifier model, since we didn't do that in our earlier setup:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "96c3b426", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_rf)" - ] - }, - { - "cell_type": "markdown", - "id": "7e07db9d", - "metadata": {}, - "source": [ - "We'll append an identifier to signify that this test was run with our `train_vs_test` dataset comparison configuration:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "4056aa1e", - "metadata": {}, - "outputs": [], - "source": [ - "roc_curve_result = vm.tests.run_test(\n", - " 
\"validmind.model_validation.sklearn.ROCCurve:train_vs_test\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_model_xgb,vm_model_rf],\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "a899fb84", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Work with test results\n", - "\n", - "Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform. When logging individual test results to the platform, you'll need to manually add those results to the desired section of the documentation.\n", - "\n", - "You can do this through the ValidMind Platform interface after logging your test results (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)), or directly via the ValidMind Library when calling `.log()` by providing an optional `section_id`. 
The `section_id` should be a string that matches the title of a section in the documentation template in `snake_case`.\n", - "\n", - "Let's log the results of the classifier performance test (`perf_comparison_result`) and the ROCCurve (`roc_curve_result`) test in the `model_evaluation` section of the documentation — present in the template we previewed in the beginning of this notebook:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e119bf1e", - "metadata": {}, - "outputs": [], - "source": [ - "perf_comparison_result.log(section_id=\"model_evaluation\")\n", - "roc_curve_result.log(section_id=\"model_evaluation\")" - ] - }, - { - "cell_type": "markdown", - "id": "098dba6c", - "metadata": {}, - "source": [ - "Finally, let's head to the model we connected to at the beginning of this notebook and view our inserted test results in the updated documentation (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html)):\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the **3.2. Model Evaluation** section.\n", - "\n", - "4. Confirm that `perf_comparison_result` and `roc_curve_result` display in this section as expected." - ] - }, - { - "cell_type": "markdown", - "id": "a658f908", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "Now that you know how to run comparison tests with the ValidMind Library, you’re ready to take the next step. 
Extend the functionality of `run_test()` with your own custom test functions that can be incorporated into documentation templates just like any default out-of-the-box ValidMind test.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn how to implement custom tests with the ValidMind Library.</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html\" style=\"color: #DE257E;\"><b>Implement comparison tests</b></a> notebook for code examples and usage of key functions.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "407b6c2b", - "metadata": {}, - "source": [ - "<a id='toc8_1__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "id": "82b51b49", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0d35972c", - "metadata": { - "vscode": { - "languageId": "plaintext" + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Run comparison tests\n", + "\n", + "Learn how to use the ValidMind Library to run comparison tests that take any datasets or records (models) as inputs. Identify comparison tests to run, initialize ValidMind dataset and model objects in preparation for passing them to tests, and then run tests — generating outputs automatically logged to your documentation in the ValidMind Platform.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>We recommend that you first complete our introductory notebook on running tests.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.html\" style=\"color: #DE257E;\"><b>Run dataset-based tests</b></a></div>" + ], + "id": "ed8282aa" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you 
begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + " - [Initialize the Python environment](#toc2_4__) \n", + "- [Explore a ValidMind test](#toc3__) \n", + "- [Working with ValidMind datasets](#toc4__) \n", + " - [Prepare the sample dataset](#toc4_1__) \n", + " - [Import the sample dataset](#toc4_1_1__) \n", + " - [Split the dataset](#toc4_1_2__) \n", + " - [Initialize the ValidMind dataset](#toc4_2__) \n", + "- [Working with ValidMind models](#toc5__) \n", + " - [Train a sample model](#toc5_1__) \n", + " - [Initialize the ValidMind model](#toc5_2__) \n", + " - [Assign predictions](#toc5_3__) \n", + "- [Running ValidMind tests](#toc6__) \n", + " - [Run classifier performance test with one model](#toc6_1__) \n", + " - [Run comparison tests](#toc6_2__) \n", + " - [Run classifier performance test with multiple models](#toc6_2_1__) \n", + " - [Run classifier performance test with multiple parameter values](#toc6_2_2__) \n", + " - [Run comparison test with multiple datasets](#toc6_2_3__) \n", + "- [Work with test results](#toc7__) \n", + "- [Next steps](#toc8__) \n", + " - [Discover more learning resources](#toc8_1__) \n", + "- [Upgrade ValidMind](#toc9__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "90ab1b8a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "60aa37b6" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "6dfa3d15" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "8e87dd4d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "64971d85" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ], + "id": "69a40ac3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ], + "id": "ec35c724" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "fc97888f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "b3c0c2f5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook.\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." 
+ ], + "id": "d3e3302f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "679d46b2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "2b6e1fb1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "c51ae01c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ], + "id": "52b68564" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "fd332a9d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ], + "id": "184b8c97" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [], + "id": "8e2127cd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Explore a ValidMind test\n", + "\n", + "Before we run a test, use [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to return information on out-of-the-box tests available in the ValidMind Library.\n", + "\n", + "Let's assume you want to evaluate *classifier performance* for a model. 
Classifier performance measures how well a classification model correctly predicts outcomes, using metrics like [precision, recall, and F1 score](https://en.wikipedia.org/wiki/Precision_and_recall).\n", + "\n", + "We'll pass in a `filter` to the `list_tests` function to find the test ID for classifier performance:" + ], + "id": "c3098355" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(filter=\"ClassifierPerformance\")" + ], + "execution_count": null, + "outputs": [], + "id": "a6a6f715" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We've identified from the output that the test ID for the classifier performance test is `validmind.model_validation.sklearn.ClassifierPerformance`.\n", + "\n", + "Use this ID combined with [the `describe_test()` function](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to retrieve more information about the test, including its **Required Inputs**:" + ], + "id": "d1f08b64" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_id = \"validmind.model_validation.sklearn.ClassifierPerformance\"\n", + "vm.tests.describe_test(test_id)" + ], + "execution_count": null, + "outputs": [], + "id": "f8a46c7d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since this test requires both a dataset object and a model object, you can expect it to throw an error when we run it without passing in either as input:" + ], + "id": "10a49439" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "try:\n", + " vm.tests.run_test(test_id)\n", + "except Exception as e:\n", + " print(e)" + ], + "execution_count": null, + "outputs": [], + "id": "f853c272" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 
5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>" + ], + "id": "da36ba6b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Working with ValidMind datasets" + ], + "id": "40324c13" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Prepare the sample dataset" + ], + "id": "3f28ffe2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1_1__'></a>\n", + "\n", + "#### Import the sample dataset\n", + "\n", + "Since we need a dataset to run tests, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", + "\n", + "In the example below, note that:\n", + "\n", + "- The target column, `Exited`, has a value of `1` when a customer has churned and `0` otherwise.\n", + "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas DataFrame is a two-dimensional tabular data structure that makes use of rows and columns." 
+ ], + "id": "4c45a55c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", + ")\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [], + "id": "3ef2dfbb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1_2__'></a>\n", + "\n", + "#### Split the dataset\n", + "\n", + "Let's first split our dataset to help assess how well the model generalizes to unseen data.\n", + "\n", + "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", + "\n", + "1. **train_df** — Used to train the model.\n", + "2. **validation_df** — Used to evaluate the model's performance during training.\n", + "3. **test_df** — Used later on to assess the model's performance on new, unseen data." + ], + "id": "2fc43d28" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" + ], + "execution_count": null, + "outputs": [], + "id": "88c87d4a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Initialize the ValidMind dataset\n", + "\n", + "The next step is to connect your data with a ValidMind `Dataset` object. **This step is necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "ValidMind dataset objects provide a wrapper to any type of dataset (NumPy, Pandas, Polars, etc.) 
so that tests can run transparently regardless of the underlying library.\n", + "\n", + "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." + ], + "id": "a5d77885" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "bf0ec747" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Working with ValidMind models" + ], + "id": "cbb1a68f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Train a sample model\n", + "\n", + "To train the model, we need to provide it with:\n", + "\n", + "1. **Inputs** — Features such as customer age, usage, etc.\n", + "2. 
**Outputs (Expected answers/labels)** — in our case, we would like to know whether the customer churned or not.\n", + "\n", + "Here, we'll use `x_train` and `x_val` to hold the input data (features), and `y_train` and `y_val` to hold the answers (the target we want to predict):" + ], + "id": "68089f0a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]\n", + "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", + "y_val = validation_df[customer_churn.target_column]" + ], + "execution_count": null, + "outputs": [], + "id": "39e8c7ea" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, let's create an *XGBoost classifier model* that will automatically stop training if it doesn't improve after 10 tries. XGBoost is a gradient-boosted tree ensemble that builds trees sequentially, with each tree correcting the errors of the previous ones — typically known for strong predictive performance and built-in regularization to reduce overfitting.\n", + "\n", + "Setting an explicit threshold avoids wasting time and helps prevent further overfitting by stopping training when further improvement isn't happening. We'll also set three evaluation metrics to get a more complete picture of model performance:\n", + "\n", + "1. **error** — Measures how often the model makes incorrect predictions.\n", + "2. **logloss** — Indicates how confident the predictions are.\n", + "3. **auc** — Evaluates how well the model distinguishes between churn and not churn." 
+ ], + "id": "6d93642b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "255e3583" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, our actual training step — where the model learns patterns from the data, so it can make predictions later:\n", + "\n", + "- The model is trained on `x_train` and `y_train`, and evaluates its performance using `x_val` and `y_val` to check if it’s learning well.\n", + "- To turn off printed output while training, we'll set `verbose` to `False`." + ], + "id": "a021582a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "e3aa3657" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ], + "id": "ed11ea0b" + }, + { + "cell_type": "code", + 
"metadata": {}, + "source": [ + "vm_model_xgb = vm.init_model(\n", + " model,\n", + " input_id=\"xgboost\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "4b2be11f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_3__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", + "\n", + "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", + "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", + "\n", + "If no prediction values are passed, the method will compute predictions automatically:" + ], + "id": "53f12da6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", + "vm_test_ds.assign_predictions(model=vm_model_xgb)" + ], + "execution_count": null, + "outputs": [], + "id": "229185fd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Running ValidMind tests\n", + "\n", + "Now that we know how to initialize ValidMind `dataset` and `model` objects, we're ready to run some tests!\n", + "\n", + "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. For the examples below, we'll pass in the following arguments:\n", + "\n", + "- **`test_id`** — The ID of the test to run, as seen in the `ID` column when you run `list_tests`.\n", + "- **`inputs`** — A dictionary of test inputs, such as `dataset`, `model`, `datasets`, or `models`. 
These are ValidMind objects initialized with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) or [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model)." + ], + "id": "18c1cb2e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Run classifier performance test with one model\n", + "\n", + "Run the `validmind.model_validation.sklearn.ClassifierPerformance` test with the testing dataset (`vm_test_ds`) and model (`vm_model_xgb`) as inputs:" + ], + "id": "6f7e7779" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " \"model\": vm_model_xgb,\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "85189af9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Run comparison tests\n", + "\n", + "To evaluate which models might be a better fit for a use case based on their performance on selected criteria, we can run the same test with multiple models. 
We'll train three additional models and run the classifier performance test for all four models using a single `run_test()` call.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>ValidMind helps streamline your documentation and testing.</b></span>\n", + "<br></br>\n", + "You could call <code>run_test()</code> multiple times passing in different inputs, but you can also pass an <code>input_grid</code> object — a dictionary of test input keys and values that allows you to run a single test for a combination of models and datasets.\n", + "<br></br>\n", + "With <code>input_grid</code>, run comparison tests for multiple datasets, or even multiple datasets and models simultaneously — <code>input_grid</code> can be used with <code>run_test()</code> for all possible combinations of inputs, generating a cohesive and comprehensive single output.\n", + "</div>" + ], + "id": "5e8be8d5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Random forest classifier* models use an ensemble method that builds multiple decision trees and averages their predictions. 
Random forest is robust to overfitting and handles non-linear relationships well, but is typically less interpretable than simpler models:" + ], + "id": "e33c7a82" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.ensemble import RandomForestClassifier\n", + "\n", + "# Train the random forest classifier model\n", + "model_rf = RandomForestClassifier()\n", + "model_rf.fit(x_train, y_train)\n", + "\n", + "# Initialize the ValidMind model object for the random forest classifier model\n", + "vm_model_rf = vm.init_model(\n", + " model_rf,\n", + " input_id=\"random_forest\",\n", + ")\n", + "\n", + "# Assign predictions to the test dataset for the random forest classifier model\n", + "vm_test_ds.assign_predictions(model=vm_model_rf)" + ], + "execution_count": null, + "outputs": [], + "id": "1976b7e8" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Logistic regression* models are linear models that estimate class probabilities via a logistic (sigmoid) function. 
Logistic regression is highly interpretable and fast to train, which makes it a strong baseline; however, it struggles when relationships are non-linear, as real-world relationships often are:" + ], + "id": "f8e167cf" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.linear_model import LogisticRegression\n", + "from sklearn.preprocessing import StandardScaler\n", + "from sklearn.pipeline import Pipeline\n", + "\n", + "# Scaling features helps the lbfgs solver converge reliably\n", + "model_lr = Pipeline([\n", + " (\"scaler\", StandardScaler()),\n", + " (\"lr\", LogisticRegression()),\n", + "])\n", + "model_lr.fit(x_train, y_train)\n", + "\n", + "# Initialize the ValidMind model object for the logistic regression model\n", + "vm_model_lr = vm.init_model(\n", + " model_lr,\n", + " input_id=\"logistic_regression\",\n", + ")\n", + "\n", + "# Assign predictions to the test dataset for the logistic regression model\n", + "vm_test_ds.assign_predictions(model=vm_model_lr)" + ], + "execution_count": null, + "outputs": [], + "id": "90bbf148" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Decision tree classifier* models are a single tree with data split on feature thresholds. 
Useful as an explainability benchmark, decision trees are easy to visualize and interpret, but are prone to overfitting without pruning or ensemble techniques:" + ], + "id": "d3478f86" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.tree import DecisionTreeClassifier\n", + "\n", + "# Train the decision tree classifier model\n", + "model_dt = DecisionTreeClassifier()\n", + "model_dt.fit(x_train, y_train)\n", + "\n", + "# Initialize the ValidMind model object for the decision tree classifier model\n", + "vm_model_dt = vm.init_model(\n", + " model_dt,\n", + " input_id=\"decision_tree\",\n", + ")\n", + "\n", + "# Assign predictions to the test dataset for the decision tree classifier model\n", + "vm_test_ds.assign_predictions(model=vm_model_dt)" + ], + "execution_count": null, + "outputs": [], + "id": "bfa1e17d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2_1__'></a>\n", + "\n", + "#### Run classifier performance test with multiple models\n", + "\n", + "Now, we'll use the `input_grid` to run the [`ClassifierPerformance` test](https://docs.validmind.ai/tests/model_validation/sklearn/ClassifierPerformance.html) on all four models using the testing dataset (`vm_test_ds`).\n", + "\n", + "When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
We'll append an identifier to signify that this test was run on `all_models` to differentiate this test run from other runs:" + ], + "id": "59428da9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "perf_comparison_result = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:all_models\",\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds],\n", + " \"model\": [vm_model_xgb, vm_model_rf, vm_model_lr, vm_model_dt],\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "2e48ce1e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Our output indicates that the XGBoost and random forest classification models provide the strongest overall classification performance, so we'll continue our testing with those two models as input only." + ], + "id": "1b76eae0" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2_2__'></a>\n", + "\n", + "#### Run classifier performance test with multiple parameter values\n", + "\n", + "Next, let's run the classifier performance test with the `param_grid` object, which runs the same test multiple times with different parameter values. 
We'll append an identifier to signify that this test was run with our `parameter_grid` configuration:" + ], + "id": "9fcc67b9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "parameter_comparison_result = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:parameter_grid\",\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds],\n", + " \"model\": [vm_model_xgb,vm_model_rf]\n", + " },\n", + " param_grid={\n", + " \"average\": [\"macro\", \"micro\"]\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "d0ad94c9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2_3__'></a>\n", + "\n", + "#### Run comparison test with multiple datasets\n", + "\n", + "Let's also run the [ROCCurve test](https://docs.validmind.ai/tests/model_validation/sklearn/ROCCurve.html) using `input_grid` to iterate through multiple datasets, which plots the ROC curves for the training (`vm_train_ds`) and test (`vm_test_ds`) datasets side by side — a common scenario when you want to compare the performance of a model on the training and test datasets and visually assess how much performance is lost in the test dataset.\n", + "\n", + "We'll also need to assign predictions to the training dataset for the random forest classifier model, since we didn't do that in our earlier setup:" + ], + "id": "19e8251b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_rf)" + ], + "execution_count": null, + "outputs": [], + "id": "96c3b426" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll append an identifier to signify that this test was run with our `train_vs_test` dataset comparison configuration:" + ], + "id": "7e07db9d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "roc_curve_result = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ROCCurve:train_vs_test\",\n", + " 
input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_model_xgb,vm_model_rf],\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "4056aa1e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Work with test results\n", + "\n", + "Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform. When logging individual test results to the platform, you'll need to manually add those results to the desired section of the documentation.\n", + "\n", + "You can do this through the ValidMind Platform interface after logging your test results (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)), or directly via the ValidMind Library when calling `.log()` by providing an optional `section_id`. 
The `section_id` should be a string that matches the title of a section in the documentation template in `snake_case`.\n", + "\n", + "Let's log the results of the classifier performance test (`perf_comparison_result`) and the ROCCurve (`roc_curve_result`) test in the `model_evaluation` section of the documentation — present in the template we previewed in the beginning of this notebook:" + ], + "id": "a899fb84" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "perf_comparison_result.log(section_id=\"model_evaluation\")\n", + "roc_curve_result.log(section_id=\"model_evaluation\")" + ], + "execution_count": null, + "outputs": [], + "id": "e119bf1e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, let's head to the model we connected to at the beginning of this notebook and view our inserted test results in the updated documentation (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html)):\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the **3.2. Model Evaluation** section.\n", + "\n", + "4. Confirm that `perf_comparison_result` and `roc_curve_result` display in this section as expected." + ], + "id": "098dba6c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "Now that you know how to run comparison tests with the ValidMind Library, you’re ready to take the next step. 
Extend the functionality of `run_test()` with your own custom test functions that can be incorporated into documentation templates just like any default out-of-the-box ValidMind test.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn how to implement custom tests with the ValidMind Library.</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html\" style=\"color: #DE257E;\"><b>Implement custom tests</b></a> notebook for code examples and usage of key functions.</div>" + ], + "id": "a658f908" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_1__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ], + "id": "407b6c2b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "82b51b49" + }, + { + "cell_type": "code", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "0d35972c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "86478a30" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ], + "id": "10073159" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-5fe1b67f8fdc4d26bb090f5e655857bf" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.10" } - }, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "86478a30", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "10073159", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-5fe1b67f8fdc4d26bb090f5e655857bf", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" }, - "language_info": { - "name": "python", - "version": "3.10" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} + "nbformat": 4, + "nbformat_minor": 5 +} \ No newline at end of file diff --git a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb index 6d1a02643..210daf5df 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb @@ -1,665 +1,669 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "adbd775e", - "metadata": {}, - "source": [ - "# Enable PII detection in tests" - ] - }, - { - "cell_type": "markdown", - "id": "6014f87e", - "metadata": {}, - "source": [ - "Learn how to enable and configure Personally Identifiable Information (PII) detection when running tests with the ValidMind Library. Choose whether or not to include PII in test descriptions generated, or whether or not to include PII in test results logged to the ValidMind Platform." 
- ] - }, - { - "cell_type": "markdown", - "id": "b92af62b", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library with PII detection](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Get your code snippet](#toc2_2_1__) \n", - "- [Create a custom test that outputs PII](#toc3__) \n", - "- [Running tests under different PII detection modes](#toc4__) \n", - " - [disabled](#toc4_1__) \n", - " - [test_results](#toc4_2__) \n", - " - [test_descriptions](#toc4_3__) \n", - " - [all](#toc4_4__) \n", - "- [Overriding detection](#toc5__) \n", - " - [Override test result logging](#toc5_1__) \n", - " - [Override test descriptions and test result logging](#toc5_2__) \n", - "- [Review logged test results](#toc6__) \n", - "- [Troubleshooting](#toc7__) \n", - "- [Learn more](#toc8__) \n", - "- [Upgrade ValidMind](#toc9__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "570a178e", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "df929220", - "metadata": {}, - "source": [ - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "f626d8bd", - "metadata": {}, - "source": [ - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "deb8fd73", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "32293a17", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "6e23f9b2", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library with PII detection\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To use PII detection powered by [Microsoft Presidio](https://microsoft.github.io/presidio/), install the library with the explicit `[pii-detection]` extra specifier:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b830ae91", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q \"validmind[pii-detection]\"" - ] - }, - { - "cell_type": "markdown", - "id": "fa8a1a7d", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library\n", - "\n", - "ValidMind generates a unique _code snippet_ for each registered model to connect with your 
developer environment. You initialize the ValidMind Library with this code snippet, which ensures that your documentation and tests are uploaded to the correct model when you run the notebook." - ] - }, - { - "cell_type": "markdown", - "id": "3a467dc2", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "eeda4c8c", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "82638dab", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Create a custom test that outputs PII\n", - "\n", - "To demonstrate the feature, we'll need a test that outputs PII. 
First we'll create a custom test that returns:\n", - "\n", - "- A description string containing PII (name, email, phone)\n", - "- A small table containing PII in columns\n", - "\n", - "This output mirrors the structure used in other custom test notebooks and will exercise both table and description PII detection paths. However, if structured detection is unavailable, the library falls back to token-level text scans when possible." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "04d8c802", - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "from validmind import test\n", - "\n", - "@test(\"pii_demo.PIIDetection\")\n", - "def pii_custom_test():\n", - " \"\"\"A custom test that returns demo PII.\n", - " This default test description will display when PII is not sent to the LLM to generate test descriptions based on test result data.\"\"\"\n", - " return pd.DataFrame(\n", - " {\n", - " \"name\": [\"Jane Smith\", \"John Doe\", \"Alice Johnson\"],\n", - " \"email\": [\n", - " \"jane.smith@bank.example\",\n", - " \"john.doe@company.example\",\n", - " \"alice.johnson@service.example\",\n", - " ],\n", - " \"phone\": [\"(212) 555-9876\", \"(415) 555-1234\", \"(646) 555-5678\"],\n", - " }\n", - " )" - ] - }, - { - "cell_type": "markdown", - "id": "96878fab", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about custom tests?</b></span>\n", - "<br></br>\n", - "Check out our extended introduction to custom tests — <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html\" style=\"color: #DE257E;\"><b>Implement custom tests</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "0faaceb5", - "metadata": {}, - "source": [ - 
"<a id='toc4__'></a>\n", - "\n", - "## Running tests under different PII detection modes\n", - "\n", - "Next, let's import [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module to run our custom test via a function called `run_pii_test()` that catches exceptions to observe blocking behavior when PII is present:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b42288e5", - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "from validmind.tests import run_test\n", - "\n", - "# Run test and tag result with unique `result_id`\n", - "def run_pii_test(result_id=\"\"):\n", - " try:\n", - " test_name = f\"pii_demo.PIIDetection:{result_id}\"\n", - " result = run_test(test_name)\n", - "\n", - " # Check if the test description was generated by LLM\n", - " if not result._was_description_generated:\n", - " print(\"PII detected: LLM-generated test description skipped\")\n", - " else:\n", - " print(\"No PII detected or detection disabled: Test description generated by LLM\")\n", - "\n", - " # Try logging test results to the ValidMind Platform\n", - " result.log()\n", - " print(\"No PII detected or detection disabled: Test results logged to the ValidMind Platform\")\n", - " except Exception as e:\n", - " print(\"PII detected: Test results not logged to the ValidMind Platform\")" - ] - }, - { - "cell_type": "markdown", - "id": "9a6e3398", - "metadata": {}, - "source": [ - "We'll then switch the `VALIDMIND_PII_DETECTION` environment variable across modes in the below examples.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note that since we are running a custom test that does not exist in your model's default documentation template, we'll receive output 
indicating that a test-driven block doesn't currently exist in your model's documentation for that particular test ID.</b></span>\n", - "<br></br>\n", - "That's expected, as when we run custom tests the results logged need to be manually added to your documentation within the ValidMind Platform or added to your documentation template.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "9801463d", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### disabled\n", - "\n", - "When detection is set to `disabled`, tests run and generate test descriptions. Logging tests with [`.log()`](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) will also send test descriptions and test results to the ValidMind Platform as usual:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3078af64", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: disabled ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"disabled\"\n", - "\n", - "# Run test and tag result with unique ID `disabled`\n", - "run_pii_test(\"disabled\")" - ] - }, - { - "cell_type": "markdown", - "id": "89de78cc", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### test_results\n", - "\n", - "When detection is set for `test_results`, tests run and generate test descriptions for review in your environment, but logging tests will not send descriptions or test results to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "12e61a80", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: test_results ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"test_results\"\n", - "\n", - "# Run test and tag result with unique ID `results_blocked`\n", - "run_pii_test(\"results_blocked\")" - ] - }, - { - "cell_type": "markdown", - "id": "8fbe427e", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### test_descriptions\n", - 
"\n", - "When detection is set for `test_descriptions`, tests run but will not generate test descriptions, and logging tests will not send descriptions but will send test results to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "feba6207", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: test_descriptions ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"test_descriptions\"\n", - "\n", - "# Run test and tag result with unique ID `desc_blocked`\n", - "run_pii_test(\"desc_blocked\")" - ] - }, - { - "cell_type": "markdown", - "id": "0e8950d1", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### all\n", - "\n", - "When detection is set to `all`, tests run will not generate test descriptions or log test results to the ValidMind Platform." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "af5040b5", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: all ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"all\"\n", - "\n", - "# Run test and tag result with unique ID `all_blocked`\n", - "run_pii_test(\"all_blocked\")" - ] - }, - { - "cell_type": "markdown", - "id": "67240344", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Overriding detection\n", - "\n", - "You can override blocking by passing `unsafe=True` to `result.log(unsafe=True)`, but this is not recommended outside controlled workflows.\n", - "\n", - "To demonstrate, let's rerun our custom test with some override scenarios." 
- ] - }, - { - "cell_type": "markdown", - "id": "be0510b9", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Override test result logging\n", - "\n", - "First, let's rerun our custom test with detection set to `all`, which will send the test results but not the test descriptions to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0387be21", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: all & unsafe=True ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"all\"\n", - "\n", - "# Run test and tag result with unique ID `override_results`\n", - "try:\n", - " result = run_test(\"pii_demo.PIIDetection:override_results\")\n", - "\n", - " # Check if the test description was generated by LLM\n", - " if not result._was_description_generated:\n", - " print(\"PII detected: LLM-generated test description skipped\")\n", - " else:\n", - " print(\"No PII detected or detection disabled: Test description generated by LLM\")\n", - "\n", - " # Try logging test results to the ValidMind Platform\n", - " result.log(unsafe=True)\n", - " print(\"No PII detected, detection disabled, or override set: Test results logged to the ValidMind Platform\")\n", - "except Exception as e:\n", - " print(\"PII detected: Test results not logged to the ValidMind Platform\")" - ] - }, - { - "cell_type": "markdown", - "id": "4e65af32", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Override test descriptions and test result logging\n", - "\n", - "To send both the test descriptions and test results via override, set the `VALIDMIND_PII_DETECTION` environment variable to `test_results` while including the override flag:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b40a2670", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: test_results & unsafe=True ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"test_results\"\n", - 
"\n", - "# Run test and tag result with unique ID `override_both`\n", - "try:\n", - " result = run_test(\"pii_demo.PIIDetection:override_both\")\n", - "\n", - " # Check if the test description was generated by LLM\n", - " if not result._was_description_generated:\n", - " print(\"PII detected: LLM-generated test description skipped\")\n", - " else:\n", - " print(\"No PII detected, detection disabled, or override set: Test description generated by LLM\")\n", - "\n", - " # Try logging test results to the ValidMind Platform\n", - " result.log(unsafe=True)\n", - " print(\"No PII detected, detection disabled, or override set: Test results logged to the ValidMind Platform\")\n", - "except Exception as e:\n", - " print(\"PII detected: Test results not logged to the ValidMind Platform\")" - ] - }, - { - "cell_type": "markdown", - "id": "84d6ed78", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Review logged test results\n", - "\n", - "Now let's take a look at the results that were logged to the ValidMind Platform:\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier.\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Click on any section heading to expand that section to add a new test-driven block. (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html))\n", - "\n", - "4. Under TEST-DRIVEN in the sidebar, click **Custom**.\n", - "\n", - "5. 
Confirm that you're able to insert the following logged results:\n", - "\n", - " - `pii_demo.PIIDetection:disabled`\n", - " - `pii_demo.PIIDetection:desc_blocked`\n", - " - `pii_demo.PIIDetection:override_results`\n", - " - `pii_demo.PIIDetection:override_both`" - ] - }, - { - "cell_type": "markdown", - "id": "faaa950f", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Troubleshooting\n", - "\n", - "- [x] If you see warnings that Presidio or Presidio analyzer is unavailable, ensure you installed extras: `validmind[pii-detection]`.\n", - "- [x] Ensure your environment is restarted after installing new packages if imports fail." - ] - }, - { - "cell_type": "markdown", - "id": "59c93159", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Learn more\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "id": "8eba96a6", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dffb39a5", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "dbce28c3", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "6225eab3", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-0bc871eca4814e78b16e692e1f2b3209", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "name": "python", - "version": "3.10" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Enable PII detection in tests" + ], + "id": "adbd775e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Learn how to enable and configure Personally Identifiable Information (PII) detection when running tests with the ValidMind Library. Choose whether or not to include PII in test descriptions generated, or whether or not to include PII in test results logged to the ValidMind Platform." + ], + "id": "6014f87e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library with PII detection](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Get your code snippet](#toc2_2_1__) \n", + "- [Create a custom test that outputs PII](#toc3__) \n", + "- [Running tests under different PII detection modes](#toc4__) \n", + " - [disabled](#toc4_1__) \n", + " - [test_results](#toc4_2__) \n", + " - [test_descriptions](#toc4_3__) \n", + " - [all](#toc4_4__) \n", + "- [Overriding detection](#toc5__) \n", + " - [Override test result logging](#toc5_1__) \n", + " - [Override test descriptions and test result logging](#toc5_2__) \n", + "- [Review logged test results](#toc6__) \n", + "- [Troubleshooting](#toc7__) \n", + "- 
[Learn more](#toc8__) \n", + "- [Upgrade ValidMind](#toc9__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "b92af62b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "570a178e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
+ ], + "id": "df929220" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "f626d8bd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics.
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "deb8fd73" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ], + "id": "32293a17" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library with PII detection\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To use PII detection powered by [Microsoft Presidio](https://microsoft.github.io/presidio/), install the library with the explicit `[pii-detection]` extra specifier:" + ], + "id": "6e23f9b2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q \"validmind[pii-detection]\"" + ], + "execution_count": null, + "outputs": [], + "id": "b830ae91" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library\n", + "\n", + "ValidMind generates a unique _code snippet_ for each registered model to connect with your developer environment. You initialize the ValidMind Library with this code snippet, which ensures that your documentation and tests are uploaded to the correct model when you run the notebook." + ], + "id": "fa8a1a7d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. 
On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "3a467dc2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "eeda4c8c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Create a custom test that outputs PII\n", + "\n", + "To demonstrate the feature, we'll need a test that outputs PII. First we'll create a custom test that returns:\n", + "\n", + "- A description string containing PII (name, email, phone)\n", + "- A small table containing PII in columns\n", + "\n", + "This output mirrors the structure used in other custom test notebooks and will exercise both table and description PII detection paths. However, if structured detection is unavailable, the library falls back to token-level text scans when possible." 
+ ], + "id": "82638dab" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "from validmind import test\n", + "\n", + "@test(\"pii_demo.PIIDetection\")\n", + "def pii_custom_test():\n", + " \"\"\"A custom test that returns demo PII.\n", + " This default test description will display when PII is not sent to the LLM to generate test descriptions based on test result data.\"\"\"\n", + " return pd.DataFrame(\n", + " {\n", + " \"name\": [\"Jane Smith\", \"John Doe\", \"Alice Johnson\"],\n", + " \"email\": [\n", + " \"jane.smith@bank.example\",\n", + " \"john.doe@company.example\",\n", + " \"alice.johnson@service.example\",\n", + " ],\n", + " \"phone\": [\"(212) 555-9876\", \"(415) 555-1234\", \"(646) 555-5678\"],\n", + " }\n", + " )" + ], + "execution_count": null, + "outputs": [], + "id": "04d8c802" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about custom tests?</b></span>\n", + "<br></br>\n", + "Check out our extended introduction to custom tests — <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html\" style=\"color: #DE257E;\"><b>Implement custom tests</b></a></div>" + ], + "id": "96878fab" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Running tests under different PII detection modes\n", + "\n", + "Next, let's import [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module to run our custom test via a function called `run_pii_test()` that catches exceptions to observe blocking behavior when PII is present:" + ], + "id": "0faaceb5" + }, + { + 
"cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "from validmind.tests import run_test\n", + "\n", + "# Run test and tag result with unique `result_id`\n", + "def run_pii_test(result_id=\"\"):\n", + " try:\n", + " test_name = f\"pii_demo.PIIDetection:{result_id}\"\n", + " result = run_test(test_name)\n", + "\n", + " # Check if the test description was generated by LLM\n", + " if not result._was_description_generated:\n", + " print(\"PII detected: LLM-generated test description skipped\")\n", + " else:\n", + " print(\"No PII detected or detection disabled: Test description generated by LLM\")\n", + "\n", + " # Try logging test results to the ValidMind Platform\n", + " result.log()\n", + " print(\"No PII detected or detection disabled: Test results logged to the ValidMind Platform\")\n", + " except Exception as e:\n", + " print(\"PII detected: Test results not logged to the ValidMind Platform\")" + ], + "execution_count": null, + "outputs": [], + "id": "b42288e5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll then switch the `VALIDMIND_PII_DETECTION` environment variable across modes in the below examples.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note that since we are running a custom test that does not exist in your model's default documentation template, we'll receive output indicating that a test-driven block doesn't currently exist in your model's documentation for that particular test ID.</b></span>\n", + "<br></br>\n", + "That's expected, as when we run custom tests the results logged need to be manually added to your documentation within the ValidMind Platform or added to your documentation template.</div>" + ], + "id": "9a6e3398" + }, + { + "cell_type": "markdown", + "metadata": {}, + 
"source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### disabled\n", + "\n", + "When detection is set to `disabled`, tests run and generate test descriptions. Logging tests with [`.log()`](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) will also send test descriptions and test results to the ValidMind Platform as usual:" + ], + "id": "9801463d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"\\n=== Mode: disabled ===\")\n", + "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"disabled\"\n", + "\n", + "# Run test and tag result with unique ID `disabled`\n", + "run_pii_test(\"disabled\")" + ], + "execution_count": null, + "outputs": [], + "id": "3078af64" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### test_results\n", + "\n", + "When detection is set for `test_results`, tests run and generate test descriptions for review in your environment, but logging tests will not send descriptions or test results to the ValidMind Platform:" + ], + "id": "89de78cc" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"\\n=== Mode: test_results ===\")\n", + "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"test_results\"\n", + "\n", + "# Run test and tag result with unique ID `results_blocked`\n", + "run_pii_test(\"results_blocked\")" + ], + "execution_count": null, + "outputs": [], + "id": "12e61a80" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### test_descriptions\n", + "\n", + "When detection is set for `test_descriptions`, tests run but will not generate test descriptions, and logging tests will not send descriptions but will send test results to the ValidMind Platform:" + ], + "id": "8fbe427e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"\\n=== Mode: test_descriptions ===\")\n", + "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"test_descriptions\"\n", + 
"\n", + "# Run test and tag result with unique ID `desc_blocked`\n", + "run_pii_test(\"desc_blocked\")" + ], + "execution_count": null, + "outputs": [], + "id": "feba6207" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### all\n", + "\n", + "When detection is set to `all`, tests run will not generate test descriptions or log test results to the ValidMind Platform." + ], + "id": "0e8950d1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"\\n=== Mode: all ===\")\n", + "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"all\"\n", + "\n", + "# Run test and tag result with unique ID `all_blocked`\n", + "run_pii_test(\"all_blocked\")" + ], + "execution_count": null, + "outputs": [], + "id": "af5040b5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Overriding detection\n", + "\n", + "You can override blocking by passing `unsafe=True` to `result.log(unsafe=True)`, but this is not recommended outside controlled workflows.\n", + "\n", + "To demonstrate, let's rerun our custom test with some override scenarios." 
+ ], + "id": "67240344" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Override test result logging\n", + "\n", + "First, let's rerun our custom test with detection set to `all` and the override flag `unsafe=True` passed to `result.log()`, which will send the test results but not the test descriptions to the ValidMind Platform:" + ], + "id": "be0510b9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"\\n=== Mode: all & unsafe=True ===\")\n", + "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"all\"\n", + "\n", + "# Run test and tag result with unique ID `override_results`\n", + "try:\n", + " result = run_test(\"pii_demo.PIIDetection:override_results\")\n", + "\n", + " # Check if the test description was generated by LLM\n", + " if not result._was_description_generated:\n", + " print(\"PII detected: LLM-generated test description skipped\")\n", + " else:\n", + " print(\"No PII detected or detection disabled: Test description generated by LLM\")\n", + "\n", + " # Try logging test results to the ValidMind Platform\n", + " result.log(unsafe=True)\n", + " print(\"No PII detected, detection disabled, or override set: Test results logged to the ValidMind Platform\")\n", + "except Exception as e:\n", + " print(\"PII detected: Test results not logged to the ValidMind Platform\")" + ], + "execution_count": null, + "outputs": [], + "id": "0387be21" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Override test descriptions and test result logging\n", + "\n", + "To send both the test descriptions and test results via override, set the `VALIDMIND_PII_DETECTION` environment variable to `test_results` while including the override flag:" + ], + "id": "4e65af32" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"\\n=== Mode: test_results & unsafe=True ===\")\n", + "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"test_results\"\n", + "\n", + "# Run test and tag result with 
unique ID `override_both`\n", + "try:\n", + " result = run_test(\"pii_demo.PIIDetection:override_both\")\n", + "\n", + " # Check if the test description was generated by LLM\n", + " if not result._was_description_generated:\n", + " print(\"PII detected: LLM-generated test description skipped\")\n", + " else:\n", + " print(\"No PII detected, detection disabled, or override set: Test description generated by LLM\")\n", + "\n", + " # Try logging test results to the ValidMind Platform\n", + " result.log(unsafe=True)\n", + " print(\"No PII detected, detection disabled, or override set: Test results logged to the ValidMind Platform\")\n", + "except Exception as e:\n", + " print(\"PII detected: Test results not logged to the ValidMind Platform\")" + ], + "execution_count": null, + "outputs": [], + "id": "b40a2670" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Review logged test results\n", + "\n", + "Now let's take a look at the results that were logged to the ValidMind Platform:\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier.\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Click on any section heading to expand that section to add a new test-driven block. (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html))\n", + "\n", + "4. Under TEST-DRIVEN in the sidebar, click **Custom**.\n", + "\n", + "5. 
Confirm that you're able to insert the following logged results:\n", + "\n", + " - `pii_demo.PIIDetection:disabled`\n", + " - `pii_demo.PIIDetection:desc_blocked`\n", + " - `pii_demo.PIIDetection:override_results`\n", + " - `pii_demo.PIIDetection:override_both`" + ], + "id": "84d6ed78" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Troubleshooting\n", + "\n", + "- [x] If you see warnings that Presidio or Presidio analyzer is unavailable, ensure you installed extras: `validmind[pii-detection]`.\n", + "- [x] Ensure your environment is restarted after installing new packages if imports fail." + ], + "id": "faaa950f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Learn more\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ], + "id": "59c93159" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "8eba96a6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "dffb39a5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "dbce28c3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade for the changes to be applied." + ], + "id": "6225eab3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-0bc871eca4814e78b16e692e1f2b3209" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.10" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} \ No newline at end of file diff --git a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb index 9d4c0d72e..d6d1fa2da 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb @@ -1,582 +1,586 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Run tests with multiple datasets\n", - "\n", - "To support running tests that require more than one dataset, ValidMind provides a mechanim that allows you to pass multiple datasets as inputs.\n", - "\n", - "<!--- TO DO Check that this explanation is accurate --->\n", - "To ensure a model generalizes well to new, unseen data, it's common to use separate datasets for training, validation, and testing, with each set serving to check the model's performance at different stages of development. 
Additionally, since models often encounter data from various sources that might differ in distribution, quality, or type, using multiple datasets in testing can simulate this diversity and better prepare the model for deployment.\n", - "\n", - "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset and train a model for testing, initialize ValidMind objects, and run a test that requires multiple datasets." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Load the sample dataset](#toc3__) \n", - "- [Prepocess the raw dataset](#toc4__) \n", - "- [Train models for testing](#toc5__) \n", - "- [Initialize ValidMind objects](#toc6__) \n", - " - [Initialize the ValidMind model](#toc6_1__) \n", - " - [Initialize the ValidMind datasets](#toc6_2__) \n", - "- [Run a test that requires multiple datasets](#toc7__) \n", - " - [Run predictions and link with the model](#toc7_1__) \n", - " - [Run test](#toc7_2__) \n", - "- [Next steps](#toc8__) \n", - " - [Work with your model documentation](#toc8_1__) \n", - " - [Discover more learning resources](#toc8_2__) \n", - "- [Upgrade ValidMind](#toc9__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON 
TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. 
To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", - ")\n", - "\n", - "raw_df = demo_dataset.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Prepocess the raw dataset\n", - "\n", - "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", - "\n", - "- Preprocess the data: Splits the DataFrame (`df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess` to simplify preprocessing.\n", - "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)\n", - "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", - "y_val = validation_df[demo_dataset.target_column]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Train models for testing\n", - "\n", - "Initialize XGBoost and Logistic Regression Classifiers" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.linear_model import LogisticRegression\n", - "import xgboost\n", - "\n", - "%matplotlib inline\n", - "\n", - "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", - "xgb.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "xgb.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Initialize ValidMind objects\n", - "\n", - "<a id='toc6_1__'></a>\n", - "\n", - "### Initialize the ValidMind model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model_xgb = vm.init_model(\n", - " xgb,\n", - " input_id=\"xgb\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of 
arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", - "- `class_labels` — an optional value to map predicted classes to class labels\n", - "\n", - "With all datasets ready, you can now initialize the raw, training and test datasets (`raw_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_dataset\",\n", - " dataset=train_df,\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Run a test that requires multiple datasets\n", - "\n", - "We are going to show the following in next two blocks:\n", - "\n", - "- Assign predictions for `vm_train_ds` and `vm_test_ds`\n", - "- Run `RobustnessDiagnosis` which is one example test that takes two input datasets" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Run predictions and link with the model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", - "vm_test_ds.assign_predictions(model=vm_model_xgb)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7_2__'></a>\n", - 
"\n", - "### Run test" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", - " inputs={\"datasets\": (vm_train_ds, vm_test_ds), \"model\": vm_model_xgb},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc8_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc8_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade 
package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-72af338f140e4a4bad5cb3954201d23e", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "gpuClass": "standard", - "kernelspec": { - "display_name": ".venv", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Run tests with multiple datasets\n", + "\n", + "To support running tests that require more than one dataset, ValidMind provides a mechanism that allows you to pass multiple datasets as inputs.\n", + "\n", + "<!--- TO DO Check that this explanation is accurate --->\n", + "To ensure a model generalizes well to new, unseen data, it's common to use separate datasets for training, validation, and testing, with each set serving to check the model's performance at different stages of development.
Additionally, since models often encounter data from various sources that might differ in distribution, quality, or type, using multiple datasets in testing can simulate this diversity and better prepare the model for deployment.\n", + "\n", + "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset and train a model for testing, initialize ValidMind objects, and run a test that requires multiple datasets." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Load the sample dataset](#toc3__) \n", + "- [Preprocess the raw dataset](#toc4__) \n", + "- [Train models for testing](#toc5__) \n", + "- [Initialize ValidMind objects](#toc6__) \n", + " - [Initialize the ValidMind model](#toc6_1__) \n", + " - [Initialize the ValidMind datasets](#toc6_2__) \n", + "- [Run a test that requires multiple datasets](#toc7__) \n", + " - [Run predictions and link with the model](#toc7_1__) \n", + " - [Run test](#toc7_2__) \n", + "- [Next steps](#toc8__) \n", + " - [Work with your documentation](#toc8_1__) \n", + " - [Discover more learning resources](#toc8_2__) \n", + "- [Upgrade ValidMind](#toc9__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON
TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses.
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds.
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind library. 
To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", + ")\n", + "\n", + "raw_df = demo_dataset.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Preprocess the raw dataset\n", + "\n", + "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", + "\n", + "- Preprocess the data: Splits the raw DataFrame (`raw_df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess` to simplify preprocessing.\n", + "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`)."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)\n", + "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", + "y_val = validation_df[demo_dataset.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Train models for testing\n", + "\n", + "Train an XGBoost classifier, using the validation set (`x_val`, `y_val`) for early stopping:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.linear_model import LogisticRegression\n", + "import xgboost\n", + "\n", + "%matplotlib inline\n", + "\n", + "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", + "xgb.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "xgb.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Initialize ValidMind objects\n", + "\n", + "<a id='toc6_1__'></a>\n", + "\n", + "### Initialize the ValidMind model" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model_xgb = vm.init_model(\n", + " xgb,\n", + " input_id=\"xgb\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of
arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", + "- `class_labels` — an optional value to map predicted classes to class labels\n", + "\n", + "With all datasets ready, you can now initialize the training and test datasets (`train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_dataset\",\n", + " dataset=train_df,\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Run a test that requires multiple datasets\n", + "\n", + "The next two blocks show how to:\n", + "\n", + "- Assign predictions for `vm_train_ds` and `vm_test_ds`\n", + "- Run `RobustnessDiagnosis`, an example of a test that takes two datasets as input" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Run predictions and link with the model" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", + "vm_test_ds.assign_predictions(model=vm_model_xgb)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_2__'></a>\n", +
"\n", + "### Run test" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", + " inputs={\"datasets\": (vm_train_ds, vm_test_ds), \"model\": vm_model_xgb},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc8_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc8_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade 
package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-72af338f140e4a4bad5cb3954201d23e" + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb index 93092a7a1..6cd3967a7 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb @@ -1,633 +1,639 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document multiple results for the same test\n", - "\n", - "Documentation templates facilitate the presentation of multiple unique test results for a single test. 
\n", - "\n", - "Consider various scenarios where you may intend to showcase results of the same test with diverse inputs:\n", - "\n", - "- **Comparing test results with varied parameter values:** Illustrate model performance by contrasting test results achieved with different parameter values to identify optimal settings.\n", - "- **Displaying test results with distinct datasets:** Showcase test versatility by presenting results on diverse datasets, such as providing confusion matrices for both training and test data.\n", - "- **Model comparison:** Conduct a comprehensive model evaluation by comparing tests like `ROC curve` and `Accuracy` to discern and select the superior-performing model.\n", - "\n", - "This interactive notebook guides you through the process of documenting a model with the ValidMind Library. It uses the [Bank Customer Churn Prediction](https://www.kaggle.com/code/kmalit/bank-customer-churn-prediction/data) sample dataset from Kaggle to train a simple classification model. 
As part of the notebook, you will learn how to render more than one unique test result for the same test while exploring how the documentation process works:\n", - "\n", - "- Initializing the ValidMind Library\n", - "- Loading a sample dataset provided by the library to train a simple classification model\n", - "- Running a ValidMind test suite to quickly generate documentation about the data and model" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - "- [Update the customer churn demo template](#toc3__) \n", - "- [Initialize the Python environment](#toc4__) \n", - " - [Preview the documentation template](#toc4_1__) \n", - "- [Load the sample dataset](#toc5__) \n", - " - [Initialize a ValidMind dataset object](#toc5_1__) \n", - "- [Document the model](#toc6__) \n", - " - [Prepare datasets](#toc6_1__) \n", - " - [Initialize the training and test datasets](#toc6_2__) \n", - " - [Run documentation tests](#toc6_3__) \n", - " - [Run the individual tests using the `run_test`](#toc6_4__) \n", - "- [Next steps](#toc7__) \n", - " - [Work with your model documentation](#toc7_1__) \n", - " - [Discover more learning resources](#toc7_2__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. 
These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Update the customer churn demo template\n", - "\n", - "Before you initialize the ValidMind Library by running the notebook, edit the **Binary classification** template to make a copy of a test of interest and update it with different `result_id` fields for each entry:\n", - "\n", - "- Go to **Settings > Templates** and click on the **Binary classification** template. Let's say we want to show `Skewness` results for `training` and `test` datasets.\n", - "\n", - "To do this we replace\n", - "\n", - "```yaml\n", - "- content_type: test\n", - " content_id: validmind.data_validation.Skewness\n", - "```\n", - "\n", - "with\n", - "\n", - "```yaml\n", - "- content_type: test\n", - " content_id: validmind.data_validation.Skewness:training_data\n", - "- content_type: test\n", - " content_id: validmind.data_validation.Skewness:test_data\n", - "```\n", - "\n", - "This way, we can show two results of the same test in the model document. Here, the `training_data` and `test_data` could be any string. 
However, they should be unique for the same test.\n", - "\n", - "- Click on **Prepare new version**, provide some version notes and click on **Save new version** to save a new version of this template.\n", - "- Next, we need to swap our model documentation to use this new version of the template. Follow the steps on [Manage document templates](https://docs.validmind.ai/guide/templates/manage-document-templates.html) to swap the template of our customer churn model.\n", - "\n", - "In the following sections we provide more context on how these `content_id` fields mentioned earlier get mapped to the actual tests." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "import xgboost as xgb\n", - "\n", - "from sklearn.metrics import accuracy_score\n", - "from sklearn.model_selection import train_test_split\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library, along with a second, different dataset (`taiwan_credit`) you can try as well.\n", - "\n", - "To be able to use either sample dataset, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "df = demo_dataset.load_data()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Initialize a ValidMind dataset object\n", - "\n", - "Before you can run a test suite, which are a collection of tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to analyze\n", - "- `target_column` — the name of the target column in the dataset\n", - "- `class_labels` — the list of class labels used for classification model training" - ] - }, - { - "cell_type": 
"code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_dataset = vm.init_dataset(\n", - " input_id=\"raw_dataset\",\n", - " dataset=df,\n", - " target_column=demo_dataset.target_column,\n", - " class_labels=demo_dataset.class_labels,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Prepare datasets\n", - "\n", - "DataFrame (df) preprocessing is simplified by employing `demo_dataset.preprocess` to partition it into distinct datasets (`train_df`, `validation_df`, and `test_df`)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Initialize the training and test datasets\n", - "\n", - "With the datasets ready, you can now initialize the training and test datasets (`train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_dataset\", dataset=train_df, target_column=demo_dataset.target_column\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", - ")" - ] - }, - { - 
"cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Run documentation tests\n", - "\n", - "Now specify `inputs` and `params` for individual tests using `config` parameter. The results for the both the datasets will be visible in the documentation. The `inputs` in the config get priority over global `inputs` in the `run_documentation_tests`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "config = {\n", - " \"validmind.data_validation.Skewness:training_data\": {\n", - " \"params\": {\"max_threshold\": 1},\n", - " \"inputs\": {\"dataset\": vm_train_ds},\n", - " },\n", - " \"validmind.data_validation.Skewness:test_data\": {\n", - " \"params\": {\"max_threshold\": 1.5},\n", - " \"inputs\": {\"dataset\": vm_test_ds},\n", - " },\n", - "}\n", - "\n", - "tests_suite = vm.run_documentation_tests(\n", - " inputs={\n", - " \"dataset\": vm_dataset,\n", - " },\n", - " config=config,\n", - " section=[\"data_preparation\"],\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4__'></a>\n", - "\n", - "### Run the individual tests using the `run_test`\n", - "\n", - "Now run the `Skewness` tests for training and test datasets. The results for the both the datasets will be visible in the documentation." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.Skewness:training_data\",\n", - " params={\"max_threshold\": 1},\n", - " inputs={\"dataset\": vm_train_ds},\n", - ")\n", - "test.log()\n", - "\n", - "test = vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.Skewness:test_data\",\n", - " params={\"max_threshold\": 1.5},\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the **2. Data Preparation** section and take a look around.\n", - "\n", - " You can now see the skewness tests results of training and test datasets in the `Data Preparation` section.\n", - "\n", - "From here, you can also make qualitative edits to model documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc7_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade 
package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-6ce412276b6244aab16b2e3443c6a861", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "gpuClass": "standard", - "kernelspec": { - "display_name": "validmind-1QuffXMV-py3.9", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document multiple results for the same test\n", + "\n", + "Documentation templates facilitate the presentation of multiple unique test results for a single test. 
\n", + "\n", + "Consider various scenarios where you may intend to showcase results of the same test with diverse inputs:\n", + "\n", + "- **Comparing test results with varied parameter values:** Illustrate model performance by contrasting test results achieved with different parameter values to identify optimal settings.\n", + "- **Displaying test results with distinct datasets:** Showcase test versatility by presenting results on diverse datasets, such as providing confusion matrices for both training and test data.\n", + "- **Model comparison:** Conduct a comprehensive model evaluation by comparing tests like `ROC curve` and `Accuracy` to discern and select the superior-performing model.\n", + "\n", + "This interactive notebook guides you through the process of documenting a model with the ValidMind Library. It uses the [Bank Customer Churn Prediction](https://www.kaggle.com/code/kmalit/bank-customer-churn-prediction/data) sample dataset from Kaggle to train a simple classification model. 
As part of the notebook, you will learn how to render more than one unique test result for the same test while exploring how the documentation process works:\n", + "\n", + "- Initializing the ValidMind Library\n", + "- Loading a sample dataset provided by the library to train a simple classification model\n", + "- Running a ValidMind test suite to quickly generate documentation about the data and model" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + "- [Update the customer churn demo template](#toc3__) \n", + "- [Initialize the Python environment](#toc4__) \n", + " - [Preview the documentation template](#toc4_1__) \n", + "- [Load the sample dataset](#toc5__) \n", + " - [Initialize a ValidMind dataset object](#toc5_1__) \n", + "- [Document the model](#toc6__) \n", + " - [Prepare datasets](#toc6_1__) \n", + " - [Initialize the training and test datasets](#toc6_2__) \n", + " - [Run documentation tests](#toc6_3__) \n", + " - [Run the individual tests using the `run_test`](#toc6_4__) \n", + "- [Next steps](#toc7__) \n", + " - [Work with your model documentation](#toc7_1__) \n", + " - [Discover more learning resources](#toc7_2__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Update the customer churn demo template\n", + "\n", + "Before you initialize the ValidMind Library by running the notebook, edit the **Binary classification** template to make a copy of a test of interest and update it with different `result_id` fields for each entry:\n", + "\n", + "- Go to **Settings > Templates** and click on the **Binary classification** template. Let's say we want to show `Skewness` results for `training` and `test` datasets.\n", + "\n", + "To do this we replace\n", + "\n", + "```yaml\n", + "- content_type: test\n", + " content_id: validmind.data_validation.Skewness\n", + "```\n", + "\n", + "with\n", + "\n", + "```yaml\n", + "- content_type: test\n", + " content_id: validmind.data_validation.Skewness:training_data\n", + "- content_type: test\n", + " content_id: validmind.data_validation.Skewness:test_data\n", + "```\n", + "\n", + "This way, we can show two results of the same test in the model document. Here, the `training_data` and `test_data` could be any string. 
However, they should be unique for the same test.\n", + "\n", + "- Click on **Prepare new version**, provide some version notes and click on **Save new version** to save a new version of this template.\n", + "- Next, we need to swap our model documentation to use this new version of the template. Follow the steps on [Manage document templates](https://docs.validmind.ai/guide/templates/manage-document-templates.html) to swap the template of our customer churn model.\n", + "\n", + "In the following sections we provide more context on how these `content_id` fields mentioned earlier get mapped to the actual tests." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "import xgboost as xgb\n", + "\n", + "from sklearn.metrics import accuracy_score\n", + "from sklearn.model_selection import train_test_split\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind library, along with a second, different dataset (`taiwan_credit`) you can try as well.\n", + "\n", + "To be able to use either sample dataset, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "df = demo_dataset.load_data()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Initialize a ValidMind dataset object\n", + "\n", + "Before you can run a test suite, which is a collection of tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to analyze\n", + "- `target_column` — the name of the target column in the dataset\n", + "- `class_labels` — the list of class labels used for classification model training" + ] + }, + { + "cell_type": 
"code", + "metadata": {}, + "source": [ + "vm_dataset = vm.init_dataset(\n", + " input_id=\"raw_dataset\",\n", + " dataset=df,\n", + " target_column=demo_dataset.target_column,\n", + " class_labels=demo_dataset.class_labels,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Prepare datasets\n", + "\n", + "The `demo_dataset.preprocess` function simplifies preprocessing by splitting the DataFrame (`df`) into three distinct datasets (`train_df`, `validation_df`, and `test_df`):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Initialize the training and test datasets\n", + "\n", + "With the datasets ready, you can now initialize the training and test datasets (`train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_dataset\", dataset=train_df, target_column=demo_dataset.target_column\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + 
"cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Run documentation tests\n", + "\n", + "Now specify `inputs` and `params` for individual tests using the `config` parameter. The results for both datasets will be visible in the documentation. The `inputs` in the config take priority over the global `inputs` passed to `run_documentation_tests()`." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "config = {\n", + " \"validmind.data_validation.Skewness:training_data\": {\n", + " \"params\": {\"max_threshold\": 1},\n", + " \"inputs\": {\"dataset\": vm_train_ds},\n", + " },\n", + " \"validmind.data_validation.Skewness:test_data\": {\n", + " \"params\": {\"max_threshold\": 1.5},\n", + " \"inputs\": {\"dataset\": vm_test_ds},\n", + " },\n", + "}\n", + "\n", + "tests_suite = vm.run_documentation_tests(\n", + " inputs={\n", + " \"dataset\": vm_dataset,\n", + " },\n", + " config=config,\n", + " section=[\"data_preparation\"],\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4__'></a>\n", + "\n", + "### Run the individual tests using the `run_test`\n", + "\n", + "Now run the `Skewness` tests for the training and test datasets. The results for both datasets will be visible in the documentation." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.Skewness:training_data\",\n", + " params={\"max_threshold\": 1},\n", + " inputs={\"dataset\": vm_train_ds},\n", + ")\n", + "test.log()\n", + "\n", + "test = vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.Skewness:test_data\",\n", + " params={\"max_threshold\": 1.5},\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the **2. Data Preparation** section and take a look around.\n", + "\n", + " You can now see the `Skewness` test results for the training and test datasets in the `Data Preparation` section.\n", + "\n", + "From here, you can also make qualitative edits to model documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc7_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade 
package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-6ce412276b6244aab16b2e3443c6a861" + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": "validmind-1QuffXMV-py3.9", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb index 56eedd897..4724da3ab 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb @@ -1,601 +1,605 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Run individual documentation sections\n", - "\n", - "For targeted testing, you can run tests on individual sections or specific groups of sections in your model documentation.\n", - "\n", - "As a model developer, running individual documentation sections is useful in various development scenarios. For instance, when updates are made to a model, often only certain parts of the documentation require revision. 
The `run_documentation_tests()` function allows you to directly test only these affected sections, thus saving you time and ensuring that the documentation accurately reflects the latest changes.\n", - "\n", - "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset, train a model for testing, initialize ValidMind objects, and run the data preparation, model development, and multiple documentation sections." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Load the Demo Dataset](#toc3__) \n", - " - [Prepocess the raw dataset](#toc3_1__) \n", - "- [Train a model for testing](#toc4__) \n", - "- [Initialize ValidMind objects](#toc5__) \n", - " - [Assign predictions to the datasets](#toc5_1__) \n", - "- [Run the data preparation section](#toc6__) \n", - "- [Run the model development section](#toc7__) \n", - "- [Run multiple model documentation sections](#toc8__) \n", - "- [Next steps](#toc9__) \n", - " - [Work with your model documentation](#toc9_1__) \n", - " - [Discover more learning resources](#toc9_2__) \n", - "- [Upgrade ValidMind](#toc10__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "\n", - "import xgboost as xgb" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the Demo Dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# You can also import taiwan_credit like this:\n", - "# from validmind.datasets.classification import taiwan_credit as demo_dataset\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "df = demo_dataset.load_data()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Prepocess the raw dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Train a model for testing\n", - "\n", - "We train a simple customer churn model for our test." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", - "y_val = validation_df[demo_dataset.target_column]\n", - "\n", - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Initialize ValidMind objects\n", - "\n", - "We initize the objects required to run test suites using the ValidMind Library." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_dataset = vm.init_dataset(\n", - " input_id=\"raw_dataset\",\n", - " dataset=df,\n", - " target_column=demo_dataset.target_column,\n", - " class_labels=demo_dataset.class_labels,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_dataset\",\n", - " dataset=train_df,\n", - " type=\"generic\",\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_dataset\",\n", - " dataset=test_df,\n", - " type=\"generic\",\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "vm_model = vm.init_model(model, input_id=\"model\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Assign predictions to the datasets\n", - "\n", - "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Run the data preparation section\n", - "\n", - "In this section, we focus on running the tests within the data preparation section of the model documentation. After running this function, only the tests associated with this section will be executed, and the corresponding section in the model documentation will be updated." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "results = vm.run_documentation_tests(\n", - " section=\"data_preparation\",\n", - " inputs={\n", - " \"dataset\": vm_dataset,\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Run the model development section\n", - "\n", - "In this section, we focus on running the tests within the model development section of the model documentation. After running this function, only the tests associated with this section will be executed, and the corresponding section in the model documentation will be updated." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "results = vm.run_documentation_tests(\n", - " section=\"model_development\",\n", - " inputs={\n", - " \"dataset\": vm_train_ds,\n", - " \"model\": vm_model,\n", - " \"datasets\": (vm_train_ds, vm_test_ds),\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Run multiple model documentation sections\n", - "\n", - "This section demonstrates how you can execute both the data preparation and model development sections using `run_documentation_tests()`. After running this function, the tests associated with both sections will be executed, and their corresponding model documentation sections updated." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "results = vm.run_documentation_tests(\n", - " section=[\"model_development\", \"model_diagnosis\"],\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " \"model\": vm_model,\n", - " \"datasets\": (vm_train_ds, vm_test_ds),\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc9_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. 
In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc9_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-f4756a1f66ab49598b696ed86685fcc6", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": ".venv", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Run individual documentation sections\n", + "\n", + "For targeted testing, you can run tests on individual sections or specific groups of sections in your model documentation.\n", + "\n", + "As a model developer, running individual documentation sections is useful in various development scenarios. For instance, when updates are made to a model, often only certain parts of the documentation require revision. The `run_documentation_tests()` function allows you to directly test only these affected sections, thus saving you time and ensuring that the documentation accurately reflects the latest changes.\n", + "\n", + "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset, train a model for testing, initialize ValidMind objects, and run the data preparation, model development, and multiple documentation sections." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Load the Demo Dataset](#toc3__) \n", + " - [Preprocess the raw dataset](#toc3_1__) \n", + "- [Train a model for testing](#toc4__) \n", + "- [Initialize ValidMind objects](#toc5__) \n", + " - [Assign predictions to the datasets](#toc5_1__) \n", + "- [Run the data preparation section](#toc6__) \n", + "- [Run the model development section](#toc7__) \n", + "- [Run multiple model documentation sections](#toc8__) \n", + "- [Next steps](#toc9__) \n", + " - [Work with your documentation](#toc9_1__) \n", + " - [Discover more learning resources](#toc9_2__) \n", + "- [Upgrade ValidMind](#toc10__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation.
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics.
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%matplotlib inline\n", + "\n", + "import xgboost as xgb" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the Demo Dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# You can also import taiwan_credit like this:\n", + "# from validmind.datasets.classification import taiwan_credit as demo_dataset\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "df = demo_dataset.load_data()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Preprocess the raw dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Train a model for testing\n", + "\n", + "We train a simple customer churn model for our test."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", + "y_val = validation_df[demo_dataset.target_column]\n", + "\n", + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Initialize ValidMind objects\n", + "\n", + "We initialize the objects required to run test suites using the ValidMind Library." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_dataset = vm.init_dataset(\n", + " input_id=\"raw_dataset\",\n", + " dataset=df,\n", + " target_column=demo_dataset.target_column,\n", + " class_labels=demo_dataset.class_labels,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_dataset\",\n", + " dataset=train_df,\n", + " type=\"generic\",\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_dataset\",\n", + " dataset=test_df,\n", + " type=\"generic\",\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "vm_model = vm.init_model(model, input_id=\"model\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Assign predictions to the datasets\n", + "\n", + "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model.
If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Run the data preparation section\n", + "\n", + "In this section, we focus on running the tests within the data preparation section of the model documentation. After running this function, only the tests associated with this section will be executed, and the corresponding section in the model documentation will be updated." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "results = vm.run_documentation_tests(\n", + " section=\"data_preparation\",\n", + " inputs={\n", + " \"dataset\": vm_dataset,\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Run the model development section\n", + "\n", + "In this section, we focus on running the tests within the model development section of the model documentation. After running this function, only the tests associated with this section will be executed, and the corresponding section in the model documentation will be updated." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "results = vm.run_documentation_tests(\n", + " section=\"model_development\",\n", + " inputs={\n", + " \"dataset\": vm_train_ds,\n", + " \"model\": vm_model,\n", + " \"datasets\": (vm_train_ds, vm_test_ds),\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Run multiple model documentation sections\n", + "\n", + "This section demonstrates how you can run multiple sections in a single call by passing a list of section IDs to `run_documentation_tests()`, here the model development and model diagnosis sections. After running this function, the tests associated with both sections will be executed, and their corresponding model documentation sections updated." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "results = vm.run_documentation_tests(\n", + " section=[\"model_development\", \"model_diagnosis\"],\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " \"model\": vm_model,\n", + " \"datasets\": (vm_train_ds, vm_test_ds),\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc9_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2.
In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc9_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-f4756a1f66ab49598b696ed86685fcc6" + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} \ No newline at end of file diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb index 839df7542..bd01a1439 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb @@ -1,734 +1,738 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Run documentation tests with custom configurations\n", - "\n", - "When running documentation tests, you can configure inputs and parameters for individual tests by passing a config as a parameter.\n", - "\n", - "As a model developer, configuring individual tests is useful in various models development scenarios. For instance, based on a use case, a model might require changing inputs and/or parameters for certain tests. 
The `run_documentation_tests()` function allows you to directly configure tests through `config`, thus giving you flexibility to run tests according to your use case.\n", - "\n", - "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset, train a model for testing, initialize ValidMind objects, and run documentation tests with custom configurations." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Load the sample dataset](#toc3__) \n", - "- [Document the model](#toc4__) \n", - "- [Prepocess the raw dataset](#toc5__) \n", - "- [Train a model for testing](#toc6__) \n", - "- [Initialize ValidMind objects](#toc7__) \n", - " - [Initialize the ValidMind model](#toc7_1__) \n", - " - [Initialize the ValidMind datasets](#toc7_2__) \n", - " - [Run predictions through `assign_predictions` interface](#toc7_3__) \n", - "- [Run documentation tests](#toc8__) \n", - " - [Preview config](#toc8_1__) \n", - " - [Updating config](#toc8_2__) \n", - " - [Run documentation tests](#toc8_3__) \n", - "- [Next steps](#toc9__) \n", - " - [Work with your documentation](#toc9_1__) \n", - " - [Discover more learning resources](#toc9_2__) \n", - "- [Upgrade ValidMind](#toc10__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - 
"\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. 
To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", - ")\n", - "\n", - "raw_df = demo_dataset.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Prepocess the raw dataset\n", - "\n", - "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", - "\n", - "- Preprocess the data: Splits the DataFrame (`df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess` to simplify preprocessing.\n", - "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`).\n", - "- Initialize XGBoost classifier: Creates an `XGBClassifier` object with early stopping rounds set to 10.\n", - "- Set evaluation metrics: Specifies metrics for model evaluation as \"error,\" \"logloss,\" and \"auc.\"\n", - "- Fit the model: Trains the model on `x_train` and `y_train` using the validation set `(x_val, y_val)`. Verbose output is disabled." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Train a model for testing\n", - "\n", - "We train a simple customer churn model for our test." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost\n", - "%matplotlib inline\n", - "\n", - "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", - "y_val = validation_df[demo_dataset.target_column]\n", - "\n", - "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", - "xgb.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "xgb.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Initialize ValidMind objects" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "Before you run tests, you'll need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# FUNCTION ARGUMENTS:\n", - "# model - the model that you want to provide as input to 
tests\n", - "# input_id - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "\n", - "vm_model_xgb = vm.init_model(\n", - " xgb,\n", - " input_id=\"xgb\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Similarly, initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", - "- `class_labels` — an optional value to map predicted classes to class labels\n", - "\n", - "With all datasets ready, you can now initialize the raw, training and test datasets (`raw_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_ds = vm.init_dataset(\n", - " input_id=\"raw_dataset\",\n", - " dataset=raw_df,\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "feature_columns = [\n", - " \"CreditScore\",\n", - " \"Gender\",\n", - " \"Age\",\n", - " \"Tenure\",\n", - " \"Balance\",\n", - " \"NumOfProducts\",\n", - " \"HasCrCard\",\n", - " \"IsActiveMember\",\n", - " \"EstimatedSalary\",\n", - " \"Geography_France\",\n", - " \"Geography_Germany\",\n", - " \"Geography_Spain\",\n", - "]\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " 
input_id=\"train_dataset\",\n", - " dataset=train_df,\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_dataset\",\n", - " dataset=test_df,\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7_3__'></a>\n", - "\n", - "### Run predictions through `assign_predictions` interface\n", - "\n", - "We can use `assign_predictions()` to run and assign model predictions to our training and test datasets:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", - "vm_test_ds.assign_predictions(model=vm_model_xgb)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Run documentation tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8_1__'></a>\n", - "\n", - "### Preview config\n", - "\n", - "You can preview the default config for the documentation template using the `vm.get_test_suite().get_default_config()` interface." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import json\n", - "\n", - "model_test_suite = vm.get_test_suite()\n", - "config = model_test_suite.get_default_config()\n", - "print(\"Suite Config: \\n\", json.dumps(config, indent=2))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8_2__'></a>\n", - "\n", - "### Updating config\n", - "\n", - "The test configuration can be updated to fit with your use case and requirements" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "config = {\n", - " \"validmind.data_validation.DatasetSplit\": {\n", - " \"inputs\": {\"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - " \"validmind.model_validation.sklearn.PopulationStabilityIndex\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:in_sample\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_train_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.PrecisionRecallCurve\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.ROCCurve\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.TrainingTestDegradation\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\": {\n", - " \"inputs\": 
{\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.MinimumF1Score\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.MinimumROCAUCScore\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.PermutationFeatureImportance\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.SHAPGlobalImportance\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - " \"validmind.model_validation.sklearn.OverfitDiagnosis\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - " \"validmind.model_validation.sklearn.RobustnessDiagnosis\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - "}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8_3__'></a>\n", - "\n", - "### Run documentation tests\n", - "\n", - "You can now run all documentation tests and pass an extra `config` parameter that overrides input and parameter configuration for the tests specified in the object." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(\n", - " inputs={\n", - " \"dataset\": vm_raw_ds,\n", - " \"model\": vm_model_xgb,\n", - " },\n", - " config=config,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc9_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc9_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade 
package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-d0990f47a72e4eaab065be1540234792", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "gpuClass": "standard", - "kernelspec": { - "display_name": ".venv", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Run documentation tests with custom configurations\n", + "\n", + "When running documentation tests, you can configure inputs and parameters for individual tests by passing a config as a parameter.\n", + "\n", + "As a model developer, configuring individual tests is useful in various model development scenarios. For instance, based on a use case, a model might require changing inputs and/or parameters for certain tests. The `run_documentation_tests()` function allows you to directly configure tests through `config`, thus giving you flexibility to run tests according to your use case.\n", + "\n", + "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset, train a model for testing, initialize ValidMind objects, and run documentation tests with custom configurations." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Load the sample dataset](#toc3__) \n", + "- [Document the model](#toc4__) \n", + "- [Prepocess the raw dataset](#toc5__) \n", + "- [Train a model for testing](#toc6__) \n", + "- [Initialize ValidMind objects](#toc7__) \n", + " - [Initialize the ValidMind model](#toc7_1__) \n", + " - [Initialize the ValidMind datasets](#toc7_2__) \n", + " - [Run predictions through `assign_predictions` interface](#toc7_3__) \n", + "- [Run documentation tests](#toc8__) \n", + " - [Preview config](#toc8_1__) \n", + " - [Updating config](#toc8_2__) \n", + " - [Run documentation tests](#toc8_3__) \n", + "- [Next steps](#toc9__) \n", + " - [Work with your documentation](#toc9_1__) \n", + " - [Discover more learning resources](#toc9_2__) \n", + "- [Upgrade ValidMind](#toc10__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite, specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind library. 
To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", + ")\n", + "\n", + "raw_df = demo_dataset.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Preprocess the raw dataset\n", + "\n", + "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", + "\n", + "- Preprocess the data: Splits the DataFrame (`raw_df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess` to simplify preprocessing.\n", + "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`).\n", + "- Initialize XGBoost classifier: Creates an `XGBClassifier` object with early stopping rounds set to 10.\n", + "- Set evaluation metrics: Specifies metrics for model evaluation as \"error,\" \"logloss,\" and \"auc.\"\n", + "- Fit the model: Trains the model on `x_train` and `y_train` using the validation set `(x_val, y_val)`. Verbose output is disabled." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Train a model for testing\n", + "\n", + "We train a simple customer churn model for our test."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost\n", + "%matplotlib inline\n", + "\n", + "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", + "y_val = validation_df[demo_dataset.target_column]\n", + "\n", + "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", + "xgb.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "xgb.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Initialize ValidMind objects" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "Before you run tests, you'll need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# FUNCTION ARGUMENTS:\n", + "# model - the model that you want to provide as input to tests\n", + "# input_id - a unique identifier that 
allows tracking what inputs are used when running each individual test\n", + "\n", + "vm_model_xgb = vm.init_model(\n", + " xgb,\n", + " input_id=\"xgb\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Similarly, initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", + "- `class_labels` — an optional value to map predicted classes to class labels\n", + "\n", + "With all datasets ready, you can now initialize the raw, training and test datasets (`raw_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_ds = vm.init_dataset(\n", + " input_id=\"raw_dataset\",\n", + " dataset=raw_df,\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "feature_columns = [\n", + " \"CreditScore\",\n", + " \"Gender\",\n", + " \"Age\",\n", + " \"Tenure\",\n", + " \"Balance\",\n", + " \"NumOfProducts\",\n", + " \"HasCrCard\",\n", + " \"IsActiveMember\",\n", + " \"EstimatedSalary\",\n", + " \"Geography_France\",\n", + " \"Geography_Germany\",\n", + " \"Geography_Spain\",\n", + "]\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_dataset\",\n", + " dataset=train_df,\n", + 
" target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_dataset\",\n", + " dataset=test_df,\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_3__'></a>\n", + "\n", + "### Run predictions through `assign_predictions` interface\n", + "\n", + "We can use `assign_predictions()` to run and assign model predictions to our training and test datasets:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", + "vm_test_ds.assign_predictions(model=vm_model_xgb)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Run documentation tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_1__'></a>\n", + "\n", + "### Preview config\n", + "\n", + "You can preview the default config for the documentation template using the `vm.get_test_suite().get_default_config()` interface." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import json\n", + "\n", + "model_test_suite = vm.get_test_suite()\n", + "config = model_test_suite.get_default_config()\n", + "print(\"Suite Config: \\n\", json.dumps(config, indent=2))" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_2__'></a>\n", + "\n", + "### Updating config\n", + "\n", + "The test configuration can be updated to fit your use case and requirements." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "config = {\n", + " \"validmind.data_validation.DatasetSplit\": {\n", + " \"inputs\": {\"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + " \"validmind.model_validation.sklearn.PopulationStabilityIndex\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:in_sample\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_train_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.PrecisionRecallCurve\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.ROCCurve\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.TrainingTestDegradation\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": 
vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.MinimumF1Score\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.MinimumROCAUCScore\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.PermutationFeatureImportance\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.SHAPGlobalImportance\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + " \"validmind.model_validation.sklearn.OverfitDiagnosis\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + " \"validmind.model_validation.sklearn.RobustnessDiagnosis\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + "}" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_3__'></a>\n", + "\n", + "### Run documentation tests\n", + "\n", + "You can now run all documentation tests and pass an extra `config` parameter that overrides input and parameter configuration for the tests specified in the object." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.run_documentation_tests(\n", + " inputs={\n", + " \"dataset\": vm_raw_ds,\n", + " \"model\": vm_model_xgb,\n", + " },\n", + " config=config,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc9_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc9_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade 
package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-d0990f47a72e4eaab065be1540234792" + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/notebooks/quickstart/quickstart_documentation.ipynb b/notebooks/quickstart/quickstart_documentation.ipynb index b2d5c5d28..223f29671 100644 --- a/notebooks/quickstart/quickstart_documentation.ipynb +++ b/notebooks/quickstart/quickstart_documentation.ipynb @@ -1,926 +1,930 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "7b021b0d", - "metadata": {}, - "source": [ - "# Quickstart for documentation\n", - "\n", - "Learn the basics of using ValidMind to document records as part of a development workflow. Set up the ValidMind Library in your environment, and generate a draft of documentation using ValidMind tests for a binary classification model.\n", - "\n", - "To document our model with the ValidMind Library, we'll:\n", - "\n", - "1. Import a sample dataset and preprocess it\n", - "2. Split the datasets and initialize them for use with ValidMind\n", - "3. Initialize a ValidMind model object for use with testing\n", - "4. 
Run a full suite of tests as defined by our documentation template, which will send the results of those tests to the ValidMind Platform" - ] - }, - { - "cell_type": "markdown", - "id": "167aef58", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [Introduction](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Install the ValidMind Library](#toc3_1__) \n", - " - [Initialize the ValidMind Library](#toc3_2__) \n", - " - [Register sample model](#toc3_2_1__) \n", - " - [Apply documentation template](#toc3_2_2__) \n", - " - [Get your code snippet](#toc3_2_3__) \n", - " - [Initialize the Python environment](#toc3_3__) \n", - "- [Getting to know ValidMind](#toc4__) \n", - " - [Preview the documentation template](#toc4_1__) \n", - " - [View documentation in the ValidMind Platform](#toc4_2__) \n", - "- [Working with ValidMind datasets](#toc5__) \n", - " - [Prepare the sample dataset](#toc5_1__) \n", - " - [Import the sample dataset](#toc5_1_1__) \n", - " - [Preprocess the raw dataset](#toc5_1_2__) \n", - " - [Split the dataset](#toc5_1_3__) \n", - " - [Separate features and targets](#toc5_1_4__) \n", - " - [Initialize the ValidMind datasets](#toc5_2__) \n", - "- [Working with ValidMind models](#toc6__) \n", - " - [Train an XGBoost classifier model](#toc6_1__) \n", - " - [Set evaluation metrics](#toc6_1_1__) \n", - " - [Fit the model](#toc6_1_2__) \n", - " - [Initialize the ValidMind model](#toc6_2__) \n", - " - [Assign predictions](#toc6_3__) \n", - "- [Run a ValidMind test suite](#toc7__) \n", - "- [In summary](#toc8__) \n", - "- [Next steps](#toc9__) \n", - " - [Work with your documentation](#toc9_1__) \n", - " - [Discover more learning resources](#toc9_2__) \n", - "- [Upgrade ValidMind](#toc10__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - 
"\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "1cce526f", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## Introduction\n", - "\n", - "Development aims to produce a fit-for-purpose *champion* by conducting thorough testing and analysis, supporting the capabilities of the champion with evidence in the form of documentation and test results. Documentation should be clear and comprehensive, ideally following a structure or template covering all aspects of compliance with risk regulation.\n", - "\n", - "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", - "\n", - "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", - "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." - ] - }, - { - "cell_type": "markdown", - "id": "f9b5eac2", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "650236de", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "b9d9d4cf", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "59b308f7", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "61b5cbeb", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "0f08166e", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d1f6dbed", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "1bf4e4cb", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "cb6e369b", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "7167d002", - "metadata": {}, - "source": [ - "<a id='toc3_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "43037f46", - "metadata": {}, - "source": [ - "<a id='toc3_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e2c1dd22", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "1a6933d3", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", - "\n", - "- Import **Extreme Gradient Boosting** (XGBoost) with an alias so that we can reference its functions in later calls. 
XGBoost is a powerful machine learning library designed for speed and performance, especially in handling structured or tabular data.\n", - "- Enable **`matplotlib`**, a plotting library used for visualizing data. Ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "62d7c2c1", - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "id": "fafe8fc2", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "d7ee565f", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b2bce375", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "fa0e43cb", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### View documentation in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. 
In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", - "\n", - "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." - ] - }, - { - "cell_type": "markdown", - "id": "9d0d1005", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Working with ValidMind datasets" - ] - }, - { - "cell_type": "markdown", - "id": "1b94e39f", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Prepare the sample dataset" - ] - }, - { - "cell_type": "markdown", - "id": "6fc79fc1", - "metadata": {}, - "source": [ - "<a id='toc5_1_1__'></a>\n", - "\n", - "#### Import the sample dataset\n", - "\n", - "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", - "\n", - "In our below example, note that: \n", - "\n", - "- The target column, `Exited` has a value of `1` when a customer has churned and `0` otherwise.\n", - "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas Dataframe is a two-dimensional tabular data structure that makes use of rows and columns." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "58d1c94b", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.classification import customer_churn\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", - ")\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "4fe0f216", - "metadata": {}, - "source": [ - "<a id='toc5_1_2__'></a>\n", - "\n", - "#### Preprocess the raw dataset\n", - "\n", - "Before running tests with ValidMind, we'll need to preprocess our imported dataset. This involves splitting the data and separating the features (inputs) from the targets (outputs)." - ] - }, - { - "cell_type": "markdown", - "id": "9f690a04", - "metadata": {}, - "source": [ - "<a id='toc5_1_3__'></a>\n", - "\n", - "#### Split the dataset\n", - "\n", - "Splitting our dataset helps assess how well the model generalizes to unseen data.\n", - "\n", - "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", - "\n", - "1. **train_df** — Used to train the model.\n", - "2. **validation_df** — Used to evaluate the model's performance during training.\n", - "3. **test_df** — Used later on to asses the model's performance on new, unseen data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "418cb5aa", - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "id": "a9ad2104", - "metadata": {}, - "source": [ - "<a id='toc5_1_4__'></a>\n", - "\n", - "#### Separate features and targets\n", - "\n", - "To train the model, we need to provide it with:\n", - "\n", - "1. 
**Inputs** — Features such as customer age, usage, etc.\n", - "2. **Outputs (Expected answers/labels)** — in our case, we would like to know whether the customer churned or not.\n", - "\n", - "Here, we'll use `x_train` and `x_val` to hold the input data (features), and `y_train` and `y_val` to hold the answers (the target we want to predict):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6fd365fd", - "metadata": {}, - "outputs": [], - "source": [ - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]\n", - "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", - "y_val = validation_df[customer_churn.target_column]" - ] - }, - { - "cell_type": "markdown", - "id": "73d767d7", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests with your preprocessed datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module. **This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "- **`class_labels`** — An optional value to map predicted classes to class labels." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bb6ad06a", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the raw dataset\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " class_labels=customer_churn.class_labels,\n", - ")\n", - "\n", - "# Initialize the training dataset\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "# Initialize the testing dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=customer_churn.target_column\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "0b33afca", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Working with ValidMind models" - ] - }, - { - "cell_type": "markdown", - "id": "5962362c", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Train an XGBoost classifier model\n", - "\n", - "Next, let's create an XGBoost classifier model that will automatically stop training if it doesn’t improve after 10 tries.\n", - "\n", - "Setting a threshold avoids wasting time and helps prevent overfitting by stopping training when further improvement isn’t happening." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3296cac6", - "metadata": {}, - "outputs": [], - "source": [ - "model = xgb.XGBClassifier(early_stopping_rounds=10)" - ] - }, - { - "cell_type": "markdown", - "id": "33cafbcf", - "metadata": {}, - "source": [ - "<a id='toc6_1_1__'></a>\n", - "\n", - "#### Set evaluation metrics\n", - "\n", - "Then, we'll set the evaluation metrics, which tells the model to use three different ways to measure its performance:\n", - "\n", - "1. **error** — Measures how often the model makes incorrect predictions.\n", - "2. 
**logloss** — Indicates how confident the predictions are.\n", - "3. **auc** — Evaluates how well the model distinguishes between churn and not churn.\n", - "\n", - "Using multiple metrics gives a more complete picture of how good (or bad) the model is." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "32d3c3f4", - "metadata": {}, - "outputs": [], - "source": [ - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "47d84a80", - "metadata": {}, - "source": [ - "<a id='toc6_1_2__'></a>\n", - "\n", - "#### Fit the model\n", - "\n", - "Finally, our actual training step — where the model learns patterns from the data, so it can make predictions later:\n", - "\n", - "- The model is trained on `x_train` and `y_train`, and evaluates its performance using `x_val` and `y_val` to check if it’s learning well.\n", - "- To turn off printed output while training, we'll set `verbose` to `False`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3fb95ce4", - "metadata": {}, - "outputs": [], - "source": [ - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "23bccb27", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any 
ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0e44eebd", - "metadata": {}, - "outputs": [], - "source": [ - "vm_model = vm.init_model(\n", - " model,\n", - " input_id=\"model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "20c008bf", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", - "\n", - "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", - "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", - "\n", - "If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "62bd94fc", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "0e66a7cd", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Run a ValidMind test suite\n", - "\n", - "This is where it all comes together — you are now ready to **run the documentation tests for the model as defined by the documentation template** you looked at earlier.\n", - "\n", - "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the 
documentation and test artifacts that get generated to the ValidMind Platform:\n", - "\n", - "- The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. \n", - "- It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. The `config` parameter is a dictionary with the following structure:\n", - "\n", - " ```python\n", - " config = {\n", - " \"<test-id>\": {\n", - " \"params\": {\n", - " \"param1\": \"value1\",\n", - " \"param2\": \"value2\",\n", - " ...\n", - " },\n", - " \"inputs\": {\n", - " \"input1\": \"value1\",\n", - " \"input2\": \"value2\",\n", - " ...\n", - " }\n", - " },\n", - " ...\n", - " }\n", - " ```\n", - "\n", - " Each `<test-id>` above corresponds to the test driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The method `get_demo_test_config()` below constructs the default input configuration for our demo." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b3d6741b", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = customer_churn.get_demo_test_config()\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "markdown", - "id": "7eebd40f", - "metadata": {}, - "source": [ - "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests.\n", - "\n", - "The variable `full_suite` then holds the result of these tests:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ae3accf7", - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "id": "ed61fa23", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## In summary\n", - "\n", - "In this notebook, you learned how to:\n", - "\n", - "- [x] Register a record (model) within the ValidMind Platform\n", - "- [x] Install and initialize the ValidMind Library\n", - "- [x] Preview the documentation template for your model\n", - "- [x] Import a sample dataset\n", - "- [x] Initialize ValidMind datasets and model objects\n", - "- [x] Assign model predictions to your ValidMind model objects\n", - "- [x] Run a full suite of documentation tests" - ] - }, - { - "cell_type": "markdown", - "id": "68803cd9", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." - ] - }, - { - "cell_type": "markdown", - "id": "ba38b729", - "metadata": {}, - "source": [ - "<a id='toc9_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. 
From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" - ] - }, - { - "cell_type": "markdown", - "id": "ae046dc4", - "metadata": {}, - "source": [ - "<a id='toc9_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "For a more in-depth introduction to using the ValidMind Library for development, check out our introductory development series and the accompanying interactive training:\n", - "\n", - "- **[ValidMind for development](https://docs.validmind.ai/developer/validmind-library.html#development)**\n", - "- **[Developer Fundamentals](https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html)**\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "id": "4ce38015", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "35955b6b", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "f865e64e", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "65b36aa7", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-bd87da591b88473997979690dbffcfa5", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.12.12" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart for documentation\n", + "\n", + "Learn the basics of using ValidMind to document records as part of a development workflow. Set up the ValidMind Library in your environment, and generate a draft of documentation using ValidMind tests for a binary classification model.\n", + "\n", + "To document our model with the ValidMind Library, we'll:\n", + "\n", + "1. Import a sample dataset and preprocess it\n", + "2. Split the datasets and initialize them for use with ValidMind\n", + "3. Initialize a ValidMind model object for use with testing\n", + "4. 
Run a full suite of tests as defined by our documentation template, which will send the results of those tests to the ValidMind Platform" + ], + "id": "7b021b0d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [Introduction](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Install the ValidMind Library](#toc3_1__) \n", + " - [Initialize the ValidMind Library](#toc3_2__) \n", + " - [Register sample model](#toc3_2_1__) \n", + " - [Apply documentation template](#toc3_2_2__) \n", + " - [Get your code snippet](#toc3_2_3__) \n", + " - [Initialize the Python environment](#toc3_3__) \n", + "- [Getting to know ValidMind](#toc4__) \n", + " - [Preview the documentation template](#toc4_1__) \n", + " - [View documentation in the ValidMind Platform](#toc4_2__) \n", + "- [Working with ValidMind datasets](#toc5__) \n", + " - [Prepare the sample dataset](#toc5_1__) \n", + " - [Import the sample dataset](#toc5_1_1__) \n", + " - [Preprocess the raw dataset](#toc5_1_2__) \n", + " - [Split the dataset](#toc5_1_3__) \n", + " - [Separate features and targets](#toc5_1_4__) \n", + " - [Initialize the ValidMind datasets](#toc5_2__) \n", + "- [Working with ValidMind models](#toc6__) \n", + " - [Train an XGBoost classifier model](#toc6_1__) \n", + " - [Set evaluation metrics](#toc6_1_1__) \n", + " - [Fit the model](#toc6_1_2__) \n", + " - [Initialize the ValidMind model](#toc6_2__) \n", + " - [Assign predictions](#toc6_3__) \n", + "- [Run a ValidMind test suite](#toc7__) \n", + "- [In summary](#toc8__) \n", + "- [Next steps](#toc9__) \n", + " - [Work with your documentation](#toc9_1__) \n", + " - [Discover more learning resources](#toc9_2__) \n", + "- [Upgrade ValidMind](#toc10__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + 
"\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "167aef58" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## Introduction\n", + "\n", + "Development aims to produce a fit-for-purpose *champion* by conducting thorough testing and analysis, supporting the capabilities of the champion with evidence in the form of documentation and test results. Documentation should be clear and comprehensive, ideally following a structure or template covering all aspects of compliance with risk regulation.\n", + "\n", + "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", + "\n", + "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", + "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." + ], + "id": "1cce526f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "f9b5eac2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "650236de" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "b9d9d4cf" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "59b308f7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ], + "id": "61b5cbeb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ], + "id": "0f08166e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "d1f6dbed" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "1bf4e4cb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." 
+ ], + "id": "cb6e369b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "7167d002" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "43037f46" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "e2c1dd22" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", + "\n", + "- Import **Extreme Gradient Boosting** (XGBoost) with an alias so that we can reference its functions in later calls. XGBoost is a powerful machine learning library designed for speed and performance, especially in handling structured or tabular data.\n", + "- Enable **`matplotlib`**, a plotting library used for visualizing data. This ensures that any plots you generate render inline in the notebook output rather than opening in a separate window." 
+ ], + "id": "1a6933d3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [], + "id": "62d7c2c1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Getting to know ValidMind" + ], + "id": "fafe8fc2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ], + "id": "d7ee565f" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "b2bce375" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### View documentation in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", + "\n", + "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." 
+ ], + "id": "fa0e43cb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Working with ValidMind datasets" + ], + "id": "9d0d1005" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Prepare the sample dataset" + ], + "id": "1b94e39f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_1__'></a>\n", + "\n", + "#### Import the sample dataset\n", + "\n", + "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", + "\n", + "In the example below, note that:\n", + "\n", + "- The target column, `Exited`, has a value of `1` when a customer has churned and `0` otherwise.\n", + "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas DataFrame is a two-dimensional tabular data structure that makes use of rows and columns." + ], + "id": "6fc79fc1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.classification import customer_churn\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", + ")\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [], + "id": "58d1c94b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_2__'></a>\n", + "\n", + "#### Preprocess the raw dataset\n", + "\n", + "Before running tests with ValidMind, we'll need to preprocess our imported dataset. This involves splitting the data and separating the features (inputs) from the targets (outputs)." 
+ ], + "id": "4fe0f216" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_3__'></a>\n", + "\n", + "#### Split the dataset\n", + "\n", + "Splitting our dataset helps assess how well the model generalizes to unseen data.\n", + "\n", + "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", + "\n", + "1. **train_df** — Used to train the model.\n", + "2. **validation_df** — Used to evaluate the model's performance during training.\n", + "3. **test_df** — Used later on to assess the model's performance on new, unseen data." + ], + "id": "9f690a04" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" + ], + "execution_count": null, + "outputs": [], + "id": "418cb5aa" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_4__'></a>\n", + "\n", + "#### Separate features and targets\n", + "\n", + "To train the model, we need to provide it with:\n", + "\n", + "1. **Inputs** — Features such as customer age, usage, etc.\n", + "2. 
**Outputs (Expected answers/labels)** — in our case, we would like to know whether the customer churned or not.\n", + "\n", + "Here, we'll use `x_train` and `x_val` to hold the input data (features), and `y_train` and `y_val` to hold the answers (the target we want to predict):" + ], + "id": "a9ad2104" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]\n", + "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", + "y_val = validation_df[customer_churn.target_column]" + ], + "execution_count": null, + "outputs": [], + "id": "6fd365fd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests with your preprocessed datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module. **This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "- **`class_labels`** — An optional value to map predicted classes to class labels." 
+ ], + "id": "73d767d7" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the raw dataset\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " class_labels=customer_churn.class_labels,\n", + ")\n", + "\n", + "# Initialize the training dataset\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "# Initialize the testing dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=customer_churn.target_column\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "bb6ad06a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Working with ValidMind models" + ], + "id": "0b33afca" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Train an XGBoost classifier model\n", + "\n", + "Next, let's create an XGBoost classifier model that will automatically stop training if it doesn’t improve after 10 tries.\n", + "\n", + "Setting a threshold avoids wasting time and helps prevent overfitting by stopping training when further improvement isn’t happening." + ], + "id": "5962362c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "model = xgb.XGBClassifier(early_stopping_rounds=10)" + ], + "execution_count": null, + "outputs": [], + "id": "3296cac6" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1_1__'></a>\n", + "\n", + "#### Set evaluation metrics\n", + "\n", + "Then, we'll set the evaluation metrics, which tells the model to use three different ways to measure its performance:\n", + "\n", + "1. **error** — Measures how often the model makes incorrect predictions.\n", + "2. 
**logloss** — Indicates how confident the predictions are.\n", + "3. **auc** — Evaluates how well the model distinguishes between churn and not churn.\n", + "\n", + "Using multiple metrics gives a more complete picture of how good (or bad) the model is." + ], + "id": "33cafbcf" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "32d3c3f4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1_2__'></a>\n", + "\n", + "#### Fit the model\n", + "\n", + "Finally, our actual training step — where the model learns patterns from the data, so it can make predictions later:\n", + "\n", + "- The model is trained on `x_train` and `y_train`, and evaluates its performance using `x_val` and `y_val` to check if it’s learning well.\n", + "- To turn off printed output while training, we'll set `verbose` to `False`." + ], + "id": "47d84a80" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "3fb95ce4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any 
ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ], + "id": "23bccb27" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model = vm.init_model(\n", + " model,\n", + " input_id=\"model\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "0e44eebd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", + "\n", + "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", + "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", + "\n", + "If no prediction values are passed, the method will compute predictions automatically:" + ], + "id": "20c008bf" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model,\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "62bd94fc" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Run a ValidMind test suite\n", + "\n", + "This is where it all comes together — you are now ready to **run the documentation tests for the model as defined by the documentation template** you looked at earlier.\n", + "\n", + "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the 
documentation and test artifacts that get generated to the ValidMind Platform:\n", + "\n", + "- The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. \n", + "- It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. The `config` parameter is a dictionary with the following structure:\n", + "\n", + " ```python\n", + " config = {\n", + " \"<test-id>\": {\n", + " \"params\": {\n", + " \"param1\": \"value1\",\n", + " \"param2\": \"value2\",\n", + " ...\n", + " },\n", + " \"inputs\": {\n", + " \"input1\": \"value1\",\n", + " \"input2\": \"value2\",\n", + " ...\n", + " }\n", + " },\n", + " ...\n", + " }\n", + " ```\n", + "\n", + " Each `<test-id>` above corresponds to the test driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The method `get_demo_test_config()` below constructs the default input configuration for our demo." 
+ ], + "id": "0e66a7cd" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = customer_churn.get_demo_test_config()\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [], + "id": "b3d6741b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests.\n", + "\n", + "The variable `full_suite` then holds the result of these tests:" + ], + "id": "7eebd40f" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.run_documentation_tests(config=test_config)" + ], + "execution_count": null, + "outputs": [], + "id": "ae3accf7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## In summary\n", + "\n", + "In this notebook, you learned how to:\n", + "\n", + "- [x] Register a record (model) within the ValidMind Platform\n", + "- [x] Install and initialize the ValidMind Library\n", + "- [x] Preview the documentation template for your model\n", + "- [x] Import a sample dataset\n", + "- [x] Initialize ValidMind datasets and model objects\n", + "- [x] Assign model predictions to your ValidMind model objects\n", + "- [x] Run a full suite of documentation tests" + ], + "id": "ed61fa23" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." + ], + "id": "68803cd9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. 
From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" + ], + "id": "ba38b729" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "For a more in-depth introduction to using the ValidMind Library for development, check out our introductory development series and the accompanying interactive training:\n", + "\n", + "- **[ValidMind for development](https://docs.validmind.ai/developer/validmind-library.html#development)**\n", + "- **[Developer Fundamentals](https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html)**\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ], + "id": "ae046dc4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "4ce38015" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "35955b6b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "f865e64e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ], + "id": "65b36aa7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-bd87da591b88473997979690dbffcfa5" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.12.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb index 6b06349ba..5bed58220 100644 --- a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb @@ -1,82 +1,86 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "about-intro", - "metadata": {}, - "source": [ - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "about-intro" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "about-begin" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "about-signup" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "about-concepts" + } + ], + "metadata": { + "language_info": { + "name": "python" + } }, - { - "cell_type": "markdown", - "id": "about-begin", - "metadata": {}, - "source": [ - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "about-signup", - "metadata": {}, - "source": [ - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "about-concepts", - "metadata": {}, - "source": [ - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - } - ], - "metadata": { - "language_info": { - "name": "python" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/tutorials/development/1-set_up_validmind.ipynb b/notebooks/tutorials/development/1-set_up_validmind.ipynb index 0c8316c27..2396ea5cf 100644 --- a/notebooks/tutorials/development/1-set_up_validmind.ipynb +++ b/notebooks/tutorials/development/1-set_up_validmind.ipynb @@ -1,477 +1,481 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "3bd9bc41", - "metadata": {}, - "source": [ - "# ValidMind for development 1 — Set up the ValidMind Library\n", - "\n", - "Learn how to use ValidMind for your end-to-end documentation process based on common development scenarios with our series of four introductory notebooks. 
This first notebook walks you through the initial setup of the ValidMind Library.\n", - "\n", - "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn by doing</b></span>\n", - "<br></br>\n", - "Our course tailor-made for developers new to ValidMind combines this series of notebooks with more a more in-depth introduction to the ValidMind Platform — <a href=\"https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html\" style=\"color: #DE257E;\"><b>Developer Fundamentals</b></a></div>" - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ValidMind for development 1 — Set up the ValidMind Library\n", + "\n", + "Learn how to use ValidMind for your end-to-end documentation process based on common development scenarios with our series of four introductory notebooks. 
This first notebook walks you through the initial setup of the ValidMind Library.\n", + "\n", + "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn by doing</b></span>\n", + "<br></br>\n", + "Our course tailor-made for developers new to ValidMind combines this series of notebooks with a more in-depth introduction to the ValidMind Platform — <a href=\"https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html\" style=\"color: #DE257E;\"><b>Developer Fundamentals</b></a></div>" + ], + "id": "3bd9bc41" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [Introduction](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Install the ValidMind Library](#toc3_1__) \n", + " - [Initialize the ValidMind Library](#toc3_2__) \n", + " - [Register sample model](#toc3_2_1__) \n", + " - [Apply documentation template](#toc3_2_2__) \n", + " - [Get your code snippet](#toc3_2_3__) \n", + "- [Getting to know ValidMind](#toc4__) \n", + " - [Preview the documentation template](#toc4_1__) \n", + " - [View documentation in the ValidMind Platform](#toc4_1_1__) \n", + " - [Explore available tests](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "- [In summary](#toc6__) \n", + "- [Next steps](#toc7__) \n", + " - [Start the model development process](#toc7_1__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + 
"\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "b4b7c002" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## Introduction\n", + "\n", + "Development aims to produce a fit-for-purpose *champion* by conducting thorough testing and analysis, supporting the capabilities of the champion with evidence in the form of documentation and test results. Documentation should be clear and comprehensive, ideally following a structure or template covering all aspects of compliance with risk regulation.\n", + "\n", + "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", + "\n", + "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", + "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." + ], + "id": "7b7de259" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "b68b9958" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "3b520a7e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "9b3108db" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "f97d4266" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ], + "id": "bf5cd6c2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ], + "id": "95bf9e4b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "827eb6bd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library\n", + "\n", + "The ValidMind Library provides a rich collection of documentation tools and test suites, from documenting descriptions of datasets to validation and testing using a variety of open-source testing frameworks." + ], + "id": "ad74254d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "a48cd34d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "8ad7e39a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "3339f683" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "a58d951f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Getting to know ValidMind" + ], + "id": "61a021f3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ], + "id": "852db20d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "819a40bc" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1_1__'></a>\n", + "\n", + "#### View documentation in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for development\" series of notebooks.\n", + "\n", + "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." + ], + "id": "65ed2873" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Explore available tests\n", + "\n", + "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll learn how to run tests shortly. 
\n", + "\n", + "You can see that the documentation template for this model has references to some of the **test `ID`s used to run tests listed below:**" + ], + "id": "cdbb94d2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests()" + ], + "execution_count": null, + "outputs": [], + "id": "7ccc7776" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "786f0d9c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "f5d3216d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "d2010ad4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." 
+ ], + "id": "b637c5c6" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## In summary\n", + "\n", + "In this first notebook, you learned how to:\n", + "\n", + "- [x] Register a record (model) within the ValidMind Platform\n", + "- [x] Install and initialize the ValidMind Library\n", + "- [x] Preview the documentation template for your model\n", + "- [x] Explore the available tests offered by the ValidMind Library" + ], + "id": "dfef8925" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps" + ], + "id": "186bee4f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Start the development process\n", + "\n", + "Now that the ValidMind Library is connected to your model in the ValidMind Platform with the correct template applied, we can go ahead and start the development process: **[2 — Start the development process](2-start_development_process.ipynb)**" + ], + "id": "7dbb07a1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-63fcb66be39b42d38ad874a72a66581b" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.10.13" + } }, - { - "cell_type": "markdown", - "id": "b4b7c002", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [Introduction](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Install the ValidMind Library](#toc3_1__) \n", - " - [Initialize the ValidMind Library](#toc3_2__) \n", - " - [Register sample model](#toc3_2_1__) \n", - " - [Apply documentation template](#toc3_2_2__) \n", - " - [Get your code snippet](#toc3_2_3__) \n", - "- [Getting to know ValidMind](#toc4__) \n", - " - [Preview the documentation template](#toc4_1__) \n", - " - [View documentation in the ValidMind Platform](#toc4_1_1__) \n", - " - [Explore available tests](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "- [In summary](#toc6__) \n", - "- [Next steps](#toc7__) \n", - " - [Start the model development process](#toc7_1__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "7b7de259", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## Introduction\n", - "\n", - "Development aims to produce a fit-for-purpose *champion* by conducting thorough testing and analysis, supporting the capabilities of the champion with evidence in the form of documentation and test results. Documentation should be clear and comprehensive, ideally following a structure or template covering all aspects of compliance with risk regulation.\n", - "\n", - "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", - "\n", - "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", - "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." - ] - }, - { - "cell_type": "markdown", - "id": "b68b9958", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." 
- ] - }, - { - "cell_type": "markdown", - "id": "3b520a7e", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "9b3108db", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "f97d4266", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "bf5cd6c2", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "95bf9e4b", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "827eb6bd", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "ad74254d", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library\n", - "\n", - "The ValidMind Library provides a rich collection of documentation tools and test suites, from documenting descriptions of datasets to validation and testing using a variety of open-source testing frameworks." 
- ] - }, - { - "cell_type": "markdown", - "id": "a48cd34d", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "8ad7e39a", - "metadata": {}, - "source": [ - "<a id='toc3_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "3339f683", - "metadata": {}, - "source": [ - "<a id='toc3_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a58d951f", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "61a021f3", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "852db20d", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "819a40bc", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "65ed2873", - "metadata": {}, - "source": [ - "<a id='toc4_1_1__'></a>\n", - "\n", - "#### View documentation in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for development\" series of notebooks.\n", - "\n", - "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." - ] - }, - { - "cell_type": "markdown", - "id": "cdbb94d2", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Explore available tests\n", - "\n", - "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll learn how to run tests shortly. 
\n", - "\n", - "You can see that the documentation template for this model has references to some of the **test `ID`s used to run tests listed below:**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7ccc7776", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests()" - ] - }, - { - "cell_type": "markdown", - "id": "786f0d9c", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f5d3216d", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "d2010ad4", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "b637c5c6", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." 
- ] - }, - { - "cell_type": "markdown", - "id": "dfef8925", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## In summary\n", - "\n", - "In this first notebook, you learned how to:\n", - "\n", - "- [x] Register a record (model) within the ValidMind Platform\n", - "- [x] Install and initialize the ValidMind Library\n", - "- [x] Preview the documentation template for your model\n", - "- [x] Explore the available tests offered by the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "186bee4f", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps" - ] - }, - { - "cell_type": "markdown", - "id": "7dbb07a1", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Start the development process\n", - "\n", - "Now that the ValidMind Library is connected to your model in the ValidMind Library with the correct template applied, we can go ahead and start the development process: **[2 — Start the development process](2-start_development_process.ipynb)**" - ] - }, - { - "cell_type": "markdown", - "id": "copyright-63fcb66be39b42d38ad874a72a66581b", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/notebooks/use_cases/agents/document_agentic_ai.ipynb b/notebooks/use_cases/agents/document_agentic_ai.ipynb index 66dfced93..690c9acf6 100644 --- a/notebooks/use_cases/agents/document_agentic_ai.ipynb +++ b/notebooks/use_cases/agents/document_agentic_ai.ipynb @@ -1,2190 +1,2194 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "eee6b64c", - "metadata": {}, - "source": [ - "# Document an agentic AI system\n", - "\n", - "Build and document an agentic AI system with the ValidMind Library. Construct a LangGraph-based banking agent, assign AI evaluation metric scores to your agent, and run accuracy, RAGAS, and safety tests, then log those test results to the ValidMind Platform.\n", - "\n", - "An _AI agent_ is an autonomous system that interprets inputs, selects from available tools or actions, and executes multi-step behaviors to achieve defined goals. 
In this notebook, the agent acts as a banking assistant that analyzes user requests and automatically selects and invokes the appropriate specialized banking tool to deliver accurate, compliant, and actionable responses.\n", - "\n", - "- This agent enables financial institutions to automate complex banking workflows where different customer requests require different specialized tools and knowledge bases.\n", - "- Effective validation of agentic AI systems reduces the risks of agents misinterpreting inputs, failing to extract required parameters, or producing incorrect assessments or actions — such as selecting the wrong tool.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For the LLM components in this notebook to function properly, you'll need access to OpenAI.</b></span>\n", - "<br></br>\n", - "Before you continue, ensure that a valid <code>OPENAI_API_KEY</code> is set in your <code>.env</code> file.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "30927b2b", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_2_4__) \n", - " - [Verify OpenAI API access](#toc2_3__) \n", - " - [Initialize the Python environment](#toc2_4__) \n", - "- [Building the LangGraph agent](#toc3__) \n", - " - [Test available banking 
tools](#toc3_1__) \n", - " - [Create LangGraph banking agent](#toc3_2__) \n", - " - [Define system prompt](#toc3_2_1__) \n", - " - [Initialize the LLM](#toc3_2_2__) \n", - " - [Define agent state structure](#toc3_2_3__) \n", - " - [Create agent workflow function](#toc3_2_4__) \n", - " - [Instantiate the banking agent](#toc3_2_5__) \n", - " - [Integrate agent with ValidMind](#toc3_3__) \n", - " - [Import ValidMind components](#toc3_3_1__) \n", - " - [Create agent wrapper function](#toc3_3_2__) \n", - " - [Initialize the ValidMind model object](#toc3_3_3__) \n", - " - [Store the agent reference](#toc3_3_4__) \n", - " - [Verify integration](#toc3_3_5__) \n", - " - [Validate the system prompt](#toc3_4__) \n", - "- [Initializing the ValidMind dataset](#toc4__) \n", - " - [Assign predictions](#toc4_1__) \n", - "- [Running accuracy tests](#toc5__) \n", - " - [Response accuracy test](#toc5_1__) \n", - " - [Tool selection accuracy test](#toc5_2__) \n", - "- [Assigning AI evaluation metric scores](#toc6__) \n", - " - [Identify relevant DeepEval scorers](#toc6_1__) \n", - " - [Assign reasoning scores](#toc6_2__) \n", - " - [Plan quality score](#toc6_2_1__) \n", - " - [Plan adherence score](#toc6_2_2__) \n", - " - [Assign action scores](#toc6_3__) \n", - " - [Tool correctness score](#toc6_3_1__) \n", - " - [Argument correctness score](#toc6_3_2__) \n", - " - [Assign execution score](#toc6_4__) \n", - " - [Task completion score](#toc6_4_1__) \n", - "- [Running RAGAS tests](#toc7__) \n", - " - [Identify relevant RAGAS tests](#toc7_1__) \n", - " - [Faithfulness](#toc7_1_1__) \n", - " - [Response Relevancy](#toc7_1_2__) \n", - " - [Context Recall](#toc7_1_3__) \n", - "- [Running safety tests](#toc8__) \n", - " - [AspectCritic](#toc8_1_1__) \n", - " - [Bias](#toc8_1_2__) \n", - "- [Next steps](#toc9__) \n", - " - [Work with your model documentation](#toc9_1__) \n", - " - [Customize the banking agent for your use case](#toc9_2__) \n", - " - [Discover more learning 
resources](#toc9_3__) \n", - "- [Upgrade ValidMind](#toc10__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "b58139db", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "7e30d36b", - "metadata": {}, - "source": [ - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
- ] - }, - { - "cell_type": "markdown", - "id": "1cba586e", - "metadata": {}, - "source": [ - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "5c46f003", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. 
Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "id": "11a2d7a5", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "fbab0edf", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.9 <= x <= 3.14</div>\n", - "\n", - "Let's begin by installing the ValidMind Library with large language model (LLM) support:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1982a118", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q \"validmind[llm]\" \"langgraph==0.3.21\"" - ] - }, - { - "cell_type": "markdown", - "id": "14578e26", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "83d47d89", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook.\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. 
Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "bb2c5670", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Agentic AI`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "98e475c1", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", - "<br></br>\n", - "Your organization administrators may need to add it to your template library:\n", - "<ul>\n", - "<li><a href=\"agentic_ai_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", - "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", - "</ul>\n", - "</div>" - ] - }, - { - "cell_type": "markdown", - "id": "0d1a13ca", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d6ccbefc", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "3605df4f", - "metadata": {}, - "source": [ - "<a id='toc2_2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dffdaa6f", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "d467c1d2", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Verify OpenAI API access\n", - "\n", - "Verify that a valid `OPENAI_API_KEY` is set in your `.env` file:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "22cc39cb", - "metadata": {}, - "outputs": [], - "source": [ - "# Load environment variables if using .env file\n", - "try:\n", - " from dotenv import load_dotenv\n", - " load_dotenv()\n", - "except ImportError:\n", - " print(\"dotenv not installed. Make sure OPENAI_API_KEY is set in your environment.\")" - ] - }, - { - "cell_type": "markdown", - "id": "b56c3f39", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Let's import all the necessary libraries to prepare for building our banking LangGraph agentic system:\n", - "\n", - "- **Standard libraries** for data handling and environment management.\n", - "- **pandas**, a Python library for data manipulation and analytics, as an alias. We'll also configure pandas to show all columns and all rows at full width for easier debugging and inspection.\n", - "- **LangChain** components for LLM integration and tool management.\n", - "- **LangGraph** for building stateful, multi-step agent workflows.\n", - "- **Banking tools** for specialized financial services as defined in [banking_tools.py](banking_tools.py)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2058d1ac", - "metadata": {}, - "outputs": [], - "source": [ - "from typing import TypedDict, Annotated, Sequence\n", - "\n", - "from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage\n", - "from langchain_openai import ChatOpenAI\n", - "from langgraph.checkpoint.memory import MemorySaver\n", - "from langgraph.graph import StateGraph, END, START\n", - "from langgraph.graph.message import add_messages\n", - "from langgraph.prebuilt import ToolNode\n", - "\n", - "# LOCAL IMPORTS FROM banking_tools.py\n", - "from banking_tools import AVAILABLE_TOOLS\n", - "\n", - "import pandas as pd\n", - "# Configure pandas to show all columns and all rows at full width\n", - "pd.set_option('display.max_columns', None)\n", - "pd.set_option('display.max_colwidth', None)\n", - "pd.set_option('display.width', None)\n", - "pd.set_option('display.max_rows', None)" - ] - }, - { - "cell_type": "markdown", - "id": "cc1d3265", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Building the LangGraph agent" - ] - }, - { - "cell_type": "markdown", - "id": "a3c421c4", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Test available banking tools\n", - "\n", - "We'll use the demo banking tools defined in `banking_tools.py` that provide use cases of financial services:\n", - "\n", - "- **Credit Risk Analyzer** - Loan applications and credit decisions\n", - "- **Customer Account Manager** - Account services and customer support\n", - "- **Fraud Detection System** - Security and fraud prevention" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1e0a120c", - "metadata": {}, - "outputs": [], - "source": [ - "print(f\"Available tools: {len(AVAILABLE_TOOLS)}\")\n", - "print(\"\\nTool Details:\")\n", - "for i, tool in enumerate(AVAILABLE_TOOLS, 1):\n", - " print(f\" - {tool.name}\")" - ] - }, - { - "cell_type": "markdown", - "id": "53906630", 
- "metadata": {}, - "source": [ - "Let's test each banking tool individually to ensure they're working correctly before integrating them into our agent:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dc0caff2", - "metadata": {}, - "outputs": [], - "source": [ - "# Test 1: Credit Risk Analyzer\n", - "print(\"TEST 1: Credit Risk Analyzer\")\n", - "print(\"-\" * 40)\n", - "try:\n", - " # Access the underlying function using .func\n", - " credit_result = AVAILABLE_TOOLS[0].func(\n", - " customer_income=75000,\n", - " customer_debt=1200,\n", - " credit_score=720,\n", - " loan_amount=50000,\n", - " loan_type=\"personal\"\n", - " )\n", - " print(credit_result)\n", - " print(\"Credit Risk Analyzer test PASSED\")\n", - "except Exception as e:\n", - " print(f\"Credit Risk Analyzer test FAILED: {e}\")\n", - "\n", - "print(\"\" + \"=\" * 60)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b6b227db", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "# Test 2: Customer Account Manager\n", - "print(\"TEST 2: Customer Account Manager\")\n", - "print(\"-\" * 40)\n", - "try:\n", - " # Test checking balance\n", - " account_result = AVAILABLE_TOOLS[1].func(\n", - " account_type=\"checking\",\n", - " customer_id=\"12345\",\n", - " action=\"check_balance\"\n", - " )\n", - " print(account_result)\n", - "\n", - " # Test getting account info\n", - " info_result = AVAILABLE_TOOLS[1].func(\n", - " account_type=\"all\",\n", - " customer_id=\"12345\", \n", - " action=\"get_info\"\n", - " )\n", - " print(info_result)\n", - " print(\"Customer Account Manager test PASSED\")\n", - "except Exception as e:\n", - " print(f\"Customer Account Manager test FAILED: {e}\")\n", - "\n", - "print(\"\" + \"=\" * 60)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a983b30d", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "# Test 3: Fraud Detection System\n", - "print(\"TEST 3: Fraud Detection System\")\n", - 
"print(\"-\" * 40)\n", - "try:\n", - " fraud_result = AVAILABLE_TOOLS[2].func(\n", - " transaction_id=\"TX123\",\n", - " customer_id=\"12345\",\n", - " transaction_amount=500.00,\n", - " transaction_type=\"withdrawal\",\n", - " location=\"Miami, FL\",\n", - " device_id=\"DEVICE_001\"\n", - " )\n", - " print(fraud_result)\n", - " print(\"Fraud Detection System test PASSED\")\n", - "except Exception as e:\n", - " print(f\"Fraud Detection System test FAILED: {e}\")\n", - "\n", - "print(\"\" + \"=\" * 60)" - ] - }, - { - "cell_type": "markdown", - "id": "1424baed", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Create LangGraph banking agent\n", - "\n", - "With our tools ready to go, we'll create our intelligent banking agent with LangGraph that automatically selects and uses the appropriate banking tool based on a user request." - ] - }, - { - "cell_type": "markdown", - "id": "3469d656", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Define system prompt\n", - "\n", - "We'll begin by defining our system prompt, which provides the LLM with context about its role as a banking assistant and guidance on when to use each available tool:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7971c427", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "# Enhanced banking system prompt with tool selection guidance\n", - "system_context = \"\"\"You are a professional banking AI assistant with access to specialized banking tools.\n", - " Analyze the user's banking request and directly use the most appropriate tools to help them.\n", - " \n", - " AVAILABLE BANKING TOOLS:\n", - " \n", - " credit_risk_analyzer - Analyze credit risk for loan applications and credit decisions\n", - " - Use for: loan applications, credit assessments, risk analysis, mortgage eligibility\n", - " - Examples: \"Analyze credit risk for $50k personal loan\", \"Assess mortgage eligibility for $300k home purchase\"\n", - 
" - Parameters: customer_income, customer_debt, credit_score, loan_amount, loan_type\n", - "\n", - " customer_account_manager - Manage customer accounts and provide banking services\n", - " - Use for: account information, transaction processing, product recommendations, customer service\n", - " - Examples: \"Check balance for checking account 12345\", \"Recommend products for customer with high balance\"\n", - " - Parameters: account_type, customer_id, action, amount, account_details\n", - "\n", - " fraud_detection_system - Analyze transactions for potential fraud and security risks\n", - " - Use for: transaction monitoring, fraud prevention, risk assessment, security alerts\n", - " - Examples: \"Analyze fraud risk for $500 ATM withdrawal in Miami\", \"Check security for $2000 online purchase\"\n", - " - Parameters: transaction_id, customer_id, transaction_amount, transaction_type, location, device_id\n", - "\n", - " BANKING INSTRUCTIONS:\n", - " - Analyze the user's banking request carefully and identify the primary need\n", - " - If they need credit analysis → use credit_risk_analyzer\n", - " - If they need financial calculations → use financial_calculator\n", - " - If they need account services → use customer_account_manager\n", - " - If they need security analysis → use fraud_detection_system\n", - " - Extract relevant parameters from the user's request\n", - " - Provide helpful, accurate banking responses based on tool outputs\n", - " - Always consider banking regulations, risk management, and best practices\n", - " - Be professional and thorough in your analysis\n", - "\n", - " Choose and use tools wisely to provide the most helpful banking assistance.\n", - " Describe the response in user friendly manner with details describing the tool output. 
\n", - " Provide the response in at least 500 words.\n", - " Generate a concise execution plan for the banking request.\n", - " \"\"\"" - ] - }, - { - "cell_type": "markdown", - "id": "b66c1ac4", - "metadata": {}, - "source": [ - "<a id='toc3_2_2__'></a>\n", - "\n", - "#### Initialize the LLM\n", - "\n", - "Let's initialize the LLM that will power our banking agent:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "866066e7", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the main LLM for banking responses\n", - "main_llm = ChatOpenAI(\n", - " model=\"gpt-5-mini\",\n", - " reasoning={\n", - " \"effort\": \"low\",\n", - " \"summary\": \"auto\"\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "8220afd6", - "metadata": {}, - "source": [ - "Then bind the available banking tools to the LLM, enabling the model to automatically recognize and invoke each tool when appropriate based on request input and the system prompt we defined above:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "906d8132", - "metadata": {}, - "outputs": [], - "source": [ - "# Bind all banking tools to the main LLM\n", - "llm_with_tools = main_llm.bind_tools(AVAILABLE_TOOLS)" - ] - }, - { - "cell_type": "markdown", - "id": "43f56651", - "metadata": {}, - "source": [ - "<a id='toc3_2_3__'></a>\n", - "\n", - "#### Define agent state structure\n", - "\n", - "The agent state defines the data structure that flows through the LangGraph workflow. 
It includes:\n", - "\n", - "- **messages** — The conversation history between the user and agent\n", - "- **user_input** — The current user request\n", - "- **session_id** — A unique identifier for the conversation session\n", - "- **context** — Additional context that can be passed between nodes\n", - "\n", - "Defining this state structure maintains the structure throughout the agent's execution and allows for multi-turn conversations with memory:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6b926ddf", - "metadata": {}, - "outputs": [], - "source": [ - "# Banking Agent State Definition\n", - "class BankingAgentState(TypedDict):\n", - " messages: Annotated[Sequence[BaseMessage], add_messages]\n", - " user_input: str\n", - " session_id: str\n", - " context: dict" - ] - }, - { - "cell_type": "markdown", - "id": "387ba780", - "metadata": {}, - "source": [ - "<a id='toc3_2_4__'></a>\n", - "\n", - "#### Create agent workflow function\n", - "\n", - "We'll build the LangGraph agent workflow with two main components:\n", - "\n", - "1. **LLM node** — Processes user requests, applies the system prompt, and decides whether to use tools.\n", - "2. **Tools node** — Executes the selected banking tools when the LLM determines they're needed.\n", - "\n", - "The workflow begins with the LLM analyzing the request, then uses tools if needed — or ends if the response is complete, and finally returns to the LLM to generate the final response." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2c9bf585", - "metadata": {}, - "outputs": [], - "source": [ - "def create_banking_langgraph_agent():\n", - " \"\"\"Create a comprehensive LangGraph banking agent with intelligent tool selection.\"\"\"\n", - " def llm_node(state: BankingAgentState) -> BankingAgentState:\n", - " \"\"\"Main LLM node that processes banking requests and selects appropriate tools.\"\"\"\n", - " messages = state[\"messages\"]\n", - " # Add system context to messages\n", - " enhanced_messages = [SystemMessage(content=system_context)] + list(messages)\n", - " # Get LLM response with tool selection\n", - " response = llm_with_tools.invoke(enhanced_messages)\n", - " return {\n", - " **state,\n", - " \"messages\": messages + [response]\n", - " }\n", - " \n", - " def should_continue(state: BankingAgentState) -> str:\n", - " \"\"\"Decide whether to use tools or end the conversation.\"\"\"\n", - " last_message = state[\"messages\"][-1]\n", - " # Check if the LLM wants to use tools\n", - " if hasattr(last_message, 'tool_calls') and last_message.tool_calls:\n", - " return \"tools\"\n", - " return END\n", - " \n", - " # Create the banking state graph\n", - " workflow = StateGraph(BankingAgentState)\n", - " # Add nodes\n", - " workflow.add_node(\"llm\", llm_node)\n", - " workflow.add_node(\"tools\", ToolNode(AVAILABLE_TOOLS))\n", - " # Simplified entry point - go directly to LLM\n", - " workflow.add_edge(START, \"llm\")\n", - " # From LLM, decide whether to use tools or end\n", - " workflow.add_conditional_edges(\n", - " \"llm\",\n", - " should_continue,\n", - " {\"tools\": \"tools\", END: END}\n", - " )\n", - " # Tool execution flows back to LLM for final response\n", - " workflow.add_edge(\"tools\", \"llm\")\n", - " # Set up memory\n", - " memory = MemorySaver()\n", - " # Compile the graph\n", - " agent = workflow.compile(checkpointer=memory)\n", - " return agent" - ] - }, - { - "cell_type": "markdown", - "id": "765242e9", - 
"metadata": {}, - "source": [ - "<a id='toc3_2_5__'></a>\n", - "\n", - "#### Instantiate the banking agent\n", - "\n", - "Now, we'll create an instance of the banking agent by calling the workflow creation function.\n", - "\n", - "This compiled agent is ready to process banking requests and will automatically select and use the appropriate tools based on user queries:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "455b8ee4", - "metadata": {}, - "outputs": [], - "source": [ - "# Create the banking intelligent agent\n", - "banking_agent = create_banking_langgraph_agent()\n", - "\n", - "print(\"Banking LangGraph Agent Created Successfully!\")\n", - "print(\"\\nFeatures:\")\n", - "print(\" - Intelligent banking tool selection\")\n", - "print(\" - Comprehensive banking system prompt\")\n", - "print(\" - Streamlined workflow: LLM → Tools → Response\")\n", - "print(\" - Automatic tool parameter extraction\")\n", - "print(\" - Professional banking assistance\")" - ] - }, - { - "cell_type": "markdown", - "id": "e00dac77", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Integrate agent with ValidMind\n", - "\n", - "To integrate our LangGraph banking agent with ValidMind, we need to create a wrapper function that ValidMind can use to invoke the agent and extract the necessary information for testing and documentation, allowing ValidMind to run validation tests on the agent's behavior, tool usage, and responses." 
- ] - }, - { - "cell_type": "markdown", - "id": "a124857e", - "metadata": {}, - "source": [ - "<a id='toc3_3_1__'></a>\n", - "\n", - "#### Import ValidMind components\n", - "\n", - "We'll start with importing the necessary ValidMind components for integrating our agent:\n", - "\n", - "- `Prompt` from `validmind.models` for handling prompt-based model inputs\n", - "- `extract_tool_calls_from_agent_output` and `_convert_to_tool_call_list` from `validmind.scorers.llm.deepeval` for extracting and converting tool calls from agent outputs" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9aeb8969", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.models import Prompt\n", - "from validmind.scorers.llm.deepeval import extract_tool_calls_from_agent_output, _convert_to_tool_call_list\n", - "from deepeval.tracing import observe, update_current_span\n", - "from deepeval.test_case import LLMTestCase" - ] - }, - { - "cell_type": "markdown", - "id": "ed72903f", - "metadata": {}, - "source": [ - "<a id='toc3_3_2__'></a>\n", - "\n", - "#### Create agent wrapper function\n", - "\n", - "We'll then create a wrapper function that:\n", - "\n", - "- Accepts input in ValidMind's expected format (with `input` and `session_id` fields)\n", - "- Invokes the banking agent with the proper state initialization\n", - "- Captures tool outputs and tool calls for evaluation\n", - "- Returns a standardized response format that includes the prediction, full output, tool messages, and tool call information\n", - "- Handles errors gracefully with fallback responses" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0e4d5a82", - "metadata": {}, - "outputs": [], - "source": [ - "@observe(type=\"agent\")\n", - "def banking_agent_fn(input):\n", - " \"\"\"\n", - " Invoke the banking agent with the given input.\n", - " \"\"\"\n", - " try:\n", - " # Initial state for banking agent\n", - " initial_state = {\n", - " \"user_input\": 
input[\"input\"],\n", - " \"messages\": [HumanMessage(content=input[\"input\"])],\n", - " \"session_id\": input[\"session_id\"],\n", - " \"context\": {}\n", - " }\n", - " session_config = {\"configurable\": {\"thread_id\": input[\"session_id\"]}}\n", - " result = banking_agent.invoke(initial_state, config=session_config)\n", - "\n", - " from utils import capture_tool_output_messages\n", - "\n", - " # Capture all tool outputs and metadata\n", - " captured_data = capture_tool_output_messages(result)\n", - " \n", - " # Access specific tool outputs, this will be used for RAGAS tests\n", - " tool_message = \"\"\n", - " for output in captured_data[\"tool_outputs\"]:\n", - " tool_message += output['content']\n", - " \n", - " tool_calls_found = []\n", - " messages = result['messages']\n", - " for message in messages:\n", - " if hasattr(message, 'tool_calls') and message.tool_calls:\n", - " for tool_call in message.tool_calls:\n", - " # Handle both dictionary and object formats\n", - " if isinstance(tool_call, dict):\n", - " tool_calls_found.append(tool_call['name'])\n", - " else:\n", - " # ToolCall object - use attribute access\n", - " tool_calls_found.append(tool_call.name)\n", - "\n", - " prediction_text = result['messages'][-1].content[0]['text']\n", - " tools_called_value = _convert_to_tool_call_list(extract_tool_calls_from_agent_output(result))\n", - " expected_tools_value = _convert_to_tool_call_list(input.get(\"expected_tools\", []))\n", - "\n", - " # Feed trace data for DeepEval metrics (e.g. 
PlanQuality) that require tracing\n", - " update_current_span(\n", - " test_case=LLMTestCase(\n", - " input=input[\"input\"],\n", - " actual_output=prediction_text,\n", - " tools_called=tools_called_value,\n", - " expected_tools=expected_tools_value\n", - " )\n", - " )\n", - "\n", - " return {\n", - " \"prediction\": prediction_text,\n", - " \"output\": result,\n", - " \"tool_messages\": [tool_message],\n", - " # \"tool_calls\": tool_calls_found,\n", - " \"tool_called\": tools_called_value\n", - " }\n", - " except Exception as e:\n", - " # Return a fallback response if the agent fails\n", - " error_message = f\"\"\"I apologize, but I encountered an error while processing your banking request: {str(e)}.\n", - " Please try rephrasing your question or contact support if the issue persists.\"\"\"\n", - " return {\n", - " \"prediction\": error_message, \n", - " \"output\": {\n", - " \"messages\": [HumanMessage(content=input[\"input\"]), SystemMessage(content=error_message)],\n", - " \"error\": str(e)\n", - " }\n", - " }" - ] - }, - { - "cell_type": "markdown", - "id": "fda87401", - "metadata": {}, - "source": [ - "<a id='toc3_3_3__'></a>\n", - "\n", - "#### Initialize the ValidMind model\n", - "\n", - "We'll also need to register the banking agent as a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with 
[`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model) which:\n", - "\n", - "- Associates the wrapper function with the model for prediction\n", - "- Stores the system prompt template for documentation\n", - "- Provides a unique `input_id` for tracking and identification\n", - "- Enables the agent to be used with ValidMind's testing and documentation features" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "60a2ce7a", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the agent as a model\n", - "vm_banking_model = vm.init_model(\n", - " input_id=\"banking_agent_model\",\n", - " predict_fn=banking_agent_fn,\n", - " prompt=Prompt(template=system_context)\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "949bcf53", - "metadata": {}, - "source": [ - "<a id='toc3_3_4__'></a>\n", - "\n", - "#### Store the agent reference\n", - "\n", - "We'll also store a reference to the original banking agent object in the ValidMind model. This allows us to access the full agent functionality directly if needed, while still maintaining the wrapper function interface for ValidMind's testing framework." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2c653471", - "metadata": {}, - "outputs": [], - "source": [ - "# Add the banking agent to the vm model\n", - "vm_banking_model.model = banking_agent" - ] - }, - { - "cell_type": "markdown", - "id": "d8d0c1c1", - "metadata": {}, - "source": [ - "<a id='toc3_3_5__'></a>\n", - "\n", - "#### Verify integration\n", - "\n", - "Let's confirm that the banking agent has been successfully integrated with ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8e101b0f", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"Banking Agent Successfully Integrated with ValidMind!\")\n", - "print(f\"Model ID: {vm_banking_model.input_id}\")" - ] - }, - { - "cell_type": "markdown", - "id": "2a5f874e", - "metadata": {}, - "source": [ - "<a id='toc3_4__'></a>\n", - "\n", - "### Validate the system prompt\n", - "\n", - "Let's get an initial sense of how well our defined system prompt meets a few best practices for prompt engineering by running a few tests — we'll run evaluation tests later on our agent's performance.\n", - "\n", - "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. 
Passing in our agentic model as an input, the tests below rate the prompt on a scale of 1-10 against the following criteria:\n", - "\n", - "- **[Clarity](https://docs.validmind.ai/tests/prompt_validation/Clarity.html)** — How clearly the prompt states the task.\n", - "- **[Conciseness](https://docs.validmind.ai/tests/prompt_validation/Conciseness.html)** — How succinctly the prompt states the task.\n", - "- **[Delimitation](https://docs.validmind.ai/tests/prompt_validation/Delimitation.html)** — When using complex prompts containing examples, contextual information, or other elements, is the prompt formatted in such a way that each element is clearly separated?\n", - "- **[NegativeInstruction](https://docs.validmind.ai/tests/prompt_validation/NegativeInstruction.html)** — Whether the prompt contains negative instructions.\n", - "- **[Specificity](https://docs.validmind.ai/tests/prompt_validation/NegativeInstruction.html)** — How specific the prompt defines the task." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f52dceb1", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.prompt_validation.Clarity\",\n", - " inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "70d52333", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.prompt_validation.Conciseness\",\n", - " inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5aa89976", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.prompt_validation.Delimitation\",\n", - " inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8630197e", - "metadata": {}, - "outputs": [], - "source": [ - 
"vm.tests.run_test(\n", - " \"validmind.prompt_validation.NegativeInstruction\",\n", - " inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bba99915", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.prompt_validation.Specificity\",\n", - " inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "51d61141", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Initializing the ValidMind dataset\n", - "\n", - "After validation our system prompt, let's import our sample dataset ([banking_test_dataset.py](banking_test_dataset.py)), which we'll use in the next section to evaluate our agent's performance across different banking scenarios:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0c70ca2c", - "metadata": {}, - "outputs": [], - "source": [ - "from banking_test_dataset import banking_test_dataset" - ] - }, - { - "cell_type": "markdown", - "id": "442ab66d", - "metadata": {}, - "source": [ - "The next step is to connect your data with a ValidMind `Dataset` object. **This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. 
For this example, we'll pass in the following arguments:\n", - "\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`text_column`** — The name of the column containing the text input data.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a7e9d158", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset = vm.init_dataset(\n", - " input_id=\"banking_test_dataset\",\n", - " dataset=banking_test_dataset,\n", - " text_column=\"input\",\n", - " target_column=\"possible_outputs\",\n", - ")\n", - "\n", - "print(\"Banking Test Dataset Initialized in ValidMind!\")\n", - "print(f\"Dataset ID: {vm_test_dataset.input_id}\")\n", - "print(f\"Dataset columns: {vm_test_dataset._df.columns}\")\n", - "vm_test_dataset._df" - ] - }, - { - "cell_type": "markdown", - "id": "7b01021c", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "Now that both the model object and the datasets have been registered, we'll assign predictions to capture the banking agent's responses for evaluation:\n", - "\n", - "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", - "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", - "\n", - "If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1d462663", - "metadata": {}, - "outputs": [], - "source": [ - 
"vm_test_dataset.assign_predictions(vm_banking_model)\n", - "\n", - "print(\"Banking Agent Predictions Generated Successfully!\")\n", - "print(f\"Predictions assigned to {len(vm_test_dataset._df)} test cases\")\n", - "vm_test_dataset._df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "4e56f556", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Running accuracy tests\n", - "\n", - "Using [`@vm.test`](https://docs.validmind.ai/validmind/validmind.html#test), let's implement some reusable custom *inline tests* to assess the accuracy of our banking agent:\n", - "\n", - "- An inline test refers to a test written and executed within the same environment as the code being tested — in this case, right in this Jupyter Notebook — without requiring a separate test file or framework.\n", - "- You'll note that the custom test functions are just regular Python functions that can include and require any Python library as you see fit." - ] - }, - { - "cell_type": "markdown", - "id": "1bce9258", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Response accuracy test\n", - "\n", - "We'll create a custom test that evaluates the banking agent's ability to provide accurate responses by:\n", - "\n", - "- Testing against a dataset of predefined banking questions and expected answers.\n", - "- Checking if responses contain expected keywords and banking terminology.\n", - "- Providing detailed test results including pass/fail status.\n", - "- Helping identify any gaps in the agent's banking knowledge or response quality." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "90232066", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "@vm.test(\"my_custom_tests.banking_accuracy_test\")\n", - "def banking_accuracy_test(model, dataset, list_of_columns):\n", - " \"\"\"\n", - " The Banking Accuracy Test evaluates whether the agent’s responses include \n", - " critical domain-specific keywords and phrases that indicate accurate, compliant,\n", - " and contextually appropriate banking information. This test ensures that the agent\n", - " provides responses containing the expected banking terminology, risk classifications,\n", - " account details, or other domain-relevant information required for regulatory compliance,\n", - " customer safety, and operational accuracy.\n", - " \"\"\"\n", - " df = dataset._df\n", - " \n", - " # Pre-compute responses for all tests\n", - " y_true = dataset.y.tolist()\n", - " y_pred = dataset.y_pred(model).tolist()\n", - "\n", - " # Vectorized test results\n", - " test_results = []\n", - " for response, keywords in zip(y_pred, y_true):\n", - " # Convert keywords to list if not already a list\n", - " if not isinstance(keywords, list):\n", - " keywords = [keywords]\n", - " test_results.append(any(str(keyword).lower() in str(response).lower() for keyword in keywords))\n", - " \n", - " results = pd.DataFrame()\n", - " column_names = [col + \"_details\" for col in list_of_columns]\n", - " results[column_names] = df[list_of_columns]\n", - " results[\"actual\"] = y_pred\n", - " results[\"expected\"] = y_true\n", - " results[\"passed\"] = test_results\n", - " results[\"error\"] = None if test_results else f'Response did not contain any expected keywords: {y_true}'\n", - " \n", - " return results" - ] - }, - { - "cell_type": "markdown", - "id": "2a7f71f8", - "metadata": {}, - "source": [ - "Now that we've defined our custom response accuracy test, we can run the test using the same `run_test()` function we used earlier to validate the system 
prompt using our sample dataset and agentic model as input, and log the test results to the ValidMind Platform with the [`log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#log):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e68884d5", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"my_custom_tests.banking_accuracy_test\",\n", - " inputs={\n", - " \"dataset\": vm_test_dataset,\n", - " \"model\": vm_banking_model\n", - " },\n", - " params={\n", - " \"list_of_columns\": [\"input\"]\n", - " }\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "94a717e7", - "metadata": {}, - "source": [ - "Let's review the first five rows of the test dataset to inspect the results to see how well the banking agent performed. Each column in the output serves a specific purpose in evaluating agent performance:\n", - "\n", - "| Column header | Description | Importance |\n", - "|--------------|-------------|------------|\n", - "| **`input`** | Original user query or request | Essential for understanding the context of each test case and tracing which inputs led to specific agent behaviors. |\n", - "| **`expected_tools`** | Banking tools that should be invoked for this request | Enables validation of correct tool selection, which is critical for agentic AI systems where choosing the right tool is a key success metric. |\n", - "| **`expected_output`** | Expected output or keywords that should appear in the response | Defines the success criteria for each test case, enabling objective evaluation of whether the agent produced the correct result. |\n", - "| **`session_id`** | Unique identifier for each test session | Allows tracking and correlation of related test runs, debugging specific sessions, and maintaining audit trails. 
|\n", - "| **`category`** | Classification of the request type | Helps organize test results by domain and identify performance patterns across different banking use cases. |\n", - "| **`banking_agent_model_output`** | Complete agent response including all messages and reasoning | Allows you to examine the full output to assess response quality, completeness, and correctness beyond just keyword matching. |\n", - "| **`banking_agent_model_tool_messages`** | Messages exchanged with the banking tools | Critical for understanding how the agent interacted with tools, what parameters were passed, and what tool outputs were received. |\n", - "| **`banking_agent_model_tool_called`** | Specific tool that was invoked | Enables validation that the agent selected the correct tool for each request, which is fundamental to agentic AI validation. |\n", - "| **`possible_outputs`** | Alternative valid outputs or keywords that could appear in the response | Provides flexibility in evaluation by accounting for multiple acceptable response formats or variations. |" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "78f7edb1", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.df.head(5)" - ] - }, - { - "cell_type": "markdown", - "id": "1cb3e8bd", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Tool selection accuracy test\n", - "\n", - "We'll also create a custom test that evaluates the banking agent's ability to select the correct tools for different requests by:\n", - "\n", - "- Testing against a dataset of predefined banking queries with expected tool selections.\n", - "- Comparing the tools actually invoked by the agent against the expected tools for each request.\n", - "- Providing quantitative accuracy scores that measure the proportion of expected tools correctly selected.\n", - "- Helping identify gaps in the agent's understanding of user needs and tool selection logic." 
- ] - }, - { - "cell_type": "markdown", - "id": "69263d62", - "metadata": {}, - "source": [ - "First, we'll define a helper function that extracts tool calls from the agent's messages and compares them against the expected tools. This function handles different message formats (dictionary or object) and calculates accuracy scores:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e68798be", - "metadata": {}, - "outputs": [], - "source": [ - "def validate_tool_calls_simple(messages, expected_tools):\n", - " \"\"\"Simple validation of tool calls without RAGAS dependency issues.\"\"\"\n", - " \n", - " tool_calls_found = []\n", - " \n", - " for message in messages:\n", - " if hasattr(message, 'tool_calls') and message.tool_calls:\n", - " for tool_call in message.tool_calls:\n", - " # Handle both dictionary and object formats\n", - " if isinstance(tool_call, dict):\n", - " tool_calls_found.append(tool_call['name'])\n", - " else:\n", - " # ToolCall object - use attribute access\n", - " tool_calls_found.append(tool_call.name)\n", - " \n", - " # Check if expected tools were called\n", - " accuracy = 0.0\n", - " matches = 0\n", - " if expected_tools:\n", - " matches = sum(1 for tool in expected_tools if tool in tool_calls_found)\n", - " accuracy = matches / len(expected_tools)\n", - " \n", - " return {\n", - " 'expected_tools': expected_tools,\n", - " 'found_tools': tool_calls_found,\n", - " 'matches': matches,\n", - " 'total_expected': len(expected_tools) if expected_tools else 0,\n", - " 'accuracy': accuracy,\n", - " }" - ] - }, - { - "cell_type": "markdown", - "id": "8f494fd3", - "metadata": {}, - "source": [ - "Now we'll define the main test function that uses the helper function to evaluate tool selection accuracy across all test cases in the dataset:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "604d7313", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.BankingToolCallAccuracy\")\n", - 
"def BankingToolCallAccuracy(dataset, agent_output_column, expected_tools_column):\n", - " \"\"\"\n", - " Evaluates the tool selection accuracy of a LangGraph-powered banking agent.\n", - "\n", - " This test measures whether the agent correctly identifies and invokes the required banking tools\n", - " for each user query scenario.\n", - " For each case, the outputs generated by the agent (including its tool calls) are compared against an\n", - " expected set of tools. The test considers both coverage and exactness: it computes the proportion of\n", - " expected tools correctly called by the agent for each instance.\n", - "\n", - " Parameters:\n", - " dataset (VMDataset): The dataset containing user queries, agent outputs, and ground-truth tool expectations.\n", - " agent_output_column (str): Dataset column name containing agent outputs (should include tool call details in 'messages').\n", - " expected_tools_column (str): Dataset column specifying the true expected tools (as lists).\n", - "\n", - " Returns:\n", - " List[dict]: Per-row dictionaries with details: expected tools, found tools, match count, total expected, and accuracy score.\n", - "\n", - " Purpose:\n", - " Provides diagnostic evidence of the banking agent's core reasoning ability—specifically, its capacity to\n", - " interpret user needs and select the correct banking actions. Useful for diagnosing gaps in tool coverage,\n", - " misclassifications, or breakdowns in agent logic.\n", - "\n", - " Interpretation:\n", - " - An accuracy of 1.0 signals perfect tool selection for that example.\n", - " - Lower scores may indicate partial or complete failures to invoke required tools.\n", - " - Review 'found_tools' vs. 
'expected_tools' to understand the source of discrepancies.\n", - "\n", - " Strengths:\n", - " - Directly tests a core capability of compositional tool-use agents.\n", - " - Framework-agnostic; robust to tool call output format (object or dict).\n", - " - Supports batch validation and result logging for systematic documentation.\n", - "\n", - " Limitations:\n", - " - Does not penalize extra, unnecessary tool calls.\n", - " - Does not assess result quality—only correct invocation.\n", - "\n", - " \"\"\"\n", - " df = dataset._df\n", - " \n", - " results = []\n", - " for i, row in df.iterrows():\n", - " result = validate_tool_calls_simple(row[agent_output_column]['messages'], row[expected_tools_column])\n", - " results.append(result)\n", - " \n", - " return results" - ] - }, - { - "cell_type": "markdown", - "id": "57ab606b", - "metadata": {}, - "source": [ - "Finally, we can call our function with `run_test()` and log the test results to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dd14115e", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"my_custom_tests.BankingToolCallAccuracy\",\n", - " inputs={\n", - " \"dataset\": vm_test_dataset,\n", - " },\n", - " params={\n", - " \"agent_output_column\": \"banking_agent_model_output\",\n", - " \"expected_tools_column\": \"expected_tools\"\n", - " }\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "be8d5270", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Assigning AI evaluation metric scores\n", - "\n", - "*AI agent evaluation metrics* are specialized measurements designed to assess how well autonomous LLM-based agents reason, plan, select and execute tools, and ultimately complete user tasks by analyzing the *full execution trace* — including reasoning steps, tool calls, intermediate decisions, and outcomes, rather than just single input–output pairs. 
These metrics are essential because agent failures often occur in ways traditional LLM metrics miss — for example, choosing the right tool with wrong arguments, creating a good plan but not following it, or completing a task inefficiently.\n", - "\n", - "In this section, we'll evaluate our banking agent's outputs and add scoring to our sample dataset against metrics defined in [DeepEval’s AI agent evaluation framework](https://deepeval.com/guides/guides-ai-agent-evaluation-metrics) which breaks down AI agent evaluation into three layers with corresponding subcategories: **reasoning**, **action**, and **execution**.\n", - "\n", - "Together, these three metrics enable granular diagnosis of agent behavior, help pinpoint where failures occur (reasoning, action, or execution), and support both development benchmarking and production monitoring." - ] - }, - { - "cell_type": "markdown", - "id": "25828bef", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Identify relevant DeepEval scorers\n", - "\n", - "*Scorers* are evaluation metrics that analyze model outputs and store their results in the dataset:\n", - "\n", - "- Each scorer adds a new column to the dataset with format: `{scorer_name}_{metric_name}`\n", - "- The column contains the numeric score (typically `0`-`1`) for each example\n", - "- Multiple scorers can be run on the same dataset, each adding their own column\n", - "- Scores are persisted in the dataset for later analysis and visualization\n", - "- Common scorer patterns include:\n", - " - Model performance metrics (accuracy, F1, etc.)\n", - " - Output quality metrics (relevance, faithfulness)\n", - " - Task-specific metrics (completion, correctness)\n", - "\n", - "Use `list_scorers()` from [`validmind.scorers`](https://docs.validmind.ai/validmind/validmind/tests.html#scorer) to discover all available scoring methods and their IDs that can be used with `assign_scores()`. 
We'll filter these results to return only DeepEval scorers for our desired three metrics in a formatted table with descriptions:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "730c70ec", - "metadata": {}, - "outputs": [], - "source": [ - "# Load all DeepEval scorers\n", - "llm_scorers_dict = vm.tests.load._load_tests([s for s in vm.scorer.list_scorers() if \"deepeval\" in s.lower()])\n", - "\n", - "# Categorize scorers by metric layer\n", - "reasoning_scorers = {}\n", - "action_scorers = {}\n", - "execution_scorers = {}\n", - "\n", - "for scorer_id, scorer_func in llm_scorers_dict.items():\n", - " tags = getattr(scorer_func, \"__tags__\", [])\n", - " scorer_name = scorer_id.split(\".\")[-1]\n", - "\n", - " if \"reasoning_layer\" in tags:\n", - " reasoning_scorers[scorer_id] = scorer_func\n", - " elif \"action_layer\" in tags:\n", - " action_scorers[scorer_id] = scorer_func\n", - " elif \"TaskCompletion\" in scorer_name:\n", - " execution_scorers[scorer_id] = scorer_func\n", - "\n", - "# Display scorers by category\n", - "print(\"=\" * 80)\n", - "print(\"REASONING LAYER\")\n", - "print(\"=\" * 80)\n", - "if reasoning_scorers:\n", - " reasoning_df = vm.tests.load._pretty_list_tests(reasoning_scorers, truncate=True)\n", - " display(reasoning_df)\n", - "else:\n", - " print(\"No reasoning layer scorers found.\")\n", - "\n", - "print(\"\\n\" + \"=\" * 80)\n", - "print(\"ACTION LAYER\")\n", - "print(\"=\" * 80)\n", - "if action_scorers:\n", - " action_df = vm.tests.load._pretty_list_tests(action_scorers, truncate=True)\n", - " display(action_df)\n", - "else:\n", - " print(\"No action layer scorers found.\")\n", - "\n", - "print(\"\\n\" + \"=\" * 80)\n", - "print(\"EXECUTION LAYER\")\n", - "print(\"=\" * 80)\n", - "if execution_scorers:\n", - " execution_df = vm.tests.load._pretty_list_tests(execution_scorers, truncate=True)\n", - " display(execution_df)\n", - "else:\n", - " print(\"No execution layer scorers found.\")" - ] - }, - { - "cell_type": 
"markdown", - "id": "e5fb739b", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Assign reasoning scores\n", - "\n", - "*Reasoning* evaluates planning and strategy generation:\n", - "\n", - "- **Plan quality** – How logical, complete, and efficient the agent’s plan is.\n", - "- **Plan adherence** – Whether the agent follows its own plan during execution." - ] - }, - { - "cell_type": "markdown", - "id": "fde94d01", - "metadata": {}, - "source": [ - "<a id='toc6_2_1__'></a>\n", - "\n", - "#### Plan quality score\n", - "\n", - "Let's measure how well our banking agent generates a plan before acting. A high score means the plan is logical, complete, and efficient." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "52f362ba", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.assign_scores(\n", - " metrics = \"validmind.scorers.llm.deepeval.PlanQuality\",\n", - " model = vm_banking_model,\n", - " input_column = \"input\",\n", - ")\n", - "vm_test_dataset._df[['banking_agent_model_PlanQuality_score','banking_agent_model_PlanQuality_reason']]" - ] - }, - { - "cell_type": "markdown", - "id": "d631fd12", - "metadata": {}, - "source": [ - "<a id='toc6_2_2__'></a>\n", - "\n", - "#### Plan adherence score\n", - "\n", - "Let's check whether our banking agent follows the plan it created. Deviations lower this score and indicate gaps between reasoning and execution." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "4124a7c2", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.assign_scores(\n", - " metrics = \"validmind.scorers.llm.deepeval.PlanAdherence\",\n", - " input_column = \"input\",\n", - " model = vm_banking_model,\n", - ")\n", - "vm_test_dataset._df[['banking_agent_model_PlanAdherence_score','banking_agent_model_PlanAdherence_reason']]" - ] - }, - { - "cell_type": "markdown", - "id": "82e5e6f1", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Assign action scores\n", - "\n", - "*Action* assesses tool usage and argument generation:\n", - "\n", - "- **Tool correctness** – Whether the agent selects and calls the right tools.\n", - "- **Argument correctness** – Whether the agent generates correct tool arguments." - ] - }, - { - "cell_type": "markdown", - "id": "e641c9f2", - "metadata": {}, - "source": [ - "<a id='toc6_3_1__'></a>\n", - "\n", - "#### Tool correctness score\n", - "\n", - "Let's evaluate if our banking agent selects the appropriate tool for the task. Choosing the wrong tool reduces performance even if reasoning was correct." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8d2e8a25", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.assign_scores(\n", - " metrics = \"validmind.scorers.llm.deepeval.ToolCorrectness\",\n", - " input_column = \"input\",\n", - " model = vm_banking_model,\n", - " expected_tools_called_column = \"expected_tools\",\n", - " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", - ")\n", - "vm_test_dataset._df[['banking_agent_model_ToolCorrectness_score','banking_agent_model_ToolCorrectness_reason']]" - ] - }, - { - "cell_type": "markdown", - "id": "dd758ba5", - "metadata": {}, - "source": [ - "<a id='toc6_3_2__'></a>\n", - "\n", - "#### Argument correctness score\n", - "\n", - "Let's assess whether our banking agent provides correct inputs or arguments to the selected tool. Incorrect arguments can lead to failed or unexpected results." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "04f90489", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.assign_scores(\n", - " metrics = \"validmind.scorers.llm.deepeval.ArgumentCorrectness\",\n", - " input_column = \"input\",\n", - " model = vm_banking_model,\n", - " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", - ")\n", - "vm_test_dataset._df[['banking_agent_model_ArgumentCorrectness_score','banking_agent_model_ArgumentCorrectness_reason']]" - ] - }, - { - "cell_type": "markdown", - "id": "1aeec2f5", - "metadata": {}, - "source": [ - "<a id='toc6_4__'></a>\n", - "\n", - "### Assign execution score\n", - "\n", - "*Execution* measures end-to-end performance:\n", - "\n", - "- **Task completion** – Whether the agent successfully completes the intended task."
- ] - }, - { - "cell_type": "markdown", - "id": "eb9ab8de", - "metadata": {}, - "source": [ - "<a id='toc6_4_1__'></a>\n", - "\n", - "#### Task completion score\n", - "\n", - "Let's evaluate whether our banking agent successfully completes the requested tasks. Incomplete task execution can lead to user dissatisfaction and failed banking operations." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "05024f1f", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.assign_scores(\n", - " metrics = \"validmind.scorers.llm.deepeval.TaskCompletion\",\n", - " input_column = \"input\",\n", - " model = vm_banking_model,\n", - " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", - ")\n", - "vm_test_dataset._df[['banking_agent_model_TaskCompletion_score','banking_agent_model_TaskCompletion_reason']]" - ] - }, - { - "cell_type": "markdown", - "id": "b577c282", - "metadata": {}, - "source": [ - "As you recall from the beginning of this section, when we run scorers through `assign_scores()`, the return values are automatically processed and added as new columns with the format `{scorer_name}_{metric_name}`. 
Note that the task completion scorer has added a new column `TaskCompletion_score` to our dataset.\n", - "\n", - "We'll use this column to visualize the distribution of task completion scores across our test cases through the [BoxPlot test](https://docs.validmind.ai/validmind/validmind/tests/plots/BoxPlot.html#boxplot):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7f6d08ca", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.plots.BoxPlot\",\n", - " inputs={\"dataset\": vm_test_dataset},\n", - " params={\n", - " \"columns\": \"banking_agent_model_TaskCompletion_score\",\n", - " \"title\": \"Distribution of Task Completion Scores\",\n", - " \"ylabel\": \"Score\",\n", - " \"figsize\": (8, 6)\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "30d9ec62", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Running RAGAS tests\n", - "\n", - "Next, let's run some out-of-the-box *Retrieval-Augmented Generation Assessment* (RAGAS) tests available in the ValidMind Library. RAGAS provides specialized metrics for evaluating retrieval-augmented generation systems and conversational AI agents. These metrics analyze different aspects of agent performance by assessing how well systems integrate retrieved information with generated responses.\n", - "\n", - "Our banking agent uses tools to retrieve information and generates responses based on that context, making it similar to a RAG system. RAGAS metrics help evaluate the quality of this integration by analyzing the relationship between retrieved tool outputs, user queries, and generated responses.\n", - "\n", - "These tests provide insights into how well our banking agent integrates tool usage with conversational abilities, ensuring it provides accurate, relevant, and helpful responses to banking users while maintaining fidelity to retrieved information." 
- ] - }, - { - "cell_type": "markdown", - "id": "8288f6c3", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Identify relevant RAGAS tests\n", - "\n", - "Let's explore some of ValidMind's available tests. Using ValidMind’s repository of tests streamlines your development testing, and helps you ensure that your records are being documented and evaluated appropriately.\n", - "\n", - "You can pass `tasks` and `tags` as parameters to the [`vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to filter the tests based on the tags and task types:\n", - "\n", - "- **`tasks`** represent the kind of modeling task associated with a test. Here we'll focus on `text_qa` tasks.\n", - "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `ragas` tag.\n", - "\n", - "We'll then run three of these tests returned as examples below." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0701f5a9", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(task=\"text_qa\", tags=[\"ragas\"])" - ] - }, - { - "cell_type": "markdown", - "id": "2ce24ba0", - "metadata": {}, - "source": [ - "<a id='toc7_1_1__'></a>\n", - "\n", - "#### Faithfulness\n", - "\n", - "Let's evaluate whether the banking agent's responses accurately reflect the information retrieved from tools. Unfaithful responses can misreport credit analysis, financial calculations, and compliance results—undermining user trust in the banking agent." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "92044533", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.ragas.Faithfulness\",\n", - " inputs={\"dataset\": vm_test_dataset},\n", - " param_grid={\n", - " \"user_input_column\": [\"input\"],\n", - " \"response_column\": [\"banking_agent_model_prediction\"],\n", - " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "4d1fcfcd", - "metadata": {}, - "source": [ - "<a id='toc7_1_2__'></a>\n", - "\n", - "#### Response Relevancy\n", - "\n", - "Let's evaluate whether the banking agent's answers address the user's original question or request. Irrelevant or off-topic responses can frustrate users and fail to deliver the banking information they need." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d7483bc3", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.ragas.ResponseRelevancy\",\n", - " inputs={\"dataset\": vm_test_dataset},\n", - " params={\n", - " \"user_input_column\": \"input\",\n", - " \"response_column\": \"banking_agent_model_prediction\",\n", - " \"retrieved_contexts_column\": \"banking_agent_model_tool_messages\",\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "38c1dfb5", - "metadata": {}, - "source": [ - "<a id='toc7_1_3__'></a>\n", - "\n", - "#### Context Recall\n", - "\n", - "Let's evaluate how well the banking agent uses the information retrieved from tools when generating its responses. Poor context recall can lead to incomplete or underinformed answers even when the right tools were selected." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e5dc00ce", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.ragas.ContextRecall\",\n", - " inputs={\"dataset\": vm_test_dataset},\n", - " param_grid={\n", - " \"user_input_column\": [\"input\"],\n", - " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", - " \"reference_column\": [\"banking_agent_model_prediction\"],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "95e1e16a", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Running safety tests\n", - "\n", - "Finally, let's run some out-of-the-box *safety* tests available in the ValidMind Library. Safety tests provide specialized metrics for evaluating whether AI agents operate reliably and securely. These metrics analyze different aspects of agent behavior by assessing adherence to safety guidelines, consistency of outputs, and resistance to harmful or inappropriate requests.\n", - "\n", - "Our banking agent handles sensitive financial information and user requests, making safety and reliability essential. Safety tests help evaluate whether the agent maintains appropriate boundaries, responds consistently and correctly to inputs, and avoids generating harmful, biased, or unprofessional content.\n", - "\n", - "These tests provide insights into how well our banking agent upholds standards of fairness and professionalism, ensuring it operates reliably and securely for banking users." - ] - }, - { - "cell_type": "markdown", - "id": "e0972afa", - "metadata": {}, - "source": [ - "<a id='toc8_1_1__'></a>\n", - "\n", - "#### AspectCritic\n", - "\n", - "Let's evaluate our banking agent's responses across multiple quality dimensions — conciseness, coherence, correctness, harmfulness, and maliciousness. 
Weak performance on these dimensions can degrade user experience, fall short of professional banking standards, or introduce safety risks. \n", - "\n", - "We'll use the `AspectCritic` we identified earlier:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "148daa2b", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.ragas.AspectCritic\",\n", - " inputs={\"dataset\": vm_test_dataset},\n", - " param_grid={\n", - " \"user_input_column\": [\"input\"],\n", - " \"response_column\": [\"banking_agent_model_prediction\"],\n", - " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "16f29c8d", - "metadata": {}, - "source": [ - "<a id='toc8_1_2__'></a>\n", - "\n", - "#### Bias\n", - "\n", - "Let's evaluate whether our banking agent's prompts contain unintended biases that could affect banking decisions. Biased prompts can lead to unfair or discriminatory outcomes — undermining customer trust and exposing the institution to compliance risk.\n", - "\n", - "We'll first use `list_tests()` again to filter for tests relating to `prompt_validation`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "74eba86c", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(filter=\"prompt_validation\")" - ] - }, - { - "cell_type": "markdown", - "id": "e9413803", - "metadata": {}, - "source": [ - "And then run the identified `Bias` test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "062cf8e7", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.prompt_validation.Bias\",\n", - " inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "8f3f2dbe", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can 
look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." - ] - }, - { - "cell_type": "markdown", - "id": "8716165d", - "metadata": {}, - "source": [ - "<a id='toc9_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - " What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "3. Click into any section related to the tests we ran in this notebook, for example: **4.3. Prompt Evaluation** to review the results of the tests we logged." - ] - }, - { - "cell_type": "markdown", - "id": "7c4a78ce", - "metadata": {}, - "source": [ - "<a id='toc9_2__'></a>\n", - "\n", - "### Customize the banking agent for your use case\n", - "\n", - "You've now built an agentic AI system designed for banking use cases that supports compliance with supervisory guidance such as SR 11-7 and SS1/23, covering credit and fraud risk assessment for both retail and commercial banking. 
Extend this example agent to real-world banking scenarios and production deployment by:\n", - "\n", - "- Adapting the banking tools to your organization's specific requirements\n", - "- Adding more banking scenarios and edge cases to your test set\n", - "- Connecting the agent to your banking systems and databases\n", - "- Implementing additional banking-specific tools and workflows" - ] - }, - { - "cell_type": "markdown", - "id": "7f9385d3", - "metadata": {}, - "source": [ - "<a id='toc9_3__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "Learn more about the ValidMind Library tools we used in this notebook:\n", - "\n", - "- [Custom prompts](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/customize_test_result_descriptions.html)\n", - "- [Custom tests](https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html)\n", - "- [ValidMind scorers](https://docs.validmind.ai/notebooks/how_to/scoring/assign_scores_complete_tutorial.html)\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "id": "fdd5c0db", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9733adff", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "829429fd", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "55339760", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-b9e82bcf4e364c4f8e5ae4bb0e4b2865", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-1QuffXMV-py3.11", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.9" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document an agentic AI system\n", + "\n", + "Build and document an agentic AI system with the ValidMind Library. Construct a LangGraph-based banking agent, assign AI evaluation metric scores to your agent, and run accuracy, RAGAS, and safety tests, then log those test results to the ValidMind Platform.\n", + "\n", + "An _AI agent_ is an autonomous system that interprets inputs, selects from available tools or actions, and executes multi-step behaviors to achieve defined goals. 
In this notebook, the agent acts as a banking assistant that analyzes user requests and automatically selects and invokes the appropriate specialized banking tool to deliver accurate, compliant, and actionable responses.\n", + "\n", + "- This agent enables financial institutions to automate complex banking workflows where different customer requests require different specialized tools and knowledge bases.\n", + "- Effective validation of agentic AI systems reduces the risks of agents misinterpreting inputs, failing to extract required parameters, or producing incorrect assessments or actions — such as selecting the wrong tool.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For the LLM components in this notebook to function properly, you'll need access to OpenAI.</b></span>\n", + "<br></br>\n", + "Before you continue, ensure that a valid <code>OPENAI_API_KEY</code> is set in your <code>.env</code> file.</div>" + ], + "id": "eee6b64c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_2_4__) \n", + " - [Verify OpenAI API access](#toc2_3__) \n", + " - [Initialize the Python environment](#toc2_4__) \n", + "- [Building the LangGraph agent](#toc3__) \n", + " - [Test available banking 
tools](#toc3_1__) \n", + " - [Create LangGraph banking agent](#toc3_2__) \n", + " - [Define system prompt](#toc3_2_1__) \n", + " - [Initialize the LLM](#toc3_2_2__) \n", + " - [Define agent state structure](#toc3_2_3__) \n", + " - [Create agent workflow function](#toc3_2_4__) \n", + " - [Instantiate the banking agent](#toc3_2_5__) \n", + " - [Integrate agent with ValidMind](#toc3_3__) \n", + " - [Import ValidMind components](#toc3_3_1__) \n", + " - [Create agent wrapper function](#toc3_3_2__) \n", + " - [Initialize the ValidMind model object](#toc3_3_3__) \n", + " - [Store the agent reference](#toc3_3_4__) \n", + " - [Verify integration](#toc3_3_5__) \n", + " - [Validate the system prompt](#toc3_4__) \n", + "- [Initializing the ValidMind dataset](#toc4__) \n", + " - [Assign predictions](#toc4_1__) \n", + "- [Running accuracy tests](#toc5__) \n", + " - [Response accuracy test](#toc5_1__) \n", + " - [Tool selection accuracy test](#toc5_2__) \n", + "- [Assigning AI evaluation metric scores](#toc6__) \n", + " - [Identify relevant DeepEval scorers](#toc6_1__) \n", + " - [Assign reasoning scores](#toc6_2__) \n", + " - [Plan quality score](#toc6_2_1__) \n", + " - [Plan adherence score](#toc6_2_2__) \n", + " - [Assign action scores](#toc6_3__) \n", + " - [Tool correctness score](#toc6_3_1__) \n", + " - [Argument correctness score](#toc6_3_2__) \n", + " - [Assign execution score](#toc6_4__) \n", + " - [Task completion score](#toc6_4_1__) \n", + "- [Running RAGAS tests](#toc7__) \n", + " - [Identify relevant RAGAS tests](#toc7_1__) \n", + " - [Faithfulness](#toc7_1_1__) \n", + " - [Response Relevancy](#toc7_1_2__) \n", + " - [Context Recall](#toc7_1_3__) \n", + "- [Running safety tests](#toc8__) \n", + " - [AspectCritic](#toc8_1_1__) \n", + " - [Bias](#toc8_1_2__) \n", + "- [Next steps](#toc9__) \n", + " - [Work with your model documentation](#toc9_1__) \n", + " - [Customize the banking agent for your use case](#toc9_2__) \n", + " - [Discover more learning 
resources](#toc9_3__) \n", + "- [Upgrade ValidMind](#toc10__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "30927b2b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "b58139db" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
+ ], + "id": "7e30d36b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "1cba586e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "5c46f003" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ], + "id": "11a2d7a5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.9 <= x <= 3.14</div>\n", + "\n", + "Let's begin by installing the ValidMind Library with large language model (LLM) support:" + ], + "id": "fbab0edf" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q \"validmind[llm]\" \"langgraph==0.3.21\"" + ], + "execution_count": null, + "outputs": [], + "id": "1982a118" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "14578e26" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook.\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. 
Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "83d47d89" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Agentic AI`.\n", + "\n", + "3. Click **Use Template** to apply the template." 
+ ], + "id": "bb2c5670" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", + "<br></br>\n", + "Your organization administrators may need to add it to your template library:\n", + "<ul>\n", + "<li><a href=\"agentic_ai_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", + "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", + "</ul>\n", + "</div>" + ], + "id": "98e475c1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "0d1a13ca" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "d6ccbefc" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ], + "id": "3605df4f" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "dffdaa6f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Verify OpenAI API access\n", + "\n", + "Verify that a valid `OPENAI_API_KEY` is set in your `.env` file:" + ], + "id": "d467c1d2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load environment variables if using .env file\n", + "try:\n", + " from dotenv import load_dotenv\n", + " load_dotenv()\n", + "except ImportError:\n", + " print(\"dotenv not installed. Make sure OPENAI_API_KEY is set in your environment.\")" + ], + "execution_count": null, + "outputs": [], + "id": "22cc39cb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Let's import all the necessary libraries to prepare for building our banking LangGraph agentic system:\n", + "\n", + "- **Standard libraries** for data handling and environment management.\n", + "- **pandas**, a Python library for data manipulation and analytics, imported under the alias `pd`. We'll also configure pandas to show all columns and all rows at full width for easier debugging and inspection.\n", + "- **LangChain** components for LLM integration and tool management.\n", + "- **LangGraph** for building stateful, multi-step agent workflows.\n", + "- **Banking tools** for specialized financial services as defined in [banking_tools.py](banking_tools.py)." 
+ ], + "id": "b56c3f39" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from typing import TypedDict, Annotated, Sequence\n", + "\n", + "from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage\n", + "from langchain_openai import ChatOpenAI\n", + "from langgraph.checkpoint.memory import MemorySaver\n", + "from langgraph.graph import StateGraph, END, START\n", + "from langgraph.graph.message import add_messages\n", + "from langgraph.prebuilt import ToolNode\n", + "\n", + "# LOCAL IMPORTS FROM banking_tools.py\n", + "from banking_tools import AVAILABLE_TOOLS\n", + "\n", + "import pandas as pd\n", + "# Configure pandas to show all columns and all rows at full width\n", + "pd.set_option('display.max_columns', None)\n", + "pd.set_option('display.max_colwidth', None)\n", + "pd.set_option('display.width', None)\n", + "pd.set_option('display.max_rows', None)" + ], + "execution_count": null, + "outputs": [], + "id": "2058d1ac" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Building the LangGraph agent" + ], + "id": "cc1d3265" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Test available banking tools\n", + "\n", + "We'll use the demo banking tools defined in `banking_tools.py` that provide use cases of financial services:\n", + "\n", + "- **Credit Risk Analyzer** - Loan applications and credit decisions\n", + "- **Customer Account Manager** - Account services and customer support\n", + "- **Fraud Detection System** - Security and fraud prevention" + ], + "id": "a3c421c4" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(f\"Available tools: {len(AVAILABLE_TOOLS)}\")\n", + "print(\"\\nTool Details:\")\n", + "for i, tool in enumerate(AVAILABLE_TOOLS, 1):\n", + " print(f\" - {tool.name}\")" + ], + "execution_count": null, + "outputs": [], + "id": "1e0a120c" + }, + { + "cell_type": "markdown", 
+ "metadata": {}, + "source": [ + "Let's test each banking tool individually to ensure they're working correctly before integrating them into our agent:" + ], + "id": "53906630" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Test 1: Credit Risk Analyzer\n", + "print(\"TEST 1: Credit Risk Analyzer\")\n", + "print(\"-\" * 40)\n", + "try:\n", + " # Access the underlying function using .func\n", + " credit_result = AVAILABLE_TOOLS[0].func(\n", + " customer_income=75000,\n", + " customer_debt=1200,\n", + " credit_score=720,\n", + " loan_amount=50000,\n", + " loan_type=\"personal\"\n", + " )\n", + " print(credit_result)\n", + " print(\"Credit Risk Analyzer test PASSED\")\n", + "except Exception as e:\n", + " print(f\"Credit Risk Analyzer test FAILED: {e}\")\n", + "\n", + "print(\"\" + \"=\" * 60)" + ], + "execution_count": null, + "outputs": [], + "id": "dc0caff2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "# Test 2: Customer Account Manager\n", + "print(\"TEST 2: Customer Account Manager\")\n", + "print(\"-\" * 40)\n", + "try:\n", + " # Test checking balance\n", + " account_result = AVAILABLE_TOOLS[1].func(\n", + " account_type=\"checking\",\n", + " customer_id=\"12345\",\n", + " action=\"check_balance\"\n", + " )\n", + " print(account_result)\n", + "\n", + " # Test getting account info\n", + " info_result = AVAILABLE_TOOLS[1].func(\n", + " account_type=\"all\",\n", + " customer_id=\"12345\", \n", + " action=\"get_info\"\n", + " )\n", + " print(info_result)\n", + " print(\"Customer Account Manager test PASSED\")\n", + "except Exception as e:\n", + " print(f\"Customer Account Manager test FAILED: {e}\")\n", + "\n", + "print(\"\" + \"=\" * 60)" + ], + "execution_count": null, + "outputs": [], + "id": "b6b227db" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "# Test 3: Fraud Detection System\n", + "print(\"TEST 3: Fraud Detection System\")\n", + "print(\"-\" * 40)\n", + "try:\n", + " 
fraud_result = AVAILABLE_TOOLS[2].func(\n", + " transaction_id=\"TX123\",\n", + " customer_id=\"12345\",\n", + " transaction_amount=500.00,\n", + " transaction_type=\"withdrawal\",\n", + " location=\"Miami, FL\",\n", + " device_id=\"DEVICE_001\"\n", + " )\n", + " print(fraud_result)\n", + " print(\"Fraud Detection System test PASSED\")\n", + "except Exception as e:\n", + " print(f\"Fraud Detection System test FAILED: {e}\")\n", + "\n", + "print(\"\" + \"=\" * 60)" + ], + "execution_count": null, + "outputs": [], + "id": "a983b30d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Create LangGraph banking agent\n", + "\n", + "With our tools ready to go, we'll create our intelligent banking agent with LangGraph that automatically selects and uses the appropriate banking tool based on a user request." + ], + "id": "1424baed" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Define system prompt\n", + "\n", + "We'll begin by defining our system prompt, which provides the LLM with context about its role as a banking assistant and guidance on when to use each available tool:" + ], + "id": "3469d656" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "# Enhanced banking system prompt with tool selection guidance\n", + "system_context = \"\"\"You are a professional banking AI assistant with access to specialized banking tools.\n", + " Analyze the user's banking request and directly use the most appropriate tools to help them.\n", + " \n", + " AVAILABLE BANKING TOOLS:\n", + " \n", + " credit_risk_analyzer - Analyze credit risk for loan applications and credit decisions\n", + " - Use for: loan applications, credit assessments, risk analysis, mortgage eligibility\n", + " - Examples: \"Analyze credit risk for $50k personal loan\", \"Assess mortgage eligibility for $300k home purchase\"\n", + " - Parameters: customer_income, 
customer_debt, credit_score, loan_amount, loan_type\n", + "\n", + " customer_account_manager - Manage customer accounts and provide banking services\n", + " - Use for: account information, transaction processing, product recommendations, customer service\n", + " - Examples: \"Check balance for checking account 12345\", \"Recommend products for customer with high balance\"\n", + " - Parameters: account_type, customer_id, action, amount, account_details\n", + "\n", + " fraud_detection_system - Analyze transactions for potential fraud and security risks\n", + " - Use for: transaction monitoring, fraud prevention, risk assessment, security alerts\n", + " - Examples: \"Analyze fraud risk for $500 ATM withdrawal in Miami\", \"Check security for $2000 online purchase\"\n", + " - Parameters: transaction_id, customer_id, transaction_amount, transaction_type, location, device_id\n", + "\n", + " BANKING INSTRUCTIONS:\n", + " - Analyze the user's banking request carefully and identify the primary need\n", + " - If they need credit analysis → use credit_risk_analyzer\n", + " - If they need financial calculations → use financial_calculator\n", + " - If they need account services → use customer_account_manager\n", + " - If they need security analysis → use fraud_detection_system\n", + " - Extract relevant parameters from the user's request\n", + " - Provide helpful, accurate banking responses based on tool outputs\n", + " - Always consider banking regulations, risk management, and best practices\n", + " - Be professional and thorough in your analysis\n", + "\n", + " Choose and use tools wisely to provide the most helpful banking assistance.\n", + " Describe the response in user friendly manner with details describing the tool output. 
\n", + " Provide the response in at least 500 words.\n", + " Generate a concise execution plan for the banking request.\n", + " \"\"\"" + ], + "execution_count": null, + "outputs": [], + "id": "7971c427" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_2__'></a>\n", + "\n", + "#### Initialize the LLM\n", + "\n", + "Let's initialize the LLM that will power our banking agent:" + ], + "id": "b66c1ac4" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the main LLM for banking responses\n", + "main_llm = ChatOpenAI(\n", + " model=\"gpt-5-mini\",\n", + " reasoning={\n", + " \"effort\": \"low\",\n", + " \"summary\": \"auto\"\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "866066e7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then bind the available banking tools to the LLM, enabling the model to automatically recognize and invoke each tool when appropriate based on request input and the system prompt we defined above:" + ], + "id": "8220afd6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Bind all banking tools to the main LLM\n", + "llm_with_tools = main_llm.bind_tools(AVAILABLE_TOOLS)" + ], + "execution_count": null, + "outputs": [], + "id": "906d8132" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_3__'></a>\n", + "\n", + "#### Define agent state structure\n", + "\n", + "The agent state defines the data structure that flows through the LangGraph workflow. 
It includes:\n", + "\n", + "- **messages** — The conversation history between the user and agent\n", + "- **user_input** — The current user request\n", + "- **session_id** — A unique identifier for the conversation session\n", + "- **context** — Additional context that can be passed between nodes\n", + "\n", + "Defining this state structure keeps the data consistent throughout the agent's execution and allows for multi-turn conversations with memory:" + ], + "id": "43f56651" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Banking Agent State Definition\n", + "class BankingAgentState(TypedDict):\n", + " messages: Annotated[Sequence[BaseMessage], add_messages]\n", + " user_input: str\n", + " session_id: str\n", + " context: dict" + ], + "execution_count": null, + "outputs": [], + "id": "6b926ddf" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_4__'></a>\n", + "\n", + "#### Create agent workflow function\n", + "\n", + "We'll build the LangGraph agent workflow with two main components:\n", + "\n", + "1. **LLM node** — Processes user requests, applies the system prompt, and decides whether to use tools.\n", + "2. **Tools node** — Executes the selected banking tools when the LLM determines they're needed.\n", + "\n", + "The workflow begins with the LLM analyzing the request. If the LLM requests tools, the tools node runs them and returns control to the LLM to generate the final response; otherwise, the workflow ends." 
+ ], + "id": "387ba780" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def create_banking_langgraph_agent():\n", + " \"\"\"Create a comprehensive LangGraph banking agent with intelligent tool selection.\"\"\"\n", + " def llm_node(state: BankingAgentState) -> BankingAgentState:\n", + " \"\"\"Main LLM node that processes banking requests and selects appropriate tools.\"\"\"\n", + " messages = state[\"messages\"]\n", + " # Add system context to messages\n", + " enhanced_messages = [SystemMessage(content=system_context)] + list(messages)\n", + " # Get LLM response with tool selection\n", + " response = llm_with_tools.invoke(enhanced_messages)\n", + " return {\n", + " **state,\n", + " \"messages\": messages + [response]\n", + " }\n", + " \n", + " def should_continue(state: BankingAgentState) -> str:\n", + " \"\"\"Decide whether to use tools or end the conversation.\"\"\"\n", + " last_message = state[\"messages\"][-1]\n", + " # Check if the LLM wants to use tools\n", + " if hasattr(last_message, 'tool_calls') and last_message.tool_calls:\n", + " return \"tools\"\n", + " return END\n", + " \n", + " # Create the banking state graph\n", + " workflow = StateGraph(BankingAgentState)\n", + " # Add nodes\n", + " workflow.add_node(\"llm\", llm_node)\n", + " workflow.add_node(\"tools\", ToolNode(AVAILABLE_TOOLS))\n", + " # Simplified entry point - go directly to LLM\n", + " workflow.add_edge(START, \"llm\")\n", + " # From LLM, decide whether to use tools or end\n", + " workflow.add_conditional_edges(\n", + " \"llm\",\n", + " should_continue,\n", + " {\"tools\": \"tools\", END: END}\n", + " )\n", + " # Tool execution flows back to LLM for final response\n", + " workflow.add_edge(\"tools\", \"llm\")\n", + " # Set up memory\n", + " memory = MemorySaver()\n", + " # Compile the graph\n", + " agent = workflow.compile(checkpointer=memory)\n", + " return agent" + ], + "execution_count": null, + "outputs": [], + "id": "2c9bf585" + }, + { + "cell_type": "markdown", + 
"metadata": {}, + "source": [ + "<a id='toc3_2_5__'></a>\n", + "\n", + "#### Instantiate the banking agent\n", + "\n", + "Now, we'll create an instance of the banking agent by calling the workflow creation function.\n", + "\n", + "This compiled agent is ready to process banking requests and will automatically select and use the appropriate tools based on user queries:" + ], + "id": "765242e9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Create the banking intelligent agent\n", + "banking_agent = create_banking_langgraph_agent()\n", + "\n", + "print(\"Banking LangGraph Agent Created Successfully!\")\n", + "print(\"\\nFeatures:\")\n", + "print(\" - Intelligent banking tool selection\")\n", + "print(\" - Comprehensive banking system prompt\")\n", + "print(\" - Streamlined workflow: LLM → Tools → Response\")\n", + "print(\" - Automatic tool parameter extraction\")\n", + "print(\" - Professional banking assistance\")" + ], + "execution_count": null, + "outputs": [], + "id": "455b8ee4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Integrate agent with ValidMind\n", + "\n", + "To integrate our LangGraph banking agent with ValidMind, we need to create a wrapper function that ValidMind can use to invoke the agent and extract the necessary information for testing and documentation, allowing ValidMind to run validation tests on the agent's behavior, tool usage, and responses." 
+ ], + "id": "e00dac77" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_1__'></a>\n", + "\n", + "#### Import ValidMind components\n", + "\n", + "We'll start with importing the necessary ValidMind components for integrating our agent:\n", + "\n", + "- `Prompt` from `validmind.models` for handling prompt-based model inputs\n", + "- `extract_tool_calls_from_agent_output` and `_convert_to_tool_call_list` from `validmind.scorers.llm.deepeval` for extracting and converting tool calls from agent outputs" + ], + "id": "a124857e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.models import Prompt\n", + "from validmind.scorers.llm.deepeval import extract_tool_calls_from_agent_output, _convert_to_tool_call_list\n", + "from deepeval.tracing import observe, update_current_span\n", + "from deepeval.test_case import LLMTestCase" + ], + "execution_count": null, + "outputs": [], + "id": "9aeb8969" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_2__'></a>\n", + "\n", + "#### Create agent wrapper function\n", + "\n", + "We'll then create a wrapper function that:\n", + "\n", + "- Accepts input in ValidMind's expected format (with `input` and `session_id` fields)\n", + "- Invokes the banking agent with the proper state initialization\n", + "- Captures tool outputs and tool calls for evaluation\n", + "- Returns a standardized response format that includes the prediction, full output, tool messages, and tool call information\n", + "- Handles errors gracefully with fallback responses" + ], + "id": "ed72903f" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@observe(type=\"agent\")\n", + "def banking_agent_fn(input):\n", + " \"\"\"\n", + " Invoke the banking agent with the given input.\n", + " \"\"\"\n", + " try:\n", + " # Initial state for banking agent\n", + " initial_state = {\n", + " \"user_input\": input[\"input\"],\n", + " \"messages\": 
[HumanMessage(content=input[\"input\"])],\n", + " \"session_id\": input[\"session_id\"],\n", + " \"context\": {}\n", + " }\n", + " session_config = {\"configurable\": {\"thread_id\": input[\"session_id\"]}}\n", + " result = banking_agent.invoke(initial_state, config=session_config)\n", + "\n", + " from utils import capture_tool_output_messages\n", + "\n", + " # Capture all tool outputs and metadata\n", + " captured_data = capture_tool_output_messages(result)\n", + " \n", + " # Access specific tool outputs, this will be used for RAGAS tests\n", + " tool_message = \"\"\n", + " for output in captured_data[\"tool_outputs\"]:\n", + " tool_message += output['content']\n", + " \n", + " tool_calls_found = []\n", + " messages = result['messages']\n", + " for message in messages:\n", + " if hasattr(message, 'tool_calls') and message.tool_calls:\n", + " for tool_call in message.tool_calls:\n", + " # Handle both dictionary and object formats\n", + " if isinstance(tool_call, dict):\n", + " tool_calls_found.append(tool_call['name'])\n", + " else:\n", + " # ToolCall object - use attribute access\n", + " tool_calls_found.append(tool_call.name)\n", + "\n", + " prediction_text = result['messages'][-1].content[0]['text']\n", + " tools_called_value = _convert_to_tool_call_list(extract_tool_calls_from_agent_output(result))\n", + " expected_tools_value = _convert_to_tool_call_list(input.get(\"expected_tools\", []))\n", + "\n", + " # Feed trace data for DeepEval metrics (e.g. 
PlanQuality) that require tracing\n", + " update_current_span(\n", + " test_case=LLMTestCase(\n", + " input=input[\"input\"],\n", + " actual_output=prediction_text,\n", + " tools_called=tools_called_value,\n", + " expected_tools=expected_tools_value\n", + " )\n", + " )\n", + "\n", + " return {\n", + " \"prediction\": prediction_text,\n", + " \"output\": result,\n", + " \"tool_messages\": [tool_message],\n", + " # \"tool_calls\": tool_calls_found,\n", + " \"tool_called\": tools_called_value\n", + " }\n", + " except Exception as e:\n", + " # Return a fallback response if the agent fails\n", + " error_message = f\"\"\"I apologize, but I encountered an error while processing your banking request: {str(e)}.\n", + " Please try rephrasing your question or contact support if the issue persists.\"\"\"\n", + " return {\n", + " \"prediction\": error_message, \n", + " \"output\": {\n", + " \"messages\": [HumanMessage(content=input[\"input\"]), SystemMessage(content=error_message)],\n", + " \"error\": str(e)\n", + " }\n", + " }" + ], + "execution_count": null, + "outputs": [], + "id": "0e4d5a82" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_3__'></a>\n", + "\n", + "#### Initialize the ValidMind model\n", + "\n", + "We'll also need to register the banking agent as a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with 
[`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model) which:\n", + "\n", + "- Associates the wrapper function with the model for prediction\n", + "- Stores the system prompt template for documentation\n", + "- Provides a unique `input_id` for tracking and identification\n", + "- Enables the agent to be used with ValidMind's testing and documentation features" + ], + "id": "fda87401" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the agent as a model\n", + "vm_banking_model = vm.init_model(\n", + " input_id=\"banking_agent_model\",\n", + " predict_fn=banking_agent_fn,\n", + " prompt=Prompt(template=system_context)\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "60a2ce7a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_4__'></a>\n", + "\n", + "#### Store the agent reference\n", + "\n", + "We'll also store a reference to the original banking agent object in the ValidMind model. This allows us to access the full agent functionality directly if needed, while still maintaining the wrapper function interface for ValidMind's testing framework." 
+ ], + "id": "949bcf53" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Add the banking agent to the vm model\n", + "vm_banking_model.model = banking_agent" + ], + "execution_count": null, + "outputs": [], + "id": "2c653471" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_5__'></a>\n", + "\n", + "#### Verify integration\n", + "\n", + "Let's confirm that the banking agent has been successfully integrated with ValidMind:" + ], + "id": "d8d0c1c1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"Banking Agent Successfully Integrated with ValidMind!\")\n", + "print(f\"Model ID: {vm_banking_model.input_id}\")" + ], + "execution_count": null, + "outputs": [], + "id": "8e101b0f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_4__'></a>\n", + "\n", + "### Validate the system prompt\n", + "\n", + "Let's get an initial sense of how well our defined system prompt meets a few best practices for prompt engineering by running a few tests — we'll run evaluation tests later on our agent's performance.\n", + "\n", + "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. 
Passing in our agentic model as an input, the tests below rate the prompt on a scale of 1-10 against the following criteria:\n", + "\n", + "- **[Clarity](https://docs.validmind.ai/tests/prompt_validation/Clarity.html)** — How clearly the prompt states the task.\n", + "- **[Conciseness](https://docs.validmind.ai/tests/prompt_validation/Conciseness.html)** — How succinctly the prompt states the task.\n", + "- **[Delimitation](https://docs.validmind.ai/tests/prompt_validation/Delimitation.html)** — When using complex prompts containing examples, contextual information, or other elements, is the prompt formatted in such a way that each element is clearly separated?\n", + "- **[NegativeInstruction](https://docs.validmind.ai/tests/prompt_validation/NegativeInstruction.html)** — Whether the prompt contains negative instructions.\n", + "- **[Specificity](https://docs.validmind.ai/tests/prompt_validation/Specificity.html)** — How specifically the prompt defines the task." + ], + "id": "2a5f874e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.prompt_validation.Clarity\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "f52dceb1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.prompt_validation.Conciseness\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "70d52333" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.prompt_validation.Delimitation\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "5aa89976" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " 
\"validmind.prompt_validation.NegativeInstruction\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "8630197e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.prompt_validation.Specificity\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "bba99915" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Initializing the ValidMind dataset\n", + "\n", + "After validating our system prompt, let's import our sample dataset ([banking_test_dataset.py](banking_test_dataset.py)), which we'll use in the next section to evaluate our agent's performance across different banking scenarios:" + ], + "id": "51d61141" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from banking_test_dataset import banking_test_dataset" + ], + "execution_count": null, + "outputs": [], + "id": "0c70ca2c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next step is to connect your data with a ValidMind `Dataset` object. **This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. 
For this example, we'll pass in the following arguments:\n", + "\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`text_column`** — The name of the column containing the text input data.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." + ], + "id": "442ab66d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset = vm.init_dataset(\n", + " input_id=\"banking_test_dataset\",\n", + " dataset=banking_test_dataset,\n", + " text_column=\"input\",\n", + " target_column=\"possible_outputs\",\n", + ")\n", + "\n", + "print(\"Banking Test Dataset Initialized in ValidMind!\")\n", + "print(f\"Dataset ID: {vm_test_dataset.input_id}\")\n", + "print(f\"Dataset columns: {vm_test_dataset._df.columns}\")\n", + "vm_test_dataset._df" + ], + "execution_count": null, + "outputs": [], + "id": "a7e9d158" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "Now that both the model object and the dataset have been registered, we'll assign predictions to capture the banking agent's responses for evaluation:\n", + "\n", + "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", + "- Here, this method links the banking agent's responses to our `vm_test_dataset` dataset.\n", + "\n", + "If no prediction values are passed, the method will compute predictions automatically:" + ], + "id": "7b01021c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_predictions(vm_banking_model)\n", + "\n", +
"print(\"Banking Agent Predictions Generated Successfully!\")\n", + "print(f\"Predictions assigned to {len(vm_test_dataset._df)} test cases\")\n", + "vm_test_dataset._df.head()" + ], + "execution_count": null, + "outputs": [], + "id": "1d462663" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Running accuracy tests\n", + "\n", + "Using [`@vm.test`](https://docs.validmind.ai/validmind/validmind.html#test), let's implement some reusable custom *inline tests* to assess the accuracy of our banking agent:\n", + "\n", + "- An inline test refers to a test written and executed within the same environment as the code being tested — in this case, right in this Jupyter Notebook — without requiring a separate test file or framework.\n", + "- You'll note that the custom test functions are just regular Python functions that can include and require any Python library as you see fit." + ], + "id": "4e56f556" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Response accuracy test\n", + "\n", + "We'll create a custom test that evaluates the banking agent's ability to provide accurate responses by:\n", + "\n", + "- Testing against a dataset of predefined banking questions and expected answers.\n", + "- Checking if responses contain expected keywords and banking terminology.\n", + "- Providing detailed test results including pass/fail status.\n", + "- Helping identify any gaps in the agent's banking knowledge or response quality." 
+ ], + "id": "1bce9258" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "@vm.test(\"my_custom_tests.banking_accuracy_test\")\n", + "def banking_accuracy_test(model, dataset, list_of_columns):\n", + " \"\"\"\n", + " The Banking Accuracy Test evaluates whether the agent’s responses include \n", + " critical domain-specific keywords and phrases that indicate accurate, compliant,\n", + " and contextually appropriate banking information. This test ensures that the agent\n", + " provides responses containing the expected banking terminology, risk classifications,\n", + " account details, or other domain-relevant information required for regulatory compliance,\n", + " customer safety, and operational accuracy.\n", + " \"\"\"\n", + " df = dataset._df\n", + " \n", + " # Pre-compute responses for all tests\n", + " y_true = dataset.y.tolist()\n", + " y_pred = dataset.y_pred(model).tolist()\n", + "\n", + " # Per-row keyword match results\n", + " test_results = []\n", + " for response, keywords in zip(y_pred, y_true):\n", + " # Convert keywords to list if not already a list\n", + " if not isinstance(keywords, list):\n", + " keywords = [keywords]\n", + " test_results.append(any(str(keyword).lower() in str(response).lower() for keyword in keywords))\n", + " \n", + " results = pd.DataFrame()\n", + " column_names = [col + \"_details\" for col in list_of_columns]\n", + " results[column_names] = df[list_of_columns]\n", + " results[\"actual\"] = y_pred\n", + " results[\"expected\"] = y_true\n", + " results[\"passed\"] = test_results\n", + " results[\"error\"] = [None if passed else f'Response did not contain any expected keywords: {expected}' for passed, expected in zip(test_results, y_true)]\n", + " \n", + " return results" + ], + "execution_count": null, + "outputs": [], + "id": "90232066" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we've defined our custom response accuracy test, we can run the test using the same 
`run_test()` function we used earlier to validate the system 
prompt using our sample dataset and agentic model as input, and log the test results to the ValidMind Platform with the [`log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#log):" + ], + "id": "2a7f71f8" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"my_custom_tests.banking_accuracy_test\",\n", + " inputs={\n", + " \"dataset\": vm_test_dataset,\n", + " \"model\": vm_banking_model\n", + " },\n", + " params={\n", + " \"list_of_columns\": [\"input\"]\n", + " }\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "e68884d5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's review the first five rows of the test dataset to inspect the results to see how well the banking agent performed. Each column in the output serves a specific purpose in evaluating agent performance:\n", + "\n", + "| Column header | Description | Importance |\n", + "|--------------|-------------|------------|\n", + "| **`input`** | Original user query or request | Essential for understanding the context of each test case and tracing which inputs led to specific agent behaviors. |\n", + "| **`expected_tools`** | Banking tools that should be invoked for this request | Enables validation of correct tool selection, which is critical for agentic AI systems where choosing the right tool is a key success metric. |\n", + "| **`expected_output`** | Expected output or keywords that should appear in the response | Defines the success criteria for each test case, enabling objective evaluation of whether the agent produced the correct result. |\n", + "| **`session_id`** | Unique identifier for each test session | Allows tracking and correlation of related test runs, debugging specific sessions, and maintaining audit trails. 
|\n", + "| **`category`** | Classification of the request type | Helps organize test results by domain and identify performance patterns across different banking use cases. |\n", + "| **`banking_agent_model_output`** | Complete agent response including all messages and reasoning | Allows you to examine the full output to assess response quality, completeness, and correctness beyond just keyword matching. |\n", + "| **`banking_agent_model_tool_messages`** | Messages exchanged with the banking tools | Critical for understanding how the agent interacted with tools, what parameters were passed, and what tool outputs were received. |\n", + "| **`banking_agent_model_tool_called`** | Specific tool that was invoked | Enables validation that the agent selected the correct tool for each request, which is fundamental to agentic AI validation. |\n", + "| **`possible_outputs`** | Alternative valid outputs or keywords that could appear in the response | Provides flexibility in evaluation by accounting for multiple acceptable response formats or variations. |" + ], + "id": "94a717e7" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.df.head(5)" + ], + "execution_count": null, + "outputs": [], + "id": "78f7edb1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Tool selection accuracy test\n", + "\n", + "We'll also create a custom test that evaluates the banking agent's ability to select the correct tools for different requests by:\n", + "\n", + "- Testing against a dataset of predefined banking queries with expected tool selections.\n", + "- Comparing the tools actually invoked by the agent against the expected tools for each request.\n", + "- Providing quantitative accuracy scores that measure the proportion of expected tools correctly selected.\n", + "- Helping identify gaps in the agent's understanding of user needs and tool selection logic." 
+ ], + "id": "1cb3e8bd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, we'll define a helper function that extracts tool calls from the agent's messages and compares them against the expected tools. This function handles different message formats (dictionary or object) and calculates accuracy scores:" + ], + "id": "69263d62" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def validate_tool_calls_simple(messages, expected_tools):\n", + " \"\"\"Simple validation of tool calls without RAGAS dependency issues.\"\"\"\n", + " \n", + " tool_calls_found = []\n", + " \n", + " for message in messages:\n", + " if hasattr(message, 'tool_calls') and message.tool_calls:\n", + " for tool_call in message.tool_calls:\n", + " # Handle both dictionary and object formats\n", + " if isinstance(tool_call, dict):\n", + " tool_calls_found.append(tool_call['name'])\n", + " else:\n", + " # ToolCall object - use attribute access\n", + " tool_calls_found.append(tool_call.name)\n", + " \n", + " # Check if expected tools were called\n", + " accuracy = 0.0\n", + " matches = 0\n", + " if expected_tools:\n", + " matches = sum(1 for tool in expected_tools if tool in tool_calls_found)\n", + " accuracy = matches / len(expected_tools)\n", + " \n", + " return {\n", + " 'expected_tools': expected_tools,\n", + " 'found_tools': tool_calls_found,\n", + " 'matches': matches,\n", + " 'total_expected': len(expected_tools) if expected_tools else 0,\n", + " 'accuracy': accuracy,\n", + " }" + ], + "execution_count": null, + "outputs": [], + "id": "e68798be" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we'll define the main test function that uses the helper function to evaluate tool selection accuracy across all test cases in the dataset:" + ], + "id": "8f494fd3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.BankingToolCallAccuracy\")\n", + "def BankingToolCallAccuracy(dataset, 
agent_output_column, expected_tools_column):\n", + " \"\"\"\n", + " Evaluates the tool selection accuracy of a LangGraph-powered banking agent.\n", + "\n", + " This test measures whether the agent correctly identifies and invokes the required banking tools\n", + " for each user query scenario.\n", + " For each case, the outputs generated by the agent (including its tool calls) are compared against an\n", + " expected set of tools. The test considers both coverage and exactness: it computes the proportion of\n", + " expected tools correctly called by the agent for each instance.\n", + "\n", + " Parameters:\n", + " dataset (VMDataset): The dataset containing user queries, agent outputs, and ground-truth tool expectations.\n", + " agent_output_column (str): Dataset column name containing agent outputs (should include tool call details in 'messages').\n", + " expected_tools_column (str): Dataset column specifying the true expected tools (as lists).\n", + "\n", + " Returns:\n", + " List[dict]: Per-row dictionaries with details: expected tools, found tools, match count, total expected, and accuracy score.\n", + "\n", + " Purpose:\n", + " Provides diagnostic evidence of the banking agent's core reasoning ability—specifically, its capacity to\n", + " interpret user needs and select the correct banking actions. Useful for diagnosing gaps in tool coverage,\n", + " misclassifications, or breakdowns in agent logic.\n", + "\n", + " Interpretation:\n", + " - An accuracy of 1.0 signals perfect tool selection for that example.\n", + " - Lower scores may indicate partial or complete failures to invoke required tools.\n", + " - Review 'found_tools' vs. 
'expected_tools' to understand the source of discrepancies.\n", + "\n", + " Strengths:\n", + " - Directly tests a core capability of compositional tool-use agents.\n", + " - Framework-agnostic; robust to tool call output format (object or dict).\n", + " - Supports batch validation and result logging for systematic documentation.\n", + "\n", + " Limitations:\n", + " - Does not penalize extra, unnecessary tool calls.\n", + " - Does not assess result quality—only correct invocation.\n", + "\n", + " \"\"\"\n", + " df = dataset._df\n", + " \n", + " results = []\n", + " for i, row in df.iterrows():\n", + " result = validate_tool_calls_simple(row[agent_output_column]['messages'], row[expected_tools_column])\n", + " results.append(result)\n", + " \n", + " return results" + ], + "execution_count": null, + "outputs": [], + "id": "604d7313" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, we can call our function with `run_test()` and log the test results to the ValidMind Platform:" + ], + "id": "57ab606b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"my_custom_tests.BankingToolCallAccuracy\",\n", + " inputs={\n", + " \"dataset\": vm_test_dataset,\n", + " },\n", + " params={\n", + " \"agent_output_column\": \"banking_agent_model_output\",\n", + " \"expected_tools_column\": \"expected_tools\"\n", + " }\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "dd14115e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Assigning AI evaluation metric scores\n", + "\n", + "*AI agent evaluation metrics* are specialized measurements designed to assess how well autonomous LLM-based agents reason, plan, select and execute tools, and ultimately complete user tasks by analyzing the *full execution trace* — including reasoning steps, tool calls, intermediate decisions, and outcomes, rather than just single 
input–output pairs. These metrics are essential because agent failures often occur in ways traditional LLM metrics miss — for example, choosing the right tool with wrong arguments, creating a good plan but not following it, or completing a task inefficiently.\n", + "\n", + "In this section, we'll evaluate our banking agent's outputs and add scoring to our sample dataset against metrics defined in [DeepEval’s AI agent evaluation framework](https://deepeval.com/guides/guides-ai-agent-evaluation-metrics), which breaks down AI agent evaluation into three layers with corresponding subcategories: **reasoning**, **action**, and **execution**.\n", + "\n", + "Together, these three layers enable granular diagnosis of agent behavior, help pinpoint where failures occur (reasoning, action, or execution), and support both development benchmarking and production monitoring." + ], + "id": "be8d5270" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Identify relevant DeepEval scorers\n", + "\n", + "*Scorers* are evaluation metrics that analyze model outputs and store their results in the dataset:\n", + "\n", + "- Each scorer adds a new column to the dataset with format: `{scorer_name}_{metric_name}`\n", + "- The column contains the numeric score (typically `0`-`1`) for each example\n", + "- Multiple scorers can be run on the same dataset, each adding their own column\n", + "- Scores are persisted in the dataset for later analysis and visualization\n", + "- Common scorer patterns include:\n", + " - Model performance metrics (accuracy, F1, etc.)\n", + " - Output quality metrics (relevance, faithfulness)\n", + " - Task-specific metrics (completion, correctness)\n", + "\n", + "Use `list_scorers()` from [`validmind.scorers`](https://docs.validmind.ai/validmind/validmind/tests.html#scorer) to discover all available scoring methods and their IDs that can be used with `assign_scores()`. 
We'll filter these results to return only DeepEval scorers for our desired three metrics in a formatted table with descriptions:" + ], + "id": "25828bef" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load all DeepEval scorers\n", + "llm_scorers_dict = vm.tests.load._load_tests([s for s in vm.scorer.list_scorers() if \"deepeval\" in s.lower()])\n", + "\n", + "# Categorize scorers by metric layer\n", + "reasoning_scorers = {}\n", + "action_scorers = {}\n", + "execution_scorers = {}\n", + "\n", + "for scorer_id, scorer_func in llm_scorers_dict.items():\n", + " tags = getattr(scorer_func, \"__tags__\", [])\n", + " scorer_name = scorer_id.split(\".\")[-1]\n", + "\n", + " if \"reasoning_layer\" in tags:\n", + " reasoning_scorers[scorer_id] = scorer_func\n", + " elif \"action_layer\" in tags:\n", + " action_scorers[scorer_id] = scorer_func\n", + " elif \"TaskCompletion\" in scorer_name:\n", + " execution_scorers[scorer_id] = scorer_func\n", + "\n", + "# Display scorers by category\n", + "print(\"=\" * 80)\n", + "print(\"REASONING LAYER\")\n", + "print(\"=\" * 80)\n", + "if reasoning_scorers:\n", + " reasoning_df = vm.tests.load._pretty_list_tests(reasoning_scorers, truncate=True)\n", + " display(reasoning_df)\n", + "else:\n", + " print(\"No reasoning layer scorers found.\")\n", + "\n", + "print(\"\\n\" + \"=\" * 80)\n", + "print(\"ACTION LAYER\")\n", + "print(\"=\" * 80)\n", + "if action_scorers:\n", + " action_df = vm.tests.load._pretty_list_tests(action_scorers, truncate=True)\n", + " display(action_df)\n", + "else:\n", + " print(\"No action layer scorers found.\")\n", + "\n", + "print(\"\\n\" + \"=\" * 80)\n", + "print(\"EXECUTION LAYER\")\n", + "print(\"=\" * 80)\n", + "if execution_scorers:\n", + " execution_df = vm.tests.load._pretty_list_tests(execution_scorers, truncate=True)\n", + " display(execution_df)\n", + "else:\n", + " print(\"No execution layer scorers found.\")" + ], + "execution_count": null, + "outputs": [], + "id": "730c70ec" + 
}, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Assign reasoning scores\n", + "\n", + "*Reasoning* evaluates planning and strategy generation:\n", + "\n", + "- **Plan quality** – How logical, complete, and efficient the agent’s plan is.\n", + "- **Plan adherence** – Whether the agent follows its own plan during execution." + ], + "id": "e5fb739b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2_1__'></a>\n", + "\n", + "#### Plan quality score\n", + "\n", + "Let's measure how well our banking agent generates a plan before acting. A high score means the plan is logical, complete, and efficient." + ], + "id": "fde94d01" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_scores(\n", + " metrics = \"validmind.scorers.llm.deepeval.PlanQuality\",\n", + " model = vm_banking_model,\n", + " input_column = \"input\",\n", + ")\n", + "vm_test_dataset._df[['banking_agent_model_PlanQuality_score','banking_agent_model_PlanQuality_reason']]" + ], + "execution_count": null, + "outputs": [], + "id": "52f362ba" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2_2__'></a>\n", + "\n", + "#### Plan adherence score\n", + "\n", + "Let's check whether our banking agent follows the plan it created. Deviations lower this score and indicate gaps between reasoning and execution." 
+ ], + "id": "d631fd12" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_scores(\n", + " metrics = \"validmind.scorers.llm.deepeval.PlanAdherence\",\n", + " input_column = \"input\",\n", + " model = vm_banking_model,\n", + ")\n", + "vm_test_dataset._df[['banking_agent_model_PlanAdherence_score','banking_agent_model_PlanAdherence_reason']]" + ], + "execution_count": null, + "outputs": [], + "id": "4124a7c2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Assign action scores\n", + "\n", + "*Action* assesses tool usage and argument generation:\n", + "\n", + "- **Tool correctness** – Whether the agent selects and calls the right tools.\n", + "- **Argument correctness** – Whether the agent generates correct tool arguments." + ], + "id": "82e5e6f1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3_1__'></a>\n", + "\n", + "#### Tool correctness score\n", + "\n", + "Let's evaluate if our banking agent selects the appropriate tool for the task. Choosing the wrong tool reduces performance even if reasoning was correct." 
+ ], + "id": "e641c9f2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_scores(\n", + " metrics = \"validmind.scorers.llm.deepeval.ToolCorrectness\",\n", + " input_column = \"input\",\n", + " model = vm_banking_model,\n", + " expected_tools_called_column = \"expected_tools\",\n", + " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", + ")\n", + "vm_test_dataset._df[['banking_agent_model_ToolCorrectness_score','banking_agent_model_ToolCorrectness_reason']]" + ], + "execution_count": null, + "outputs": [], + "id": "8d2e8a25" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3_2__'></a>\n", + "\n", + "#### Argument correctness score\n", + "\n", + "Let's assess whether our banking agent provides correct inputs or arguments to the selected tool. Incorrect arguments can lead to failed or unexpected results." + ], + "id": "dd758ba5" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_scores(\n", + " metrics = \"validmind.scorers.llm.deepeval.ArgumentCorrectness\",\n", + " input_column = \"input\",\n", + " model = vm_banking_model,\n", + " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", + ")\n", + "vm_test_dataset._df[['banking_agent_model_ArgumentCorrectness_score','banking_agent_model_ArgumentCorrectness_reason']]" + ], + "execution_count": null, + "outputs": [], + "id": "04f90489" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4__'></a>\n", + "\n", + "### Assign execution score\n", + "\n", + "*Execution* measures end-to-end performance:\n", + "\n", + "- **Task completion** – Whether the agent successfully completes the intended task."
+ ], + "id": "1aeec2f5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_1__'></a>\n", + "\n", + "#### Task completion score\n", + "\n", + "Let's evaluate whether our banking agent successfully completes the requested tasks. Incomplete task execution can lead to user dissatisfaction and failed banking operations." + ], + "id": "eb9ab8de" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_scores(\n", + " metrics = \"validmind.scorers.llm.deepeval.TaskCompletion\",\n", + " input_column = \"input\",\n", + " model = vm_banking_model,\n", + " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", + ")\n", + "vm_test_dataset._df[['banking_agent_model_TaskCompletion_score','banking_agent_model_TaskCompletion_reason']]" + ], + "execution_count": null, + "outputs": [], + "id": "05024f1f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As you recall from the beginning of this section, when we run scorers through `assign_scores()`, the return values are automatically processed and added as new columns with the format `{scorer_name}_{metric_name}`. 
Note that the task completion scorer has added a new column `TaskCompletion_score` to our dataset.\n", + "\n", + "We'll use this column to visualize the distribution of task completion scores across our test cases through the [BoxPlot test](https://docs.validmind.ai/validmind/validmind/tests/plots/BoxPlot.html#boxplot):" + ], + "id": "b577c282" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.plots.BoxPlot\",\n", + " inputs={\"dataset\": vm_test_dataset},\n", + " params={\n", + " \"columns\": \"banking_agent_model_TaskCompletion_score\",\n", + " \"title\": \"Distribution of Task Completion Scores\",\n", + " \"ylabel\": \"Score\",\n", + " \"figsize\": (8, 6)\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "7f6d08ca" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Running RAGAS tests\n", + "\n", + "Next, let's run some out-of-the-box *Retrieval-Augmented Generation Assessment* (RAGAS) tests available in the ValidMind Library. RAGAS provides specialized metrics for evaluating retrieval-augmented generation systems and conversational AI agents. These metrics analyze different aspects of agent performance by assessing how well systems integrate retrieved information with generated responses.\n", + "\n", + "Our banking agent uses tools to retrieve information and generates responses based on that context, making it similar to a RAG system. RAGAS metrics help evaluate the quality of this integration by analyzing the relationship between retrieved tool outputs, user queries, and generated responses.\n", + "\n", + "These tests provide insights into how well our banking agent integrates tool usage with conversational abilities, ensuring it provides accurate, relevant, and helpful responses to banking users while maintaining fidelity to retrieved information." 
+ ], + "id": "30d9ec62" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Identify relevant RAGAS tests\n", + "\n", + "Let's explore some of ValidMind's available tests. Using ValidMind’s repository of tests streamlines your development testing, and helps you ensure that your records are being documented and evaluated appropriately.\n", + "\n", + "You can pass `task` and `tags` as parameters to the [`vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to filter the tests based on the tags and task types:\n", + "\n", + "- **`task`** represents the kind of modeling task associated with a test. Here we'll focus on `text_qa` tasks.\n", + "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `ragas` tag.\n", + "\n", + "We'll then run three of the returned tests as examples below." + ], + "id": "8288f6c3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(task=\"text_qa\", tags=[\"ragas\"])" + ], + "execution_count": null, + "outputs": [], + "id": "0701f5a9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1_1__'></a>\n", + "\n", + "#### Faithfulness\n", + "\n", + "Let's evaluate whether the banking agent's responses accurately reflect the information retrieved from tools. Unfaithful responses can misreport credit analysis, financial calculations, and compliance results—undermining user trust in the banking agent."
+ ], + "id": "2ce24ba0" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.ragas.Faithfulness\",\n", + " inputs={\"dataset\": vm_test_dataset},\n", + " param_grid={\n", + " \"user_input_column\": [\"input\"],\n", + " \"response_column\": [\"banking_agent_model_prediction\"],\n", + " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "92044533" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1_2__'></a>\n", + "\n", + "#### Response Relevancy\n", + "\n", + "Let's evaluate whether the banking agent's answers address the user's original question or request. Irrelevant or off-topic responses can frustrate users and fail to deliver the banking information they need." + ], + "id": "4d1fcfcd" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.ragas.ResponseRelevancy\",\n", + " inputs={\"dataset\": vm_test_dataset},\n", + " params={\n", + " \"user_input_column\": \"input\",\n", + " \"response_column\": \"banking_agent_model_prediction\",\n", + " \"retrieved_contexts_column\": \"banking_agent_model_tool_messages\",\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "d7483bc3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1_3__'></a>\n", + "\n", + "#### Context Recall\n", + "\n", + "Let's evaluate how well the banking agent uses the information retrieved from tools when generating its responses. Poor context recall can lead to incomplete or underinformed answers even when the right tools were selected." 
+ ], + "id": "38c1dfb5" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.ragas.ContextRecall\",\n", + " inputs={\"dataset\": vm_test_dataset},\n", + " param_grid={\n", + " \"user_input_column\": [\"input\"],\n", + " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", + " \"reference_column\": [\"banking_agent_model_prediction\"],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "e5dc00ce" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Running safety tests\n", + "\n", + "Finally, let's run some out-of-the-box *safety* tests available in the ValidMind Library. Safety tests provide specialized metrics for evaluating whether AI agents operate reliably and securely. These metrics analyze different aspects of agent behavior by assessing adherence to safety guidelines, consistency of outputs, and resistance to harmful or inappropriate requests.\n", + "\n", + "Our banking agent handles sensitive financial information and user requests, making safety and reliability essential. Safety tests help evaluate whether the agent maintains appropriate boundaries, responds consistently and correctly to inputs, and avoids generating harmful, biased, or unprofessional content.\n", + "\n", + "These tests provide insights into how well our banking agent upholds standards of fairness and professionalism, ensuring it operates reliably and securely for banking users." + ], + "id": "95e1e16a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_1_1__'></a>\n", + "\n", + "#### AspectCritic\n", + "\n", + "Let's evaluate our banking agent's responses across multiple quality dimensions — conciseness, coherence, correctness, harmfulness, and maliciousness. 
Weak performance on these dimensions can degrade user experience, fall short of professional banking standards, or introduce safety risks. \n", + "\n", + "We'll use the `AspectCritic` we identified earlier:" + ], + "id": "e0972afa" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.ragas.AspectCritic\",\n", + " inputs={\"dataset\": vm_test_dataset},\n", + " param_grid={\n", + " \"user_input_column\": [\"input\"],\n", + " \"response_column\": [\"banking_agent_model_prediction\"],\n", + " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "148daa2b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_1_2__'></a>\n", + "\n", + "#### Bias\n", + "\n", + "Let's evaluate whether our banking agent's prompts contain unintended biases that could affect banking decisions. Biased prompts can lead to unfair or discriminatory outcomes — undermining customer trust and exposing the institution to compliance risk.\n", + "\n", + "We'll first use `list_tests()` again to filter for tests relating to `prompt_validation`:" + ], + "id": "16f29c8d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(filter=\"prompt_validation\")" + ], + "execution_count": null, + "outputs": [], + "id": "74eba86c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And then run the identified `Bias` test:" + ], + "id": "e9413803" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.prompt_validation.Bias\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "062cf8e7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can 
look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." + ], + "id": "8f3f2dbe" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + " What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "3. Click into any section related to the tests we ran in this notebook, for example: **4.3. Prompt Evaluation** to review the results of the tests we logged." + ], + "id": "8716165d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9_2__'></a>\n", + "\n", + "### Customize the banking agent for your use case\n", + "\n", + "You've now built an agentic AI system designed for banking use cases that supports compliance with supervisory guidance such as SR 11-7 and SS1/23, covering credit and fraud risk assessment for both retail and commercial banking. 
Extend this example agent to real-world banking scenarios and production deployment by:\n", + "\n", + "- Adapting the banking tools to your organization's specific requirements\n", + "- Adding more banking scenarios and edge cases to your test set\n", + "- Connecting the agent to your banking systems and databases\n", + "- Implementing additional banking-specific tools and workflows" + ], + "id": "7c4a78ce" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9_3__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "Learn more about the ValidMind Library tools we used in this notebook:\n", + "\n", + "- [Custom prompts](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/customize_test_result_descriptions.html)\n", + "- [Custom tests](https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html)\n", + "- [ValidMind scorers](https://docs.validmind.ai/notebooks/how_to/scoring/assign_scores_complete_tutorial.html)\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ], + "id": "7f9385d3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "fdd5c0db" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "9733adff" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "829429fd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ], + "id": "55339760" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-b9e82bcf4e364c4f8e5ae4bb0e4b2865" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-1QuffXMV-py3.11", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb index a7ee3c372..a007317e7 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb @@ -1,2109 +1,2115 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "87056cee", - "metadata": {}, - "source": [ - "# Quickstart for knockout option pricing model documentation\n", - "\n", - "Welcome! Let's get you started with the basic process of documenting models with ValidMind.\n", - "\n", - "A knockout option is a barrier option that ceases to exist if the underlying asset hits a predetermined price, known as the \"barrier.\" This barrier level, set above or below the current market price, determines whether the option will \"knock out\" before its expiration date. There are two types: \"up-and-out\" and \"down-and-out.\" In an up-and-out knockout option, the option expires if the asset price rises above the barrier, while in a down-and-out, it expires if the asset price falls below. 
Knockout options generally offer a lower premium than standard options since there is a chance they will expire worthless if the barrier is reached.\n", - "\n", - "Pricing knockout options involves accounting for the proximity of the asset's price to the barrier, as well as market volatility and the option’s time to expiration. High volatility and longer expiry increase the likelihood of the barrier being triggered, which reduces the option’s value. Models like modified Black-Scholes are used for simpler cases, while Monte Carlo simulations or binomial trees handle complex scenarios. Knockout options are useful for hedging or cost-effective investment strategies, allowing investors to save on premiums but with the risk of losing the option entirely if the barrier is hit.\n", - "\n", - "You will learn how to initialize the ValidMind Library, develop a option pricing model, and then write custom tests that can be used for sensitivity and stress testing to quickly generate documentation about model." 
- ] - }, - { - "cell_type": "markdown", - "id": "7417dfe1", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Model development](#toc3__) \n", - "- [Data Preparation](#toc4__) \n", - " - [Synthetic data generation](#toc4_1__) \n", - " - [Initialize the ValidMind datasets](#toc4_2__) \n", - " - [Data Quality](#toc4_3__) \n", - " - [Outliers detection using IQR method](#toc4_3_1__) \n", - " - [Isolation Forest Outliers Test](#toc4_3_2__) \n", - " - [Model Calibration](#toc4_4__) \n", - " - [Synthetic Data Calibration Test](#toc4_5__) \n", - " - [Model Evaluation](#toc4_6__) \n", - " - [Benchmark Testing](#toc4_6_1__) \n", - " - [Sensitivity Testing](#toc4_6_2__) \n", - " - [Greeks](#toc4_6_3__) \n", - " - [Delta](#toc4_7__) \n", - " - [Gamma](#toc4_8__) \n", - " - [Theta](#toc4_9__) \n", - " - [Vega](#toc4_10__) \n", - " - [Rho](#toc4_11__) \n", - " - [Stress Testing](#toc4_11_1__) \n", - "- [Next steps](#toc5__) \n", - " - [Work with your model documentation](#toc5_1__) \n", - " - [Discover more learning resources](#toc5_2__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "1426d212", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. 
These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "f8812717", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "b792f6a9", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c3d26e61", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "f3db6c9b", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "e1865b8d", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. 
(Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "214572ff", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Capital markets`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "8b9547ad", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0cc9c04c", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "e928f7e5", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9edb42a2", - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "import pandas as pd\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "from scipy.optimize import minimize\n", - "\n", - "from validmind.tests import run_test" - ] - }, - { - "cell_type": "markdown", - "id": "a2403294", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3dfd04dd", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "d79d9953", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Model development" - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "id": "c3f5b0b9", - "metadata": {}, - "outputs": [], - "source": [ - "class OptionPricing:\n", - " def __init__(self, S0, K, T, r):\n", - " self.S0 = S0\n", - " self.K = K\n", - " self.T = T\n", - " self.r = r\n", - "\n", - " def monte_carlo_simulation(self, N, M):\n", - " raise NotImplementedError(\"Must be implemented by subclasses\")\n", - "\n", - " def price_option(self, N, M):\n", - " raise NotImplementedError(\"Must be implemented by subclasses\")\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a9d7f832", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "class BlackScholesModel(OptionPricing):\n", - " def __init__(self, S0, K, T, r, sigma):\n", - " super().__init__(S0, K, T, r)\n", - " self.sigma = sigma\n", - " def monte_carlo_simulation(self, N, M):\n", - " dt = self.T / M\n", - " price_paths = np.zeros((N, M + 1))\n", - " price_paths[:, 0] = self.S0\n", - " for t in range(1, M + 1):\n", - " Z = np.random.standard_normal(N)\n", - " price_paths[:, t] = price_paths[:, t - 1] * np.exp((self.r - 0.5 * self.sigma**2) * dt + self.sigma * np.sqrt(dt) * Z)\n", - " return price_paths\n", - "\n", - " def price_option(self, N, M):\n", - " price_paths = self.monte_carlo_simulation(N, M)\n", - " payoffs = np.maximum(price_paths[:, -1] - self.K, 0)\n", - " return np.exp(-self.r * self.T) * np.mean(payoffs)\n", - " \n", - " def 
calibrate(self, market_prices, strikes, maturities):\n", - " def objective_function(params):\n", - " self.sigma = params[0]\n", - " for K, T in zip(strikes, maturities):\n", - " self.K = K\n", - " self.T = T\n", - " model_prices.append(self.price_option(10000, 100))\n", - " return np.sum((np.array(market_prices) - np.array(model_prices))**2)\n", - " result = minimize(objective_function, [self.sigma], bounds=[(0.01, 1.0)])\n", - " self.sigma = result.x[0]\n", - "\n", - "class StochasticVolatilityModel(OptionPricing):\n", - " def __init__(self, S0, K, T, r, v0, kappa, theta, xi, rho):\n", - " super().__init__(S0, K, T, r)\n", - " self.v0 = v0\n", - " self.kappa = kappa\n", - " self.theta = theta\n", - " self.xi = xi\n", - " self.rho = rho\n", - " def monte_carlo_simulation(self, N, M):\n", - " dt = self.T / M\n", - " price_paths = np.zeros((N, M + 1))\n", - " vol_paths = np.zeros((N, M + 1))\n", - " price_paths[:, 0] = self.S0\n", - " vol_paths[:, 0] = self.v0\n", - " for t in range(1, M + 1):\n", - " Z1 = np.random.standard_normal(N)\n", - " Z2 = np.random.standard_normal(N)\n", - " W1 = Z1\n", - " W2 = self.rho * Z1 + np.sqrt(1 - self.rho**2) * Z2\n", - " vol_paths[:, t] = np.abs(vol_paths[:, t - 1] + self.kappa * (self.theta - vol_paths[:, t - 1]) * dt + self.xi * np.sqrt(vol_paths[:, t - 1] * dt) * W1)\n", - " price_paths[:, t] = price_paths[:, t - 1] * np.exp((self.r - 0.5 * vol_paths[:, t - 1]) * dt + np.sqrt(vol_paths[:, t - 1] * dt) * W2)\n", - " return price_paths\n", - "\n", - " def price_option(self, N, M):\n", - " price_paths = self.monte_carlo_simulation(N, M)\n", - " payoffs = np.maximum(price_paths[:, -1] - self.K, 0)\n", - " return np.exp(-self.r * self.T) * np.mean(payoffs)\n", - " \n", - " def calibrate(self, market_prices, strikes, maturities):\n", - " def objective_function(params):\n", - " self.v0, self.kappa, self.theta, self.xi, self.rho = params\n", - " model_prices = []\n", - " for K, T in zip(strikes, maturities):\n", - " self.K = K\n", - " 
self.T = T\n", - " model_prices.append(self.price_option(10000, 100))\n", - "\n", - " return np.sum((np.array(market_prices) - np.array(model_prices))**2)\n", - " \n", - " initial_guess = [self.v0, self.kappa, self.theta, self.xi, self.rho]\n", - " bounds = [(0.01, 1.0), (0.01, 5.0), (0.01, 1.0), (0.01, 1.0), (-1.0, 1.0)]\n", - " result = minimize(objective_function, initial_guess, bounds=bounds)\n", - " self.v0, self.kappa, self.theta, self.xi, self.rho = result.x\n", - "\n", - "\n", - "class KnockoutOption:\n", - " def __init__(self, model, S0, K, T, r, barrier):\n", - " self.model = model\n", - " self.S0 = S0\n", - " self.K = K\n", - " self.T = T\n", - " self.r = r\n", - " self.barrier = barrier\n", - "\n", - " def price_knockout_option(self, N, M):\n", - " dt = self.T / M\n", - " price_paths = np.zeros((N, M + 1))\n", - " vol_paths = np.zeros((N, M + 1)) if isinstance(self.model, StochasticVolatilityModel) else None\n", - " price_paths[:, 0] = self.S0\n", - " if vol_paths is not None:\n", - " vol_paths[:, 0] = self.model.v0\n", - " \n", - " for t in range(1, M + 1):\n", - " Z1 = np.random.standard_normal(N)\n", - " if vol_paths is None:\n", - " # Black-Scholes Model\n", - " price_paths[:, t] = price_paths[:, t - 1] * np.exp(\n", - " (self.r - 0.5 * self.model.sigma**2) * dt + self.model.sigma * np.sqrt(dt) * Z1\n", - " )\n", - " else:\n", - " # Stochastic Volatility Model\n", - " Z2 = np.random.standard_normal(N)\n", - " W1 = Z1\n", - " W2 = self.model.rho * Z1 + np.sqrt(1 - self.model.rho**2) * Z2\n", - " vol_paths[:, t] = np.abs(vol_paths[:, t - 1] + self.model.kappa * (self.model.theta - vol_paths[:, t - 1]) * dt + self.model.xi * np.sqrt(vol_paths[:, t - 1] * dt) * W1)\n", - " price_paths[:, t] = price_paths[:, t - 1] * np.exp(\n", - " (self.r - 0.5 * vol_paths[:, t - 1]) * dt + np.sqrt(vol_paths[:, t - 1] * dt) * W2\n", - " )\n", - " \n", - " # Knockout condition\n", - " price_paths[:, t][price_paths[:, t] >= self.barrier] = 0\n", - " payoffs = 
np.maximum(price_paths[:, -1] - self.K, 0)\n", - " return np.exp(-self.r * self.T) * np.mean(payoffs)" - ] - }, - { - "cell_type": "markdown", - "id": "14bcdbb9", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Data Preparation" - ] - }, - { - "cell_type": "markdown", - "id": "f655dc9c", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Synthetic data generation" - ] - }, - { - "cell_type": "code", - "execution_count": 34, - "id": "42cb9070", - "metadata": {}, - "outputs": [], - "source": [ - "def generate_synthetic_market_data(model, strikes, maturities):\n", - " market_prices = []\n", - " market_data = []\n", - " for K, T in zip(strikes, maturities):\n", - " model.K = K\n", - " model.T = T\n", - " market_prices.append(model.price_option(10000, 100))\n", - " market_data.append({\"strike\": K, \"option_price\": model.price_option(10000, 100)})\n", - " return market_prices, market_data\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2854fbe3", - "metadata": {}, - "outputs": [], - "source": [ - "N = 10000\n", - "M = 100\n", - "\n", - "# Parameters for synthetic data\n", - "S0 = 100\n", - "K = 100\n", - "T = 1\n", - "r = 0.05\n", - "# BlackSholes\n", - "true_sigma = 0.2\n", - "\n", - "# Stochastic Volatility\n", - "true_v0 = 0.2\n", - "true_kappa = 2.0\n", - "true_theta = 0.2\n", - "true_xi = 0.1\n", - "true_rho = -0.5\n", - "\n", - "# Synthetic data generation parameters\n", - "strikes = list(np.linspace(75, 130, 25))\n", - "maturities = list(np.linspace(0.2, 3.0, 25))\n", - "\n", - "# Generate synthetic market data using the true parameters\n", - "bs_model = BlackScholesModel(S0, K, T, r, true_sigma)\n", - "bs_market_prices, bs_market_data = generate_synthetic_market_data(bs_model, strikes, maturities)\n", - "\n", - "\n", - "sv_model = StochasticVolatilityModel(S0, K, T, r, true_v0, true_kappa, true_theta, true_xi, true_rho)\n", - "sv_market_prices, sv_market_data = 
generate_synthetic_market_data(sv_model, strikes, maturities)\n" - ] - }, - { - "cell_type": "markdown", - "id": "b54c4950", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7f3498dd", - "metadata": {}, - "outputs": [], - "source": [ - "bs_market_data_df = pd.DataFrame(bs_market_data)\n", - "vm_bs_market_data = vm.init_dataset(\n", - " dataset=bs_market_data_df,\n", - " input_id=\"sv_market_data\",\n", - ")\n", - "\n", - "sv_market_data_df = pd.DataFrame(sv_market_data)\n", - "vm_sv_market_data = vm.init_dataset(\n", - " dataset=sv_market_data_df,\n", - " input_id=\"sv_market_data\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "7b36b59c", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Data Quality\n", - "Let's check quality of the data using outliers and missing data tests." - ] - }, - { - "cell_type": "markdown", - "id": "671330b1", - "metadata": {}, - "source": [ - "<a id='toc4_3_1__'></a>\n", - "\n", - "#### Outliers detection using IQR method\n", - "Let's visualizes the distribution of outliers in the option_price feature using the Interquartile Range (IQR) method." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f1c1ab6f", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IQROutliersBarPlot:BlackScholes\",\n", - " inputs={\n", - " \"dataset\": vm_bs_market_data,\n", - " },\n", - " title=\"Outliers detection using IQR method for BlackScholes\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6b5e8654", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IQROutliersTable:BlackScholes\",\n", - " inputs={\n", - " \"dataset\": vm_bs_market_data,\n", - " },\n", - " title=\"Outliers table using IQR method for BlackScholes\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d96f10c7", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IQROutliersBarPlot:StochasticVolatility\",\n", - " inputs={\n", - " \"dataset\": vm_sv_market_data,\n", - " },\n", - " title=\"Outliers detection using IQR method for StochasticVolatility\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "758c4c57", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IQROutliersTable:StochasticVolatility\",\n", - " inputs={\n", - " \"dataset\": vm_sv_market_data,\n", - " },\n", - " title=\"Outliers table using IQR method for StochasticVolatility\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "b1430200", - "metadata": {}, - "source": [ - "<a id='toc4_3_2__'></a>\n", - "\n", - "#### Isolation Forest Outliers Test\n", - "Let's detects anomalies in the dataset using the Isolation Forest algorithm, visualized through scatter plots." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9eb91453", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IsolationForestOutliers:BlackScholes\",\n", - " inputs={\n", - " \"dataset\": vm_bs_market_data,\n", - " },\n", - " title=\"Outliers detection using Isolation Forest for BlackScholes\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "12940f8e", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IsolationForestOutliers:StochasticVolatility\",\n", - " inputs={\n", - " \"dataset\": vm_sv_market_data,\n", - " },\n", - " title=\"Outliers detection using Isolation Forest for StochasticVolatility\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "f30e5579", - "metadata": {}, - "source": [ - "##### Missing Values Test\n", - "Let's evaluates dataset quality by ensuring the missing value ratio across all features does not exceed a set threshold." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "805ddb1c", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.MissingValues:BlackScholes\",\n", - " inputs={\n", - " \"dataset\": vm_bs_market_data,\n", - " },\n", - " title=\"Missing Values detection for BlackScholes\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e69e0039", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "result = run_test(\n", - " \"validmind.data_validation.MissingValues:StochasticVolatility\",\n", - " inputs={\n", - " \"dataset\": vm_sv_market_data,\n", - " },\n", - " title=\"MissingValues detection for StochasticVolatility\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "09628809", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### Model Calibration\n", - "* Clearly state the purpose of the calibration process. For example, in the context of an option pricing model, calibration aims to adjust model parameters to fit market data (e.g., market option prices, volatility surfaces).\n", - "* Specify whether the calibration is to historical data, current market data, or a blend of both." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6802c26e", - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "@vm.test(\"my_custom_tests.SyntheticDataCalibrationTest\")\n", - "def generate_synthetic_data_summary(option_pricing_model, strikes, maturities, synthetic_prices):\n", - " \"\"\"\n", - " This function will use synthetic prices to calibrate each model\n", - " and then generate derived prices based on the calibrated parameters.\n", - " It will output a DataFrame summarizing the strikes, maturities,\n", - " synthetic and derived prices, and the model parameters.\n", - "\n", - " \"\"\"\n", - " derived_prices = []\n", - " for K, T in zip(strikes, maturities):\n", - " option_pricing_model.K = K\n", - " option_pricing_model.T = T\n", - " derived_prices.append(option_pricing_model.price_option(10000, 100))\n", - " \n", - " model_type = type(option_pricing_model).__name__\n", - " data = {\n", - " \"Strike\": strikes,\n", - " \"Maturity\": maturities,\n", - " \"Synthetic_Price\": synthetic_prices,\n", - " \"Derived_Price\": derived_prices,\n", - " \"Model_Type\": model_type,\n", - " \"S0\": [option_pricing_model.S0] * len(strikes),\n", - " \"K\": [option_pricing_model.K] * len(strikes),\n", - " \"T\": [option_pricing_model.T] * len(strikes),\n", - " \"r\": [option_pricing_model.r] * len(strikes)\n", - " }\n", - " \n", - " if model_type == \"BlackScholesModel\":\n", - " data[\"sigma\"] = [option_pricing_model.sigma] * len(strikes)\n", - " elif model_type == \"StochasticVolatilityModel\":\n", - " data[\"v0\"] = [option_pricing_model.v0] * len(strikes)\n", - " data[\"kappa\"] = [option_pricing_model.kappa] * len(strikes)\n", - " data[\"theta\"] = [option_pricing_model.theta] * len(strikes)\n", - " data[\"xi\"] = [option_pricing_model.xi] * len(strikes)\n", - " data[\"rho\"] = [option_pricing_model.rho] * len(strikes)\n", - " \n", - " df = pd.DataFrame(data)\n", - " return df\n" - ] - }, - { - "cell_type": 
"markdown", - "id": "3bf04d21", - "metadata": {}, - "source": [ - "<a id='toc4_5__'></a>\n", - "\n", - "### Synthetic Data Calibration Test\n", - "Let's evaluates the accuracy of a stochastic volatility model by comparing synthetic prices with derived prices after model calibration." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "4345cb5c", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.SyntheticDataCalibrationTest\",\n", - " params={\n", - " \"option_pricing_model\": sv_model,\n", - " \"strikes\": strikes,\n", - " \"maturities\": maturities,\n", - " \"synthetic_prices\": sv_market_prices\n", - " },\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "4d48f107", - "metadata": {}, - "source": [ - "<a id='toc4_6__'></a>\n", - "\n", - "### Model Evaluation" - ] - }, - { - "cell_type": "markdown", - "id": "8ec8b5a3", - "metadata": {}, - "source": [ - "<a id='toc4_6_1__'></a>\n", - "\n", - "#### Benchmark Testing\n", - "* Compare the model’s performance with alternative models or industry-standard models to assess its relative effectiveness.\n", - "* Ensure that the model is competitive in pricing, accuracy, and computational efficiency." 
- ] - }, - { - "cell_type": "code", - "execution_count": 47, - "id": "ac733262", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.BenchmarkTest\")\n", - "def benchmark_test(bs_model, sv_model, strikes, maturities):\n", - " \"\"\"\n", - " Comparison between Black Scholes and stochastic volatility model\n", - "\n", - " \"\"\"\n", - " bs_model_type = type(bs_model).__name__\n", - " sv_model_type = type(sv_model).__name__\n", - "\n", - " bs_derived_prices = []\n", - " sv_derived_prices = []\n", - " for K in strikes:\n", - " bs_model.K = K\n", - " bs_derived_prices.append(bs_model.price_option(10000, 100))\n", - " sv_model.K = K\n", - " sv_derived_prices.append(sv_model.price_option(10000, 100))\n", - "\n", - " data = {\n", - " \"Strike\": strikes,\n", - " \"Maturities\": [sv_model.T] * len(strikes),\n", - " \"bs_model_price\": bs_derived_prices,\n", - " \"sv_model_price\": sv_derived_prices,\n", - "\n", - " }\n", - " df1 = pd.DataFrame(data)\n", - "\n", - " bs_derived_prices = []\n", - " sv_derived_prices = []\n", - " for T in maturities:\n", - " bs_model.T = T\n", - " bs_derived_prices.append(bs_model.price_option(10000, 100))\n", - " sv_model.T = T\n", - " sv_derived_prices.append(sv_model.price_option(10000, 100))\n", - "\n", - " data = {\n", - " \"Strike\": [sv_model.K] * len(maturities),\n", - " \"Maturities\": maturities,\n", - " \"bs_model_price\": bs_derived_prices,\n", - " \"sv_model_price\": sv_derived_prices,\n", - " }\n", - "\n", - " df2 = pd.DataFrame(data)\n", - "\n", - " return {\"strikes variation benchmarking\": df1}, {\"maturities variation benchmarking\": df2}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "20de9858", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.BenchmarkTest\",\n", - " params={\n", - " \"sv_model\": sv_model,\n", - " \"bs_model\": bs_model,\n", - " \"strikes\": strikes,\n", - " \"maturities\": maturities,\n", - " },\n", - 
")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "d9ad15b8", - "metadata": {}, - "source": [ - "##### Surface Volatility Test\n", - "Let's calculates the implied volatility across different strikes and maturities based on market prices" - ] - }, - { - "cell_type": "code", - "execution_count": 49, - "id": "46e275e3", - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "from scipy.optimize import minimize\n", - "import plotly.graph_objects as go\n", - "\n", - "@vm.test(\"my_custom_tests.ImpliedVolSurface\")\n", - "def implied_vol_surface(market_prices, strikes, maturities, S0, r, barrier, N=10000, M=100):\n", - " \"\"\"\n", - " This is a test to compute the implied volatility surface for a given set of market prices,\n", - " strikes, and maturities.\n", - " \"\"\"\n", - " def implied_volatility(market_price, N, M, initial_guess=0.2):\n", - " def objective_function(sigma):\n", - " model.sigma = sigma\n", - " model_price = model.price_option(N, M)\n", - " return (model_price - market_price) ** 2\n", - "\n", - " result = minimize(objective_function, initial_guess, bounds=[(0.01, 1.0)])\n", - " return result.x[0]\n", - " \n", - " implied_vols = np.zeros((len(strikes), len(maturities)))\n", - "\n", - " for i, K in enumerate(strikes):\n", - " for j, T in enumerate(maturities):\n", - " market_price = market_prices[i]\n", - " model = BlackScholesModel(S0, K, T, r, sigma=0.2)\n", - "\n", - " implied_vol = implied_volatility(market_price, N, M)\n", - " implied_vols[i, j] = implied_vol\n", - "\n", - " # Create the 3D surface plot\n", - " X, Y = np.meshgrid(strikes, maturities)\n", - " Z = implied_vols.T # Transpose to match the meshgrid orientation\n", - "\n", - " fig = go.Figure(data=[go.Surface(x=X, y=Y, z=Z)])\n", - " \n", - " # Update the layout\n", - " fig.update_layout(\n", - " title=f'3D Surface Plot of Implied Volatility',\n", - " scene=dict(\n", - " xaxis_title='Strike',\n", - " 
yaxis_title='Maturity',\n", - " zaxis_title='Implied Volatility',\n", - " camera=dict(\n", - " up=dict(x=0, y=0, z=1),\n", - " center=dict(x=0, y=0, z=0),\n", - " eye=dict(x=1.5, y=1.5, z=1.5)\n", - " )\n", - " ),\n", - " width=900,\n", - " height=700,\n", - " margin=dict(l=65, r=50, b=65, t=90)\n", - " )\n", - "\n", - " return fig" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "66ca002a", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.ImpliedVolSurface\",\n", - " params={\n", - " \"market_prices\": sv_market_prices,\n", - " \"strikes\": strikes,\n", - " \"maturities\": maturities,\n", - " \"S0\": S0,\n", - " \"r\": r,\n", - " \"barrier\": 120\n", - " }\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "a49d8a1e", - "metadata": {}, - "source": [ - "<a id='toc4_6_2__'></a>\n", - "\n", - "#### Sensitivity Testing" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "784a5e7c", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "@vm.test(\"my_custom_tests.Sensitivity\")\n", - "def sensitivity_test(model_type, S0, T, r, N, M, strike=None, barrier=None, sigma=None, v0=None, kappa=None,theta=None, xi=None, rho=None):\n", - " \"\"\"\n", - " This is sensitivity test\n", - "\"\"\"\n", - " if model_type == 'BS':\n", - " model = BlackScholesModel(S0, strike, T, r, sigma)\n", - " else:\n", - " model = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", - " \n", - " knockout_option = KnockoutOption(model, S0, strike, T, r, barrier)\n", - " price = knockout_option.price_knockout_option(N, M)\n", - "\n", - " return pd.DataFrame({\"Option price\": [price]})" - ] - }, - { - "cell_type": "markdown", - "id": "d4be30e6", - "metadata": {}, - "source": [ - "##### Initialise parameters" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "46878b84", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - 
"strike_range = (min(strikes), max(strikes))\n", - "barrier_range = (100, 120)" - ] - }, - { - "cell_type": "markdown", - "id": "205c46ce", - "metadata": {}, - "source": [ - "##### Common plot function\n", - "Let's create a line plot using the default result output data and log it by passing the function through the `post_process_fn` parameter in the `run_test()` method." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d4b9ea2f", - "metadata": {}, - "outputs": [], - "source": [ - "from plotly.express import bar\n", - "from validmind.vm_models.figure import Figure\n", - "from validmind.vm_models.result import TestResult\n", - "import plotly.graph_objects as go\n", - "import random\n", - "\n", - "def process_results(result: TestResult):\n", - "\n", - " # Convert to DataFrame\n", - " df = pd.DataFrame(result.tables[0].data)\n", - " \n", - " # Get the first two column names\n", - " x_col = df.columns[0]\n", - " y_col = df.columns[1]\n", - " \n", - " # Create figure\n", - " fig = go.Figure()\n", - " fig.add_trace(\n", - " go.Scatter(\n", - " x=df[x_col],\n", - " y=df[y_col],\n", - " mode='lines',\n", - " name=y_col # Use y-axis column name as trace name\n", - " )\n", - " )\n", - " \n", - " fig.update_layout(\n", - " xaxis_title=x_col,\n", - " yaxis_title=y_col,\n", - " showlegend=True,\n", - " template=\"plotly_white\"\n", - " )\n", - "\n", - " result.add_figure(\n", - " Figure(\n", - " figure=fig,\n", - " key=\"sensitivity_plot_\" + str(random.randint(0, 1000000)),\n", - " ref_id=result.ref_id,\n", - " )\n", - " )\n", - "\n", - " return result" - ] - }, - { - "cell_type": "markdown", - "id": "528b409c", - "metadata": {}, - "source": [ - "##### Strike sensitivity Test\n", - "Let's evaluates the sensitivity of a model's output value to changes in the strike price, while keeping other parameters constant.\n", - "This test is crucial for understanding how variations in strike prices affect the valuation of financial derivatives, particularly 
options." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bb8f1cab", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Sensitivity:S0\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\":[strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn= process_results\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e566a681", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Sensitivity:ToStrike\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": list(np.linspace(strike_range[0], strike_range[1], 20)),\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn= process_results\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "0f288663", - "metadata": {}, - "source": [ - "##### Barrier Sensitivity Test\n", - "Let's evaluates the sensitivity of a model's output to changes in the barrier level of a financial derivative, specifically a barrier option. This test is crucial for understanding how small changes in the barrier can impact the option's valuation, which is essential for risk management and pricing strategies." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "95f81283", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Sensitivity:ToBarrier\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": list(np.linspace(barrier_range[0], barrier_range[1], 20)),\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn=process_results\n", - "\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "3201aa09", - "metadata": {}, - "source": [ - "<a id='toc4_6_3__'></a>\n", - "\n", - "#### Greeks\n", - "These Greeks are crucial for traders and risk managers as they provide insights into the risk and potential price movements of options and derivatives, allowing for more informed decision-making and risk management strategies." - ] - }, - { - "cell_type": "markdown", - "id": "f31afc73", - "metadata": {}, - "source": [ - "<a id='toc4_7__'></a>\n", - "\n", - "### Delta\n", - "Let's measures the sensitivity of the option's price to a change in the price of the underlying asset. It indicates how much the price of an option is expected to move per $1 change in the underlying asset's price." 
- ] - }, - { - "cell_type": "code", - "execution_count": 30, - "id": "31befc58", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.GreeksDelta\")\n", - "def calculate_delta(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", - " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", - " h=0.001): # h is the step size for finite difference\n", - " \"\"\"\n", - " Calculate delta using finite difference method.\n", - " Delta = (V(S0 + h) - V(S0 - h)) / (2h)\n", - " where V is the option price and h is a small increment\n", - " \"\"\"\n", - " # Initialize the model with S0 + h\n", - " if model_type == 'BS':\n", - " model_up = BlackScholesModel(S0 + h, strike, T, r, sigma)\n", - " model_down = BlackScholesModel(S0 - h, strike, T, r, sigma)\n", - " else:\n", - " model_up = StochasticVolatilityModel(S0 + h, strike, T, r, v0, kappa, theta, xi, rho)\n", - " model_down = StochasticVolatilityModel(S0 - h, strike, T, r, v0, kappa, theta, xi, rho)\n", - " \n", - "\n", - " # Calculate option prices for up and down moves\n", - " knockout_up = KnockoutOption(model_up, S0 + h, strike, T, r, barrier)\n", - " knockout_down = KnockoutOption(model_down, S0 - h, strike, T, r, barrier)\n", - " \n", - " price_up = knockout_up.price_knockout_option(N, M)\n", - " price_down = knockout_down.price_knockout_option(N, M)\n", - " \n", - " # Calculate delta using central difference\n", - " delta = (price_up - price_down) / (2 * h)\n", - " df = pd.DataFrame({\"Delta\": [delta], \"Price_Up\": [price_up], \"Price_Down\": [price_down], \"h\": [h]})\n", - " return df\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a033dd96", - "metadata": {}, - "outputs": [], - "source": [ - "# To analyze delta sensitivity to underlying price changes\n", - "result = run_test(\n", - " \"my_custom_tests.GreeksDelta\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [1000000],\n", - " \"M\": [M],\n", - " 
\"strike\":[strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " \"h\": [0.001]\n", - " },\n", - "post_process_fn=process_results # Using the plotting function defined earlier\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "0826d4dc", - "metadata": {}, - "source": [ - "<a id='toc4_8__'></a>\n", - "\n", - "### Gamma\n", - "Let's measures the rate of change of Delta with respect to changes in the underlying asset's price. It indicates the curvature of the option's price relative to the underlying asset's price." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ccf54452", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.GreeksGamma\")\n", - "def calculate_gamma(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", - " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", - " h=0.01): # h is the step size for finite difference\n", - " \"\"\"\n", - " Calculate gamma using finite difference method.\n", - " Gamma = (V(S0 + h) - 2V(S0) + V(S0 - h)) / h^2\n", - " where V is the option price and h is a small increment\n", - " \"\"\"\n", - " # Initialize the models with S0 + h, S0, and S0 - h\n", - " if model_type == 'BS':\n", - " model_up = BlackScholesModel(S0 + h, strike, T, r, sigma)\n", - " model_center = BlackScholesModel(S0, strike, T, r, sigma)\n", - " model_down = BlackScholesModel(S0 - h, strike, T, r, sigma)\n", - " else:\n", - " model_up = StochasticVolatilityModel(S0 + h, strike, T, r, v0, kappa, theta, xi, rho)\n", - " model_center = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", - " model_down = StochasticVolatilityModel(S0 - h, strike, T, r, v0, kappa, theta, xi, rho)\n", - " \n", - " # Calculate option 
prices for up, center, and down moves\n", - " knockout_up = KnockoutOption(model_up, S0 + h, strike, T, r, barrier)\n", - " knockout_center = KnockoutOption(model_center, S0, strike, T, r, barrier)\n", - " knockout_down = KnockoutOption(model_down, S0 - h, strike, T, r, barrier)\n", - " \n", - " price_up = knockout_up.price_knockout_option(N, M)\n", - " price_center = knockout_center.price_knockout_option(N, M)\n", - " price_down = knockout_down.price_knockout_option(N, M)\n", - " \n", - " # Calculate gamma using second-order central difference\n", - " gamma = (price_up - 2*price_center + price_down) / (h * h)\n", - " \n", - " df = pd.DataFrame({\n", - " \"Gamma\": [gamma], \n", - " \"Price_Up\": [price_up], \n", - " \"Price_Center\": [price_center],\n", - " \"Price_Down\": [price_down], \n", - " \"h\": [h]\n", - " })\n", - " return df\n", - "\n", - "# To analyze gamma sensitivity to underlying price changes\n", - "result = run_test(\n", - " \"my_custom_tests.GreeksGamma\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [1000000],\n", - " \"M\": [M],\n", - " \"strike\":[strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " \"h\": [0.1]\n", - " },\n", - " post_process_fn=process_results # Using the plotting function defined earlier\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "df0eaa72", - "metadata": {}, - "source": [ - "<a id='toc4_9__'></a>\n", - "\n", - "### Theta\n", - "Let's measures the sensitivity of the option's price to the passage of time, also known as time decay. It indicates how much the price of an option is expected to decrease as the option approaches its expiration date." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0e9810b1", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.GreeksTheta\")\n", - "def calculate_theta(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", - " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", - " dt=1/365): # dt is typically one day\n", - " \"\"\"\n", - " Calculate theta using finite difference method.\n", - " Theta = (V(t + dt) - V(t)) / dt\n", - " where V is the option price and dt is a small time increment (typically 1 day)\n", - " \"\"\"\n", - " # Initialize the models with T and T + dt\n", - " if model_type == 'BS':\n", - " model_current = BlackScholesModel(S0, strike, T, r, sigma)\n", - " model_future = BlackScholesModel(S0, strike, T + dt, r, sigma)\n", - " else:\n", - " model_current = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", - " model_future = StochasticVolatilityModel(S0, strike, T + dt, r, v0, kappa, theta, xi, rho)\n", - " \n", - " # Calculate option prices for current and future time\n", - " knockout_current = KnockoutOption(model_current, S0, strike, T, r, barrier)\n", - " knockout_future = KnockoutOption(model_future, S0, strike, T + dt, r, barrier)\n", - " \n", - " price_current = knockout_current.price_knockout_option(N, M)\n", - " price_future = knockout_future.price_knockout_option(N, M)\n", - " \n", - " # Calculate theta using forward difference\n", - " # Note: We divide by dt and multiply by -1 since theta represents the negative rate of change\n", - " theta_value = -1 * (price_future - price_current) / dt\n", - " \n", - " df = pd.DataFrame({\n", - " \"Theta\": [theta_value], \n", - " \"Price_Current\": [price_current],\n", - " \"Price_Future\": [price_future],\n", - " \"dt\": [dt]\n", - " })\n", - " return df\n", - "\n", - "# Example usage to analyze theta sensitivity across different underlying prices\n", - "result = run_test(\n", - " 
\"my_custom_tests.GreeksTheta\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [1000000],\n", - " \"M\": [M],\n", - " \"strike\":[strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " \"dt\": [1/365] # One day time step\n", - " },\n", - " post_process_fn=process_results # Using the plotting function defined earlier\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "28c60e1d", - "metadata": {}, - "source": [ - "<a id='toc4_10__'></a>\n", - "\n", - "### Vega\n", - "Let's measures the sensitivity of the option's price to changes in the volatility of the underlying asset. It indicates how much the price of an option is expected to change with a 1% change in the underlying asset's volatility." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1dbc6632", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.GreeksVega\")\n", - "def calculate_vega(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", - " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", - " h=0.001): # h is the step size for finite difference\n", - " \"\"\"\n", - " Calculate vega using finite difference method.\n", - " For Black-Scholes: Vega = (V(σ + h) - V(σ - h)) / (2h)\n", - " For Stochastic Vol: Vega = (V(v0 + h) - V(v0 - h)) / (2h)\n", - " where V is the option price and h is a small increment in volatility\n", - " \"\"\"\n", - " if model_type == 'BS':\n", - " # For Black-Scholes, perturb sigma\n", - " model_up = BlackScholesModel(S0, strike, T, r, sigma + h)\n", - " model_down = BlackScholesModel(S0, strike, T, r, sigma - h)\n", - " else:\n", - " # For Stochastic Volatility, perturb v0\n", - " model_up = StochasticVolatilityModel(S0, strike, T, r, v0 + h, 
kappa, theta, xi, rho)\n", - " model_down = StochasticVolatilityModel(S0, strike, T, r, v0 - h, kappa, theta, xi, rho)\n", - " \n", - " # Calculate option prices for up and down moves in volatility\n", - " knockout_up = KnockoutOption(model_up, S0, strike, T, r, barrier)\n", - " knockout_down = KnockoutOption(model_down, S0, strike, T, r, barrier)\n", - " \n", - " price_up = knockout_up.price_knockout_option(N, M)\n", - " price_down = knockout_down.price_knockout_option(N, M)\n", - " \n", - " # Calculate vega using central difference\n", - " vega = (price_up - price_down) / (2 * h)\n", - " \n", - " df = pd.DataFrame({\n", - " \"Vega\": [vega], \n", - " \"Price_Up\": [price_up], \n", - " \"Price_Down\": [price_down], \n", - " \"h\": [h]\n", - " })\n", - " return df\n", - "\n", - "# Example usage to analyze vega sensitivity across different underlying prices\n", - "result = run_test(\n", - " \"my_custom_tests.GreeksVega\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [1000000],\n", - " \"M\": [M],\n", - " \"strike\":[strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " \"h\": [0.0001] # Small step size for better accuracy\n", - " },\n", - " post_process_fn=process_results # Using the plotting function defined earlier\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "1ec51eba", - "metadata": {}, - "source": [ - "<a id='toc4_11__'></a>\n", - "\n", - "### Rho\n", - "Let's measures the sensitivity of the option's price to changes in the interest rate. It indicates how much the price of an option is expected to change with a 1% change in interest rates." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2f497b5f", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.GreeksRho\")\n", - "def calculate_rho(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", - " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", - " h=0.0001): # h is the step size for finite difference\n", - " \"\"\"\n", - " Calculate rho using finite difference method.\n", - " Rho = (V(r + h) - V(r - h)) / (2h)\n", - " where V is the option price and h is a small increment in interest rate\n", - " \"\"\"\n", - " # Initialize the models with r + h and r - h\n", - " if model_type == 'BS':\n", - " model_up = BlackScholesModel(S0, strike, T, r + h, sigma)\n", - " model_down = BlackScholesModel(S0, strike, T, r - h, sigma)\n", - " else:\n", - " model_up = StochasticVolatilityModel(S0, strike, T, r + h, v0, kappa, theta, xi, rho)\n", - " model_down = StochasticVolatilityModel(S0, strike, T, r - h, v0, kappa, theta, xi, rho)\n", - " \n", - " # Calculate option prices for up and down moves in interest rate\n", - " knockout_up = KnockoutOption(model_up, S0, strike, T, r + h, barrier)\n", - " knockout_down = KnockoutOption(model_down, S0, strike, T, r - h, barrier)\n", - " \n", - " price_up = knockout_up.price_knockout_option(N, M)\n", - " price_down = knockout_down.price_knockout_option(N, M)\n", - " \n", - " # Calculate rho using central difference\n", - " rho_value = (price_up - price_down) / (2 * h)\n", - " \n", - " df = pd.DataFrame({\n", - " \"Rho\": [rho_value], \n", - " \"Price_Up\": [price_up], \n", - " \"Price_Down\": [price_down], \n", - " \"h\": [h]\n", - " })\n", - " return df\n", - "\n", - "# Example usage to analyze rho sensitivity across different underlying prices\n", - "result = run_test(\n", - " \"my_custom_tests.GreeksRho\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [1000000],\n", - " \"M\": [M],\n", - " \"strike\":[strike_range[0]],\n", - " 
\"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " \"h\": [0.0001] # Small step size for better accuracy\n", - " },\n", - " post_process_fn=process_results # Using the plotting function defined earlier\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "0cdd1b1b", - "metadata": {}, - "source": [ - "<a id='toc4_11_1__'></a>\n", - "\n", - "#### Stress Testing" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c98ff396", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.Stressing\")\n", - "def sensitivity_test(model_type, S0, T, r, N, M, strike=None, barrier=None, sigma=None, v0=None, kappa=None,theta=None, xi=None, rho=None):\n", - " \"\"\"\n", - " This is stress test\n", - " \"\"\"\n", - " if model_type == 'BS':\n", - " model = BlackScholesModel(S0, strike, T, r, sigma)\n", - " else:\n", - " model = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", - " \n", - " knockout_option = KnockoutOption(model, S0, strike, T, r, barrier)\n", - " price = knockout_option.price_knockout_option(N, M)\n", - "\n", - " return pd.DataFrame({\"Option price\": [price]})" - ] - }, - { - "cell_type": "markdown", - "id": "b6f0a179", - "metadata": {}, - "source": [ - "##### Rho (correlation) and Theta (long term vol) stress test\n", - "First, we create a surface plot to visualize the option price with respect to two variables." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b408de0f", - "metadata": {}, - "outputs": [], - "source": [ - "def two_parameters_stress_surface_plot(result: TestResult):\n", - " import plotly.graph_objects as go\n", - " import numpy as np\n", - " import pandas as pd\n", - " # Convert to DataFrame\n", - " data = pd.DataFrame(result.tables[0].data)\n", - " \n", - " # Get column names (assuming first column is x, next two are y1 and y2)\n", - " z_col = data.columns[2]\n", - " x_col = data.columns[0]\n", - " y_col = data.columns[1]\n", - " \n", - " # Get unique values for x and y\n", - " x_unique = np.sort(data[x_col].unique())\n", - " y_unique = np.sort(data[y_col].unique())\n", - " \n", - " # Create meshgrid\n", - " X, Y = np.meshgrid(x_unique, y_unique)\n", - " \n", - " # Create Z matrix\n", - " Z = np.zeros_like(X)\n", - " for i, x_val in enumerate(x_unique):\n", - " for j, y_val in enumerate(y_unique):\n", - " mask = (data[x_col] == x_val) & (data[y_col] == y_val)\n", - " if mask.any():\n", - " Z[j, i] = data.loc[mask, z_col].iloc[0]\n", - " \n", - " # Create the 3D surface plot\n", - " fig = go.Figure(data=[go.Surface(x=X, y=Y, z=Z)])\n", - " \n", - " # Update the layout\n", - " fig.update_layout(\n", - " title=f'3D Surface Plot of {z_col}',\n", - " scene=dict(\n", - " xaxis_title=x_col,\n", - " yaxis_title=y_col,\n", - " zaxis_title=z_col,\n", - " camera=dict(\n", - " up=dict(x=0, y=0, z=1),\n", - " center=dict(x=0, y=0, z=0),\n", - " eye=dict(x=1.5, y=1.5, z=1.5)\n", - " )\n", - " ),\n", - " width=900,\n", - " height=700,\n", - " margin=dict(l=65, r=50, b=65, t=90)\n", - " )\n", - "\n", - " result.add_figure(\n", - " Figure(\n", - " figure=fig,\n", - " key=\"sensitivity_plot_\" + str(random.randint(0, 1000000)),\n", - " ref_id=result.ref_id,\n", - " )\n", - " )\n", - "\n", - " return result" - ] - }, - { - "cell_type": "markdown", - "id": "87289ee6", - "metadata": {}, - "source": [ - "Let's evaluates the sensitivity of a model's output 
to changes in the correlation parameter (rho) and the long-term variance parameter (theta) within a stochastic volatility framework.\n", - "\n", - "This test is useful for understanding how variations in these parameters affect the model's valuation, which is crucial for risk management and model validation." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5c0ec52d", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "\n", - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheRhoAndThetaParameters\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": list(np.linspace(0,0.8, 10)),\n", - " \"xi\": [0.1],\n", - " \"rho\": list(np.linspace(-1,0.8, 10)),\n", - " },\n", - " post_process_fn=two_parameters_stress_surface_plot\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "44be4c61", - "metadata": {}, - "source": [ - "##### Rho (correlation) and Xi (vol of vol) stress test" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e0a2996e", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "\n", - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheRhoAndXiParameters\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": list(np.linspace(0,0.8, 10)),\n", - " \"rho\": list(np.linspace(-1,0.8, 10)),\n", - " },\n", - " post_process_fn=two_parameters_stress_surface_plot\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "5fed568d", - 
"metadata": {}, - "source": [ - "##### Sigma stress test\n", - "evaluates the sensitivity of a model's output to changes in the volatility parameter, sigma. This test is crucial for understanding how variations in market volatility impact the model's valuation of financial instruments, particularly options.\n", - "\n", - "This test is useful for risk management and model validation, as it helps identify the robustness of the model under different market conditions. By analyzing the changes in the model's output as sigma varies, stakeholders can assess the model's stability and reliability." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d49e2e37", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheSigmaParameter\",\n", - " param_grid={\n", - " \"model_type\": ['BS'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"sigma\": list(np.linspace(0.2, 0.8, 10)),\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "4e7a1f00", - "metadata": {}, - "source": [ - "##### Stress kappa\n", - "Let's evaluates the sensitivity of a model's output to changes in the kappa parameter, which is a mean reversion rate in stochastic volatility models." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e995f6ae", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheKappaParameter\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": list(np.linspace(0, 8, 10)),\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "40d1c9e2", - "metadata": {}, - "source": [ - "##### Stress theta\n", - "Stress Theta evaluates the sensitivity of a model's output to changes in the parameter theta, which represents the long-term variance in a stochastic volatility model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7e371aee", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheThetaParameter\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": list(np.linspace(0, 0.8, 10)),\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "e20d074f", - "metadata": {}, - "source": [ - "##### Stress xi\n", - "Stress Xi evaluates the sensitivity of a model's output to changes in the parameter xi, which represents the volatility of volatility in a stochastic volatility model. 
This test is crucial for understanding how variations in xi impact the model's valuation, particularly in financial derivatives pricing." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9c545090", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheXiParameter\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": list(np.linspace(0.05, 0.95, 10)),\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "f0360e20", - "metadata": {}, - "source": [ - "##### Stress rho\n", - "Stress rho test evaluates the sensitivity of a model's output to changes in the correlation parameter, rho, within a stochastic volatility (SV) model framework. This test is crucial for understanding how variations in rho, which represents the correlation between the asset price and its volatility, impact the model's valuation output." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e2c5dfb1", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheRhoParameter\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": list(np.linspace(-1.0, 1.0, 20)),\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "61d4e596", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc5_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc5_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-a23adf093a60485ea005cf8fc18545a5", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-1QuffXMV-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.14" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart for knockout option pricing model documentation\n", + "\n", + "Welcome! 
Let's get you started with the basic process of documenting models with ValidMind.\n", + "\n", + "A knockout option is a barrier option that ceases to exist if the underlying asset hits a predetermined price, known as the \"barrier.\" This barrier level, set above or below the current market price, determines whether the option will \"knock out\" before its expiration date. There are two types: \"up-and-out\" and \"down-and-out.\" In an up-and-out knockout option, the option expires if the asset price rises above the barrier, while in a down-and-out, it expires if the asset price falls below. Knockout options generally offer a lower premium than standard options since there is a chance they will expire worthless if the barrier is reached.\n", + "\n", + "Pricing knockout options involves accounting for the proximity of the asset's price to the barrier, as well as market volatility and the option’s time to expiration. High volatility and longer expiry increase the likelihood of the barrier being triggered, which reduces the option’s value. Models like modified Black-Scholes are used for simpler cases, while Monte Carlo simulations or binomial trees handle complex scenarios. Knockout options are useful for hedging or cost-effective investment strategies, allowing investors to save on premiums but with the risk of losing the option entirely if the barrier is hit.\n", + "\n", + "You will learn how to initialize the ValidMind Library, develop an option pricing model, and then write custom tests that can be used for sensitivity and stress testing to quickly generate documentation about the model."
+ ], + "id": "87056cee" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the documentation template](#toc2_4__) \n", + "- [Model development](#toc3__) \n", + "- [Data Preparation](#toc4__) \n", + " - [Synthetic data generation](#toc4_1__) \n", + " - [Initialize the ValidMind datasets](#toc4_2__) \n", + " - [Data Quality](#toc4_3__) \n", + " - [Outliers detection using IQR method](#toc4_3_1__) \n", + " - [Isolation Forest Outliers Test](#toc4_3_2__) \n", + " - [Model Calibration](#toc4_4__) \n", + " - [Synthetic Data Calibration Test](#toc4_5__) \n", + " - [Model Evaluation](#toc4_6__) \n", + " - [Benchmark Testing](#toc4_6_1__) \n", + " - [Sensitivity Testing](#toc4_6_2__) \n", + " - [Greeks](#toc4_6_3__) \n", + " - [Delta](#toc4_7__) \n", + " - [Gamma](#toc4_8__) \n", + " - [Theta](#toc4_9__) \n", + " - [Vega](#toc4_10__) \n", + " - [Rho](#toc4_11__) \n", + " - [Stress Testing](#toc4_11_1__) \n", + "- [Next steps](#toc5__) \n", + " - [Work with your model documentation](#toc5_1__) \n", + " - [Discover more learning resources](#toc5_2__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "7417dfe1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "1426d212" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ], + "id": "f8812717" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ], + "id": "b792f6a9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "c3d26e61" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "f3db6c9b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "e1865b8d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. 
A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Capital markets`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "214572ff" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "8b9547ad" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "0cc9c04c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ], + "id": "e928f7e5" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%matplotlib inline\n", + "import pandas as pd\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "from scipy.optimize import minimize\n", + "\n", + "from validmind.tests import run_test" + ], + "execution_count": null, + "outputs": [], + "id": "9edb42a2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ], + "id": "a2403294" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "3dfd04dd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Model development" + ], + "id": "d79d9953" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "class OptionPricing:\n", + " def __init__(self, S0, K, T, r):\n", + " self.S0 = S0\n", + " self.K = K\n", + " self.T = T\n", + " self.r = r\n", + "\n", + " def monte_carlo_simulation(self, N, M):\n", + " raise NotImplementedError(\"Must be implemented by subclasses\")\n", + "\n", + " def price_option(self, N, M):\n", + " raise NotImplementedError(\"Must be implemented by subclasses\")\n" + ], + "execution_count": 32, + "outputs": [], + "id": "c3f5b0b9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "class BlackScholesModel(OptionPricing):\n", + " def __init__(self, S0, K, T, r, sigma):\n", + " super().__init__(S0, K, T, r)\n", + " self.sigma = sigma\n", + " def monte_carlo_simulation(self, N, M):\n", + " dt = self.T / M\n", + " price_paths = np.zeros((N, M + 1))\n", + " price_paths[:, 0] = self.S0\n", + " for t in range(1, M + 1):\n", + " Z = np.random.standard_normal(N)\n", + " price_paths[:, t] = price_paths[:, t - 1] * np.exp((self.r - 0.5 * self.sigma**2) * dt + self.sigma * np.sqrt(dt) * Z)\n", + " return price_paths\n", + "\n", + " def price_option(self, N, M):\n", + " price_paths = self.monte_carlo_simulation(N, M)\n", + " payoffs = np.maximum(price_paths[:, -1] - self.K, 0)\n", + " return np.exp(-self.r * self.T) * np.mean(payoffs)\n", + " \n", + " def calibrate(self, market_prices, strikes, 
maturities):\n", + " def objective_function(params):\n", + " self.sigma = params[0]\n", + " model_prices = []\n", + " for K, T in zip(strikes, maturities):\n", + " self.K = K\n", + " self.T = T\n", + " model_prices.append(self.price_option(10000, 100))\n", + " return np.sum((np.array(market_prices) - np.array(model_prices))**2)\n", + " result = minimize(objective_function, [self.sigma], bounds=[(0.01, 1.0)])\n", + " self.sigma = result.x[0]\n", + "\n", + "class StochasticVolatilityModel(OptionPricing):\n", + " def __init__(self, S0, K, T, r, v0, kappa, theta, xi, rho):\n", + " super().__init__(S0, K, T, r)\n", + " self.v0 = v0\n", + " self.kappa = kappa\n", + " self.theta = theta\n", + " self.xi = xi\n", + " self.rho = rho\n", + " def monte_carlo_simulation(self, N, M):\n", + " dt = self.T / M\n", + " price_paths = np.zeros((N, M + 1))\n", + " vol_paths = np.zeros((N, M + 1))\n", + " price_paths[:, 0] = self.S0\n", + " vol_paths[:, 0] = self.v0\n", + " for t in range(1, M + 1):\n", + " Z1 = np.random.standard_normal(N)\n", + " Z2 = np.random.standard_normal(N)\n", + " W1 = Z1\n", + " W2 = self.rho * Z1 + np.sqrt(1 - self.rho**2) * Z2\n", + " vol_paths[:, t] = np.abs(vol_paths[:, t - 1] + self.kappa * (self.theta - vol_paths[:, t - 1]) * dt + self.xi * np.sqrt(vol_paths[:, t - 1] * dt) * W1)\n", + " price_paths[:, t] = price_paths[:, t - 1] * np.exp((self.r - 0.5 * vol_paths[:, t - 1]) * dt + np.sqrt(vol_paths[:, t - 1] * dt) * W2)\n", + " return price_paths\n", + "\n", + " def price_option(self, N, M):\n", + " price_paths = self.monte_carlo_simulation(N, M)\n", + " payoffs = np.maximum(price_paths[:, -1] - self.K, 0)\n", + " return np.exp(-self.r * self.T) * np.mean(payoffs)\n", + " \n", + " def calibrate(self, market_prices, strikes, maturities):\n", + " def objective_function(params):\n", + " self.v0, self.kappa, self.theta, self.xi, self.rho = params\n", + " model_prices = []\n", + " for K, T in zip(strikes, maturities):\n", + " self.K = K\n", + " self.T = T\n", + " 
model_prices.append(self.price_option(10000, 100))\n", + "\n", + " return np.sum((np.array(market_prices) - np.array(model_prices))**2)\n", + " \n", + " initial_guess = [self.v0, self.kappa, self.theta, self.xi, self.rho]\n", + " bounds = [(0.01, 1.0), (0.01, 5.0), (0.01, 1.0), (0.01, 1.0), (-1.0, 1.0)]\n", + " result = minimize(objective_function, initial_guess, bounds=bounds)\n", + " self.v0, self.kappa, self.theta, self.xi, self.rho = result.x\n", + "\n", + "\n", + "class KnockoutOption:\n", + " def __init__(self, model, S0, K, T, r, barrier):\n", + " self.model = model\n", + " self.S0 = S0\n", + " self.K = K\n", + " self.T = T\n", + " self.r = r\n", + " self.barrier = barrier\n", + "\n", + " def price_knockout_option(self, N, M):\n", + " dt = self.T / M\n", + " price_paths = np.zeros((N, M + 1))\n", + " vol_paths = np.zeros((N, M + 1)) if isinstance(self.model, StochasticVolatilityModel) else None\n", + " price_paths[:, 0] = self.S0\n", + " if vol_paths is not None:\n", + " vol_paths[:, 0] = self.model.v0\n", + " \n", + " for t in range(1, M + 1):\n", + " Z1 = np.random.standard_normal(N)\n", + " if vol_paths is None:\n", + " # Black-Scholes Model\n", + " price_paths[:, t] = price_paths[:, t - 1] * np.exp(\n", + " (self.r - 0.5 * self.model.sigma**2) * dt + self.model.sigma * np.sqrt(dt) * Z1\n", + " )\n", + " else:\n", + " # Stochastic Volatility Model\n", + " Z2 = np.random.standard_normal(N)\n", + " W1 = Z1\n", + " W2 = self.model.rho * Z1 + np.sqrt(1 - self.model.rho**2) * Z2\n", + " vol_paths[:, t] = np.abs(vol_paths[:, t - 1] + self.model.kappa * (self.model.theta - vol_paths[:, t - 1]) * dt + self.model.xi * np.sqrt(vol_paths[:, t - 1] * dt) * W1)\n", + " price_paths[:, t] = price_paths[:, t - 1] * np.exp(\n", + " (self.r - 0.5 * vol_paths[:, t - 1]) * dt + np.sqrt(vol_paths[:, t - 1] * dt) * W2\n", + " )\n", + " \n", + " # Knockout condition\n", + " price_paths[:, t][price_paths[:, t] >= self.barrier] = 0\n", + " payoffs = np.maximum(price_paths[:, -1] 
- self.K, 0)\n", + " return np.exp(-self.r * self.T) * np.mean(payoffs)" + ], + "execution_count": null, + "outputs": [], + "id": "a9d7f832" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Data Preparation" + ], + "id": "14bcdbb9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Synthetic data generation" + ], + "id": "f655dc9c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def generate_synthetic_market_data(model, strikes, maturities):\n", + " market_prices = []\n", + " market_data = []\n", + " for K, T in zip(strikes, maturities):\n", + " model.K = K\n", + " model.T = T\n", + " # Price once so the price list and the dataset rows agree\n", + " price = model.price_option(10000, 100)\n", + " market_prices.append(price)\n", + " market_data.append({\"strike\": K, \"option_price\": price})\n", + " return market_prices, market_data\n" + ], + "execution_count": 34, + "outputs": [], + "id": "42cb9070" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "N = 10000\n", + "M = 100\n", + "\n", + "# Parameters for synthetic data\n", + "S0 = 100\n", + "K = 100\n", + "T = 1\n", + "r = 0.05\n", + "# Black-Scholes\n", + "true_sigma = 0.2\n", + "\n", + "# Stochastic Volatility\n", + "true_v0 = 0.2\n", + "true_kappa = 2.0\n", + "true_theta = 0.2\n", + "true_xi = 0.1\n", + "true_rho = -0.5\n", + "\n", + "# Synthetic data generation parameters\n", + "strikes = list(np.linspace(75, 130, 25))\n", + "maturities = list(np.linspace(0.2, 3.0, 25))\n", + "\n", + "# Generate synthetic market data using the true parameters\n", + "bs_model = BlackScholesModel(S0, K, T, r, true_sigma)\n", + "bs_market_prices, bs_market_data = generate_synthetic_market_data(bs_model, strikes, maturities)\n", + "\n", + "\n", + "sv_model = StochasticVolatilityModel(S0, K, T, r, true_v0, true_kappa, true_theta, true_xi, true_rho)\n", + "sv_market_prices, sv_market_data = generate_synthetic_market_data(sv_model, strikes, 
maturities)\n" + ], + "execution_count": null, + "outputs": [], + "id": "2854fbe3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module." + ], + "id": "b54c4950" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "bs_market_data_df = pd.DataFrame(bs_market_data)\n", + "vm_bs_market_data = vm.init_dataset(\n", + " dataset=bs_market_data_df,\n", + " input_id=\"bs_market_data\",\n", + ")\n", + "\n", + "sv_market_data_df = pd.DataFrame(sv_market_data)\n", + "vm_sv_market_data = vm.init_dataset(\n", + " dataset=sv_market_data_df,\n", + " input_id=\"sv_market_data\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "7f3498dd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Data Quality\n", + "Let's check the quality of the data using outlier and missing-value tests." + ], + "id": "7b36b59c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3_1__'></a>\n", + "\n", + "#### Outliers detection using IQR method\n", + "Let's visualize the distribution of outliers in the `option_price` feature using the Interquartile Range (IQR) method." 
+ ], + "id": "671330b1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IQROutliersBarPlot:BlackScholes\",\n", + " inputs={\n", + " \"dataset\": vm_bs_market_data,\n", + " },\n", + " title=\"Outliers detection using IQR method for BlackScholes\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "f1c1ab6f" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IQROutliersTable:BlackScholes\",\n", + " inputs={\n", + " \"dataset\": vm_bs_market_data,\n", + " },\n", + " title=\"Outliers table using IQR method for BlackScholes\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "6b5e8654" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IQROutliersBarPlot:StochasticVolatility\",\n", + " inputs={\n", + " \"dataset\": vm_sv_market_data,\n", + " },\n", + " title=\"Outliers detection using IQR method for StochasticVolatility\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "d96f10c7" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IQROutliersTable:StochasticVolatility\",\n", + " inputs={\n", + " \"dataset\": vm_sv_market_data,\n", + " },\n", + " title=\"Outliers table using IQR method for StochasticVolatility\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "758c4c57" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3_2__'></a>\n", + "\n", + "#### Isolation Forest Outliers Test\n", + "Let's detect anomalies in the dataset using the Isolation Forest algorithm, visualized through scatter plots." 
+ ], + "id": "b1430200" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IsolationForestOutliers:BlackScholes\",\n", + " inputs={\n", + " \"dataset\": vm_bs_market_data,\n", + " },\n", + " title=\"Outliers detection using Isolation Forest for BlackScholes\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "9eb91453" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IsolationForestOutliers:StochasticVolatility\",\n", + " inputs={\n", + " \"dataset\": vm_sv_market_data,\n", + " },\n", + " title=\"Outliers detection using Isolation Forest for StochasticVolatility\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "12940f8e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Missing Values Test\n", + "Let's evaluate dataset quality by checking that the missing value ratio across all features does not exceed a set threshold." 
+ ], + "id": "f30e5579" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.MissingValues:BlackScholes\",\n", + " inputs={\n", + " \"dataset\": vm_bs_market_data,\n", + " },\n", + " title=\"Missing Values detection for BlackScholes\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "805ddb1c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "result = run_test(\n", + " \"validmind.data_validation.MissingValues:StochasticVolatility\",\n", + " inputs={\n", + " \"dataset\": vm_sv_market_data,\n", + " },\n", + " title=\"MissingValues detection for StochasticVolatility\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "e69e0039" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### Model Calibration\n", + "* Clearly state the purpose of the calibration process. For example, in the context of an option pricing model, calibration aims to adjust model parameters to fit market data (e.g., market option prices, volatility surfaces).\n", + "* Specify whether the calibration is to historical data, current market data, or a blend of both." 
+ ], + "id": "09628809" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "@vm.test(\"my_custom_tests.SyntheticDataCalibrationTest\")\n", + "def generate_synthetic_data_summary(option_pricing_model, strikes, maturities, synthetic_prices):\n", + " \"\"\"\n", + " This function will use synthetic prices to calibrate each model\n", + " and then generate derived prices based on the calibrated parameters.\n", + " It will output a DataFrame summarizing the strikes, maturities,\n", + " synthetic and derived prices, and the model parameters.\n", + "\n", + " \"\"\"\n", + " derived_prices = []\n", + " for K, T in zip(strikes, maturities):\n", + " option_pricing_model.K = K\n", + " option_pricing_model.T = T\n", + " derived_prices.append(option_pricing_model.price_option(10000, 100))\n", + " \n", + " model_type = type(option_pricing_model).__name__\n", + " data = {\n", + " \"Strike\": strikes,\n", + " \"Maturity\": maturities,\n", + " \"Synthetic_Price\": synthetic_prices,\n", + " \"Derived_Price\": derived_prices,\n", + " \"Model_Type\": model_type,\n", + " \"S0\": [option_pricing_model.S0] * len(strikes),\n", + " \"K\": [option_pricing_model.K] * len(strikes),\n", + " \"T\": [option_pricing_model.T] * len(strikes),\n", + " \"r\": [option_pricing_model.r] * len(strikes)\n", + " }\n", + " \n", + " if model_type == \"BlackScholesModel\":\n", + " data[\"sigma\"] = [option_pricing_model.sigma] * len(strikes)\n", + " elif model_type == \"StochasticVolatilityModel\":\n", + " data[\"v0\"] = [option_pricing_model.v0] * len(strikes)\n", + " data[\"kappa\"] = [option_pricing_model.kappa] * len(strikes)\n", + " data[\"theta\"] = [option_pricing_model.theta] * len(strikes)\n", + " data[\"xi\"] = [option_pricing_model.xi] * len(strikes)\n", + " data[\"rho\"] = [option_pricing_model.rho] * len(strikes)\n", + " \n", + " df = pd.DataFrame(data)\n", + " return df\n" + ], + "execution_count": null, + "outputs": [], + "id": "6802c26e" + }, + { 
+ "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_5__'></a>\n", + "\n", + "### Synthetic Data Calibration Test\n", + "Let's evaluate the accuracy of a stochastic volatility model by comparing synthetic prices with derived prices after model calibration." + ], + "id": "3bf04d21" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.SyntheticDataCalibrationTest\",\n", + " params={\n", + " \"option_pricing_model\": sv_model,\n", + " \"strikes\": strikes,\n", + " \"maturities\": maturities,\n", + " \"synthetic_prices\": sv_market_prices\n", + " },\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "4345cb5c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6__'></a>\n", + "\n", + "### Model Evaluation" + ], + "id": "4d48f107" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6_1__'></a>\n", + "\n", + "#### Benchmark Testing\n", + "* Compare the model’s performance with alternative models or industry-standard models to assess its relative effectiveness.\n", + "* Ensure that the model is competitive in pricing, accuracy, and computational efficiency." 
+ ], + "id": "8ec8b5a3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.BenchmarkTest\")\n", + "def benchmark_test(bs_model, sv_model, strikes, maturities):\n", + " \"\"\"\n", + " Comparison between Black Scholes and stochastic volatility model\n", + "\n", + " \"\"\"\n", + " bs_model_type = type(bs_model).__name__\n", + " sv_model_type = type(sv_model).__name__\n", + "\n", + " bs_derived_prices = []\n", + " sv_derived_prices = []\n", + " for K in strikes:\n", + " bs_model.K = K\n", + " bs_derived_prices.append(bs_model.price_option(10000, 100))\n", + " sv_model.K = K\n", + " sv_derived_prices.append(sv_model.price_option(10000, 100))\n", + "\n", + " data = {\n", + " \"Strike\": strikes,\n", + " \"Maturities\": [sv_model.T] * len(strikes),\n", + " \"bs_model_price\": bs_derived_prices,\n", + " \"sv_model_price\": sv_derived_prices,\n", + "\n", + " }\n", + " df1 = pd.DataFrame(data)\n", + "\n", + " bs_derived_prices = []\n", + " sv_derived_prices = []\n", + " for T in maturities:\n", + " bs_model.T = T\n", + " bs_derived_prices.append(bs_model.price_option(10000, 100))\n", + " sv_model.T = T\n", + " sv_derived_prices.append(sv_model.price_option(10000, 100))\n", + "\n", + " data = {\n", + " \"Strike\": [sv_model.K] * len(maturities),\n", + " \"Maturities\": maturities,\n", + " \"bs_model_price\": bs_derived_prices,\n", + " \"sv_model_price\": sv_derived_prices,\n", + " }\n", + "\n", + " df2 = pd.DataFrame(data)\n", + "\n", + " return {\"strikes variation benchmarking\": df1}, {\"maturities variation benchmarking\": df2}" + ], + "execution_count": 47, + "outputs": [], + "id": "ac733262" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.BenchmarkTest\",\n", + " params={\n", + " \"sv_model\": sv_model,\n", + " \"bs_model\": bs_model,\n", + " \"strikes\": strikes,\n", + " \"maturities\": maturities,\n", + " },\n", + ")\n", + "result.log()" + ], + "execution_count": 
null, + "outputs": [], + "id": "20de9858" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Surface Volatility Test\n", + "Let's calculate the implied volatility across different strikes and maturities based on market prices." + ], + "id": "d9ad15b8" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "from scipy.optimize import minimize\n", + "import plotly.graph_objects as go\n", + "\n", + "@vm.test(\"my_custom_tests.ImpliedVolSurface\")\n", + "def implied_vol_surface(market_prices, strikes, maturities, S0, r, barrier, N=10000, M=100):\n", + " \"\"\"\n", + " This is a test to compute the implied volatility surface for a given set of market prices,\n", + " strikes, and maturities.\n", + " \"\"\"\n", + " def implied_volatility(market_price, N, M, initial_guess=0.2):\n", + " def objective_function(sigma):\n", + " model.sigma = sigma\n", + " model_price = model.price_option(N, M)\n", + " return (model_price - market_price) ** 2\n", + "\n", + " result = minimize(objective_function, initial_guess, bounds=[(0.01, 1.0)])\n", + " return result.x[0]\n", + " \n", + " implied_vols = np.zeros((len(strikes), len(maturities)))\n", + "\n", + " for i, K in enumerate(strikes):\n", + " for j, T in enumerate(maturities):\n", + " market_price = market_prices[i]\n", + " model = BlackScholesModel(S0, K, T, r, sigma=0.2)\n", + "\n", + " implied_vol = implied_volatility(market_price, N, M)\n", + " implied_vols[i, j] = implied_vol\n", + "\n", + " # Create the 3D surface plot\n", + " X, Y = np.meshgrid(strikes, maturities)\n", + " Z = implied_vols.T # Transpose to match the meshgrid orientation\n", + "\n", + " fig = go.Figure(data=[go.Surface(x=X, y=Y, z=Z)])\n", + " \n", + " # Update the layout\n", + " fig.update_layout(\n", + " title=f'3D Surface Plot of Implied Volatility',\n", + " scene=dict(\n", + " xaxis_title='Strike',\n", + " yaxis_title='Maturity',\n", + " zaxis_title='Implied 
Volatility',\n", + " camera=dict(\n", + " up=dict(x=0, y=0, z=1),\n", + " center=dict(x=0, y=0, z=0),\n", + " eye=dict(x=1.5, y=1.5, z=1.5)\n", + " )\n", + " ),\n", + " width=900,\n", + " height=700,\n", + " margin=dict(l=65, r=50, b=65, t=90)\n", + " )\n", + "\n", + " return fig" + ], + "execution_count": 49, + "outputs": [], + "id": "46e275e3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.ImpliedVolSurface\",\n", + " params={\n", + " \"market_prices\": sv_market_prices,\n", + " \"strikes\": strikes,\n", + " \"maturities\": maturities,\n", + " \"S0\": S0,\n", + " \"r\": r,\n", + " \"barrier\": 120\n", + " }\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "66ca002a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6_2__'></a>\n", + "\n", + "#### Sensitivity Testing" + ], + "id": "a49d8a1e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "@vm.test(\"my_custom_tests.Sensitivity\")\n", + "def sensitivity_test(model_type, S0, T, r, N, M, strike=None, barrier=None, sigma=None, v0=None, kappa=None,theta=None, xi=None, rho=None):\n", + " \"\"\"\n", + " This is sensitivity test\n", + "\"\"\"\n", + " if model_type == 'BS':\n", + " model = BlackScholesModel(S0, strike, T, r, sigma)\n", + " else:\n", + " model = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", + " \n", + " knockout_option = KnockoutOption(model, S0, strike, T, r, barrier)\n", + " price = knockout_option.price_knockout_option(N, M)\n", + "\n", + " return pd.DataFrame({\"Option price\": [price]})" + ], + "execution_count": null, + "outputs": [], + "id": "784a5e7c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Initialise parameters" + ], + "id": "d4be30e6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "strike_range = (min(strikes), max(strikes))\n", + "barrier_range 
= (100, 120)" + ], + "execution_count": null, + "outputs": [], + "id": "46878b84" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Common plot function\n", + "Let's create a line plot using the default result output data and log it by passing the function through the `post_process_fn` parameter in the `run_test()` function." + ], + "id": "205c46ce" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from plotly.express import bar\n", + "from validmind.vm_models.figure import Figure\n", + "from validmind.vm_models.result import TestResult\n", + "import plotly.graph_objects as go\n", + "import random\n", + "\n", + "def process_results(result: TestResult):\n", + "\n", + " # Convert to DataFrame\n", + " df = pd.DataFrame(result.tables[0].data)\n", + " \n", + " # Get the first two column names\n", + " x_col = df.columns[0]\n", + " y_col = df.columns[1]\n", + " \n", + " # Create figure\n", + " fig = go.Figure()\n", + " fig.add_trace(\n", + " go.Scatter(\n", + " x=df[x_col],\n", + " y=df[y_col],\n", + " mode='lines',\n", + " name=y_col # Use y-axis column name as trace name\n", + " )\n", + " )\n", + " \n", + " fig.update_layout(\n", + " xaxis_title=x_col,\n", + " yaxis_title=y_col,\n", + " showlegend=True,\n", + " template=\"plotly_white\"\n", + " )\n", + "\n", + " result.add_figure(\n", + " Figure(\n", + " figure=fig,\n", + " key=\"sensitivity_plot_\" + str(random.randint(0, 1000000)),\n", + " ref_id=result.ref_id,\n", + " )\n", + " )\n", + "\n", + " return result" + ], + "execution_count": null, + "outputs": [], + "id": "d4b9ea2f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Strike Sensitivity Test\n", + "Let's evaluate the sensitivity of a model's output value to changes in the strike price, while keeping other parameters constant.\n", + "This test is crucial for understanding how variations in strike prices affect the valuation of financial derivatives, particularly options." 
+ ], + "id": "528b409c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Sensitivity:S0\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "bb8f1cab" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Sensitivity:ToStrike\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": list(np.linspace(strike_range[0], strike_range[1], 20)),\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "e566a681" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Barrier Sensitivity Test\n", + "Let's evaluate the sensitivity of a model's output to changes in the barrier level of a financial derivative, specifically a barrier option. This test is crucial for understanding how small changes in the barrier can impact the option's valuation, which is essential for risk management and pricing strategies." 
+ ], + "id": "0f288663" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Sensitivity:ToBarrier\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": list(np.linspace(barrier_range[0], barrier_range[1], 20)),\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn=process_results\n", + "\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "95f81283" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6_3__'></a>\n", + "\n", + "#### Greeks\n", + "The Greeks (Delta, Gamma, Theta, Vega, and Rho) are crucial for traders and risk managers as they provide insights into the risk and potential price movements of options and derivatives, allowing for more informed decision-making and risk management strategies." + ], + "id": "3201aa09" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_7__'></a>\n", + "\n", + "### Delta\n", + "Delta measures the sensitivity of the option's price to a change in the price of the underlying asset. It indicates how much the price of an option is expected to move per $1 change in the underlying asset's price."
+ ], + "id": "f31afc73" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.GreeksDelta\")\n", + "def calculate_delta(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", + " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", + " h=0.001): # h is the step size for finite difference\n", + " \"\"\"\n", + " Calculate delta using finite difference method.\n", + " Delta = (V(S0 + h) - V(S0 - h)) / (2h)\n", + " where V is the option price and h is a small increment\n", + " \"\"\"\n", + " # Initialize the model with S0 + h\n", + " if model_type == 'BS':\n", + " model_up = BlackScholesModel(S0 + h, strike, T, r, sigma)\n", + " model_down = BlackScholesModel(S0 - h, strike, T, r, sigma)\n", + " else:\n", + " model_up = StochasticVolatilityModel(S0 + h, strike, T, r, v0, kappa, theta, xi, rho)\n", + " model_down = StochasticVolatilityModel(S0 - h, strike, T, r, v0, kappa, theta, xi, rho)\n", + " \n", + "\n", + " # Calculate option prices for up and down moves\n", + " knockout_up = KnockoutOption(model_up, S0 + h, strike, T, r, barrier)\n", + " knockout_down = KnockoutOption(model_down, S0 - h, strike, T, r, barrier)\n", + " \n", + " price_up = knockout_up.price_knockout_option(N, M)\n", + " price_down = knockout_down.price_knockout_option(N, M)\n", + " \n", + " # Calculate delta using central difference\n", + " delta = (price_up - price_down) / (2 * h)\n", + " df = pd.DataFrame({\"Delta\": [delta], \"Price_Up\": [price_up], \"Price_Down\": [price_down], \"h\": [h]})\n", + " return df\n", + "\n" + ], + "execution_count": 30, + "outputs": [], + "id": "31befc58" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# To analyze delta sensitivity to underlying price changes\n", + "result = run_test(\n", + " \"my_custom_tests.GreeksDelta\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [1000000],\n", + " \"M\": [M],\n", + " \"strike\":[strike_range[0]],\n", + " \"barrier\": 
[barrier_range[0]],\n", + " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " \"h\": [0.001]\n", + " },\n", + "post_process_fn=process_results # Using the plotting function defined earlier\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "a033dd96" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_8__'></a>\n", + "\n", + "### Gamma\n", + "Gamma measures the rate of change of Delta with respect to changes in the underlying asset's price. It indicates the curvature of the option's price relative to the underlying asset's price." + ], + "id": "0826d4dc" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.GreeksGamma\")\n", + "def calculate_gamma(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", + " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", + " h=0.01): # h is the step size for finite difference\n", + " \"\"\"\n", + " Calculate gamma using finite difference method.\n", + " Gamma = (V(S0 + h) - 2V(S0) + V(S0 - h)) / h^2\n", + " where V is the option price and h is a small increment\n", + " \"\"\"\n", + " # Initialize the models with S0 + h, S0, and S0 - h\n", + " if model_type == 'BS':\n", + " model_up = BlackScholesModel(S0 + h, strike, T, r, sigma)\n", + " model_center = BlackScholesModel(S0, strike, T, r, sigma)\n", + " model_down = BlackScholesModel(S0 - h, strike, T, r, sigma)\n", + " else:\n", + " model_up = StochasticVolatilityModel(S0 + h, strike, T, r, v0, kappa, theta, xi, rho)\n", + " model_center = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", + " model_down = StochasticVolatilityModel(S0 - h, strike, T, r, v0, kappa, theta, xi, rho)\n", + " \n", + " # Calculate option prices for up, center, and down moves\n", + "
knockout_up = KnockoutOption(model_up, S0 + h, strike, T, r, barrier)\n", + " knockout_center = KnockoutOption(model_center, S0, strike, T, r, barrier)\n", + " knockout_down = KnockoutOption(model_down, S0 - h, strike, T, r, barrier)\n", + " \n", + " price_up = knockout_up.price_knockout_option(N, M)\n", + " price_center = knockout_center.price_knockout_option(N, M)\n", + " price_down = knockout_down.price_knockout_option(N, M)\n", + " \n", + " # Calculate gamma using second-order central difference\n", + " gamma = (price_up - 2*price_center + price_down) / (h * h)\n", + " \n", + " df = pd.DataFrame({\n", + " \"Gamma\": [gamma], \n", + " \"Price_Up\": [price_up], \n", + " \"Price_Center\": [price_center],\n", + " \"Price_Down\": [price_down], \n", + " \"h\": [h]\n", + " })\n", + " return df\n", + "\n", + "# To analyze gamma sensitivity to underlying price changes\n", + "result = run_test(\n", + " \"my_custom_tests.GreeksGamma\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [1000000],\n", + " \"M\": [M],\n", + " \"strike\":[strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " \"h\": [0.1]\n", + " },\n", + " post_process_fn=process_results # Using the plotting function defined earlier\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "ccf54452" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_9__'></a>\n", + "\n", + "### Theta\n", + "Theta measures the sensitivity of the option's price to the passage of time, also known as time decay. It indicates how much the price of an option is expected to decrease as the option approaches its expiration date."
+ ], + "id": "df0eaa72" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.GreeksTheta\")\n", + "def calculate_theta(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", + " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", + " dt=1/365): # dt is typically one day\n", + " \"\"\"\n", + " Calculate theta using finite difference method.\n", + " Theta = -(V(T + dt) - V(T)) / dt\n", + " where V(T) is the option price for maturity T and dt is a small time increment (typically 1 day)\n", + " \"\"\"\n", + " # Initialize the models with T and T + dt\n", + " if model_type == 'BS':\n", + " model_current = BlackScholesModel(S0, strike, T, r, sigma)\n", + " model_future = BlackScholesModel(S0, strike, T + dt, r, sigma)\n", + " else:\n", + " model_current = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", + " model_future = StochasticVolatilityModel(S0, strike, T + dt, r, v0, kappa, theta, xi, rho)\n", + " \n", + " # Calculate option prices for maturities T and T + dt\n", + " knockout_current = KnockoutOption(model_current, S0, strike, T, r, barrier)\n", + " knockout_future = KnockoutOption(model_future, S0, strike, T + dt, r, barrier)\n", + " \n", + " price_current = knockout_current.price_knockout_option(N, M)\n", + " price_future = knockout_future.price_knockout_option(N, M)\n", + " \n", + " # Calculate theta using forward difference in the maturity T\n", + " # Note: We divide by dt and multiply by -1 since theta is the negative of the price change per unit increase in T\n", + " theta_value = -1 * (price_future - price_current) / dt\n", + " \n", + " df = pd.DataFrame({\n", + " \"Theta\": [theta_value], \n", + " \"Price_Current\": [price_current],\n", + " \"Price_Future\": [price_future],\n", + " \"dt\": [dt]\n", + " })\n", + " return df\n", + "\n", + "# Example usage to analyze theta sensitivity across different underlying prices\n", + "result = run_test(\n", + " \"my_custom_tests.GreeksTheta\",\n", + " param_grid={\n", + "
\"model_type\": ['SV'],\n", + " \"N\": [1000000],\n", + " \"M\": [M],\n", + " \"strike\":[strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " \"dt\": [1/365] # One day time step\n", + " },\n", + " post_process_fn=process_results # Using the plotting function defined earlier\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "0e9810b1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_10__'></a>\n", + "\n", + "### Vega\n", + "Vega measures the sensitivity of the option's price to changes in the volatility of the underlying asset. It indicates how much the price of an option is expected to change with a 1% change in the underlying asset's volatility." + ], + "id": "28c60e1d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.GreeksVega\")\n", + "def calculate_vega(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", + " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", + " h=0.001): # h is the step size for finite difference\n", + " \"\"\"\n", + " Calculate vega using finite difference method.\n", + " For Black-Scholes: Vega = (V(σ + h) - V(σ - h)) / (2h)\n", + " For Stochastic Vol: Vega = (V(v0 + h) - V(v0 - h)) / (2h)\n", + " where V is the option price and h is a small increment in volatility\n", + " \"\"\"\n", + " if model_type == 'BS':\n", + " # For Black-Scholes, perturb sigma\n", + " model_up = BlackScholesModel(S0, strike, T, r, sigma + h)\n", + " model_down = BlackScholesModel(S0, strike, T, r, sigma - h)\n", + " else:\n", + " # For Stochastic Volatility, perturb v0\n", + " model_up = StochasticVolatilityModel(S0, strike, T, r, v0 + h, kappa, theta, xi, rho)\n", + " model_down =
StochasticVolatilityModel(S0, strike, T, r, v0 - h, kappa, theta, xi, rho)\n", + " \n", + " # Calculate option prices for up and down moves in volatility\n", + " knockout_up = KnockoutOption(model_up, S0, strike, T, r, barrier)\n", + " knockout_down = KnockoutOption(model_down, S0, strike, T, r, barrier)\n", + " \n", + " price_up = knockout_up.price_knockout_option(N, M)\n", + " price_down = knockout_down.price_knockout_option(N, M)\n", + " \n", + " # Calculate vega using central difference\n", + " vega = (price_up - price_down) / (2 * h)\n", + " \n", + " df = pd.DataFrame({\n", + " \"Vega\": [vega], \n", + " \"Price_Up\": [price_up], \n", + " \"Price_Down\": [price_down], \n", + " \"h\": [h]\n", + " })\n", + " return df\n", + "\n", + "# Example usage to analyze vega sensitivity across different underlying prices\n", + "result = run_test(\n", + " \"my_custom_tests.GreeksVega\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [1000000],\n", + " \"M\": [M],\n", + " \"strike\":[strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " \"h\": [0.0001] # Small step size for better accuracy\n", + " },\n", + " post_process_fn=process_results # Using the plotting function defined earlier\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "1dbc6632" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_11__'></a>\n", + "\n", + "### Rho\n", + "Rho measures the sensitivity of the option's price to changes in the interest rate. It indicates how much the price of an option is expected to change with a 1% change in interest rates."
+ ], + "id": "1ec51eba" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.GreeksRho\")\n", + "def calculate_rho(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", + " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", + " h=0.0001): # h is the step size for finite difference\n", + " \"\"\"\n", + " Calculate rho using finite difference method.\n", + " Rho = (V(r + h) - V(r - h)) / (2h)\n", + " where V is the option price and h is a small increment in interest rate\n", + " \"\"\"\n", + " # Initialize the models with r + h and r - h\n", + " if model_type == 'BS':\n", + " model_up = BlackScholesModel(S0, strike, T, r + h, sigma)\n", + " model_down = BlackScholesModel(S0, strike, T, r - h, sigma)\n", + " else:\n", + " model_up = StochasticVolatilityModel(S0, strike, T, r + h, v0, kappa, theta, xi, rho)\n", + " model_down = StochasticVolatilityModel(S0, strike, T, r - h, v0, kappa, theta, xi, rho)\n", + " \n", + " # Calculate option prices for up and down moves in interest rate\n", + " knockout_up = KnockoutOption(model_up, S0, strike, T, r + h, barrier)\n", + " knockout_down = KnockoutOption(model_down, S0, strike, T, r - h, barrier)\n", + " \n", + " price_up = knockout_up.price_knockout_option(N, M)\n", + " price_down = knockout_down.price_knockout_option(N, M)\n", + " \n", + " # Calculate rho using central difference\n", + " rho_value = (price_up - price_down) / (2 * h)\n", + " \n", + " df = pd.DataFrame({\n", + " \"Rho\": [rho_value], \n", + " \"Price_Up\": [price_up], \n", + " \"Price_Down\": [price_down], \n", + " \"h\": [h]\n", + " })\n", + " return df\n", + "\n", + "# Example usage to analyze rho sensitivity across different underlying prices\n", + "result = run_test(\n", + " \"my_custom_tests.GreeksRho\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [1000000],\n", + " \"M\": [M],\n", + " \"strike\":[strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " 
\"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " \"h\": [0.0001] # Small step size for better accuracy\n", + " },\n", + " post_process_fn=process_results # Using the plotting function defined earlier\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "2f497b5f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_11_1__'></a>\n", + "\n", + "#### Stress Testing" + ], + "id": "0cdd1b1b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.Stressing\")\n", + "def sensitivity_test(model_type, S0, T, r, N, M, strike=None, barrier=None, sigma=None, v0=None, kappa=None,theta=None, xi=None, rho=None):\n", + " \"\"\"\n", + " This is stress test\n", + " \"\"\"\n", + " if model_type == 'BS':\n", + " model = BlackScholesModel(S0, strike, T, r, sigma)\n", + " else:\n", + " model = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", + " \n", + " knockout_option = KnockoutOption(model, S0, strike, T, r, barrier)\n", + " price = knockout_option.price_knockout_option(N, M)\n", + "\n", + " return pd.DataFrame({\"Option price\": [price]})" + ], + "execution_count": null, + "outputs": [], + "id": "c98ff396" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Rho (correlation) and Theta (long term vol) stress test\n", + "First, we create a surface plot to visualize the option price with respect to two variables." 
+ ], + "id": "b6f0a179" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def two_parameters_stress_surface_plot(result: TestResult):\n", + " import plotly.graph_objects as go\n", + " import numpy as np\n", + " import pandas as pd\n", + " # Convert to DataFrame\n", + " data = pd.DataFrame(result.tables[0].data)\n", + " \n", + " # Get column names (assuming the first two columns are x and y, and the third is z)\n", + " z_col = data.columns[2]\n", + " x_col = data.columns[0]\n", + " y_col = data.columns[1]\n", + " \n", + " # Get unique values for x and y\n", + " x_unique = np.sort(data[x_col].unique())\n", + " y_unique = np.sort(data[y_col].unique())\n", + " \n", + " # Create meshgrid\n", + " X, Y = np.meshgrid(x_unique, y_unique)\n", + " \n", + " # Create Z matrix (float dtype so prices are not truncated on integer grids)\n", + " Z = np.zeros_like(X, dtype=float)\n", + " for i, x_val in enumerate(x_unique):\n", + " for j, y_val in enumerate(y_unique):\n", + " mask = (data[x_col] == x_val) & (data[y_col] == y_val)\n", + " if mask.any():\n", + " Z[j, i] = data.loc[mask, z_col].iloc[0]\n", + " \n", + " # Create the 3D surface plot\n", + " fig = go.Figure(data=[go.Surface(x=X, y=Y, z=Z)])\n", + " \n", + " # Update the layout\n", + " fig.update_layout(\n", + " title=f'3D Surface Plot of {z_col}',\n", + " scene=dict(\n", + " xaxis_title=x_col,\n", + " yaxis_title=y_col,\n", + " zaxis_title=z_col,\n", + " camera=dict(\n", + " up=dict(x=0, y=0, z=1),\n", + " center=dict(x=0, y=0, z=0),\n", + " eye=dict(x=1.5, y=1.5, z=1.5)\n", + " )\n", + " ),\n", + " width=900,\n", + " height=700,\n", + " margin=dict(l=65, r=50, b=65, t=90)\n", + " )\n", + "\n", + " result.add_figure(\n", + " Figure(\n", + " figure=fig,\n", + " key=\"sensitivity_plot_\" + str(random.randint(0, 1000000)),\n", + " ref_id=result.ref_id,\n", + " )\n", + " )\n", + "\n", + " return result" + ], + "execution_count": null, + "outputs": [], + "id": "b408de0f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's evaluate the sensitivity of a model's output
to changes in the correlation parameter (rho) and the long-term variance parameter (theta) within a stochastic volatility framework.\n", + "\n", + "This test is useful for understanding how variations in these parameters affect the model's valuation, which is crucial for risk management and model validation." + ], + "id": "87289ee6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "\n", + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheRhoAndThetaParameters\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": list(np.linspace(0,0.8, 10)),\n", + " \"xi\": [0.1],\n", + " \"rho\": list(np.linspace(-1,0.8, 10)),\n", + " },\n", + " post_process_fn=two_parameters_stress_surface_plot\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "5c0ec52d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Rho (correlation) and Xi (vol of vol) stress test" + ], + "id": "44be4c61" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "\n", + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheRhoAndXiParameters\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": list(np.linspace(0,0.8, 10)),\n", + " \"rho\": list(np.linspace(-1,0.8, 10)),\n", + " },\n", + " post_process_fn=two_parameters_stress_surface_plot\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "e0a2996e" + }, + { + "cell_type": "markdown", + 
"metadata": {}, + "source": [ + "##### Sigma stress test\n", + "Let's evaluate the sensitivity of a model's output to changes in the volatility parameter, sigma. This test is crucial for understanding how variations in market volatility impact the model's valuation of financial instruments, particularly options.\n", + "\n", + "This test is useful for risk management and model validation, as it helps identify the robustness of the model under different market conditions. By analyzing the changes in the model's output as sigma varies, stakeholders can assess the model's stability and reliability." + ], + "id": "5fed568d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheSigmaParameter\",\n", + " param_grid={\n", + " \"model_type\": ['BS'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"sigma\": list(np.linspace(0.2, 0.8, 10)),\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "d49e2e37" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress kappa\n", + "Let's evaluate the sensitivity of a model's output to changes in the kappa parameter, which is the mean-reversion rate in stochastic volatility models."
+ ], + "id": "4e7a1f00" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheKappaParameter\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": list(np.linspace(0, 8, 10)),\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "e995f6ae" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress theta\n", + "Stress Theta evaluates the sensitivity of a model's output to changes in the parameter theta, which represents the long-term variance in a stochastic volatility model." + ], + "id": "40d1c9e2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheThetaParameter\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": list(np.linspace(0, 0.8, 10)),\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "7e371aee" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress xi\n", + "Stress Xi evaluates the sensitivity of a model's output to changes in the parameter xi, which represents the volatility of volatility in a stochastic volatility model.
This test is crucial for understanding how variations in xi impact the model's valuation, particularly in financial derivatives pricing." + ], + "id": "e20d074f" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheXiParameter\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": list(np.linspace(0.05, 0.95, 10)),\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "9c545090" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress rho\n", + "Stress rho test evaluates the sensitivity of a model's output to changes in the correlation parameter, rho, within a stochastic volatility (SV) model framework. This test is crucial for understanding how variations in rho, which represents the correlation between the asset price and its volatility, impact the model's valuation output." 
+ ], + "id": "f0360e20" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheRhoParameter\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": list(np.linspace(-1.0, 1.0, 20)),\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "e2c5dfb1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc5_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc5_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ], + "id": "61d4e596" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-a23adf093a60485ea005cf8fc18545a5" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-1QuffXMV-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.14" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb index e0d7c1a11..2057d819c 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb +++ 
b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb @@ -1,1354 +1,1360 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "1e2a4689", - "metadata": {}, - "source": [ - "# Quickstart for Heston option pricing model using QuantLib\n", - "\n", - "Welcome! Let's get you started with the basic process of documenting models with ValidMind.\n", - "\n", - "The Heston option pricing model is a popular stochastic volatility model used to price options. Developed by Steven Heston in 1993, the model assumes that the asset's volatility follows a mean-reverting square-root process, allowing it to capture the empirical observation of volatility \"clustering\" in financial markets. This model is particularly useful for assets where volatility is not constant, making it a favored approach in quantitative finance for pricing complex derivatives.\n", - "\n", - "Here’s an overview of the Heston model as implemented in QuantLib, a powerful library for quantitative finance:\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Model Assumptions and Characteristics\n", - "1. **Stochastic Volatility**: The volatility is modeled as a stochastic process, following a mean-reverting square-root process (Cox-Ingersoll-Ross process).\n", - "2. **Correlated Asset and Volatility Processes**: The asset price and volatility are assumed to be correlated, allowing the model to capture the \"smile\" effect observed in implied volatilities.\n", - "3. 
**Risk-Neutral Dynamics**: The Heston model is typically calibrated under a risk-neutral measure, which allows for direct application to pricing.\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### Heston Model Parameters\n", - "The model is governed by a set of key parameters:\n", - "- **S0**: Initial stock price\n", - "- **v0**: Initial variance of the asset price\n", - "- **kappa**: Speed of mean reversion of the variance\n", - "- **theta**: Long-term mean level of variance\n", - "- **sigma**: Volatility of volatility (vol of vol)\n", - "- **rho**: Correlation between the asset price and variance processes\n", - "\n", - "The dynamics of the asset price \\( S \\) and the variance \\( v \\) under the Heston model are given by:\n", - "\n", - "$$\n", - "dS_t = r S_t \\, dt + \\sqrt{v_t} S_t \\, dW^S_t\n", - "$$\n", - "\n", - "$$\n", - "dv_t = \\kappa (\\theta - v_t) \\, dt + \\sigma \\sqrt{v_t} \\, dW^v_t\n", - "$$\n", - "\n", - "where \\( $dW^S$ \\) and \\( $dW^v$ \\) are Wiener processes with correlation \\( $\\rho$ \\).\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Advantages and Limitations\n", - "- **Advantages**:\n", - " - Ability to capture volatility smiles and skews.\n", - " - More realistic pricing for options on assets with stochastic volatility.\n", - "- **Limitations**:\n", - " - Calibration can be complex due to the number of parameters.\n", - " - Computationally intensive compared to simpler models like Black-Scholes.\n", - "\n", - "This setup provides a robust framework for pricing and analyzing options with stochastic volatility dynamics. QuantLib’s implementation makes it easy to experiment with different parameter configurations and observe their effects on pricing.\n", - "\n", - "You will learn how to initialize the ValidMind Library, develop a option pricing model, and then write custom tests that can be used for sensitivity and stress testing to quickly generate documentation about model." 
- ] - }, - { - "cell_type": "markdown", - "id": "69ec219a", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - " - [Model Assumptions and Characteristics](#toc1_1__) \n", - " - [Heston Model Parameters](#toc1_2__) \n", - " - [Advantages and Limitations](#toc1_3__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Install the ValidMind Library](#toc3_1__) \n", - " - [Initialize the ValidMind Library](#toc3_2__) \n", - " - [Register sample model](#toc3_2_1__) \n", - " - [Apply documentation template](#toc3_2_2__) \n", - " - [Get your code snippet](#toc3_2_3__) \n", - " - [Initialize the Python environment](#toc3_3__) \n", - " - [Preview the documentation template](#toc3_4__) \n", - "- [Data Preparation](#toc4__) \n", - " - [Helper functions](#toc4_1_1__) \n", - " - [Market Data Quality and Availability](#toc4_2__) \n", - " - [Initialize the ValidMind datasets](#toc4_3__) \n", - " - [Data Quality](#toc4_4__) \n", - " - [Isolation Forest Outliers Test](#toc4_4_1__) \n", - " - [Model parameters](#toc4_4_2__) \n", - "- [Model development - Heston Option price](#toc5__) \n", - " - [Model Calibration](#toc5_1__) \n", - " - [Model Evaluation](#toc5_2__) \n", - " - [Benchmark Testing](#toc5_2_1__) \n", - " - [Sensitivity Testing](#toc5_2_2__) \n", - " - [Stress Testing](#toc5_2_3__) \n", - "- [Next steps](#toc6__) \n", - " - [Work with your model documentation](#toc6_1__) \n", - " - [Discover more learning resources](#toc6_2__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "b9fb5d17", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. 
These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "f2dccf35", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "5a5ce085", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "409352bf", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "65e870b2", - "metadata": {}, - "source": [ - "To install the QuantLib library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3a34debf", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q QuantLib" - ] - }, - { - "cell_type": "markdown", - "id": "fb30ae07", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "c6f87017", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "cbb2e2c9", - "metadata": {}, - "source": [ - "<a id='toc3_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Capital Markets`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "41c4edca", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", - "<br></br>\n", - "Your organization administrators may need to add it to your template library:\n", - "<ul>\n", - "<li><a href=\"capital_markets_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", - "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", - "</ul>\n", - "</div>" - ] - }, - { - "cell_type": "markdown", - "id": "2012eb82", - "metadata": {}, - "source": [ - "<a id='toc3_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0cd3f67e", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "id": "6d944cc9", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f8cf2746", - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "\n", - "import pandas as pd\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "from scipy.optimize import minimize\n", - "import yfinance as yf\n", - "import QuantLib as ql\n", - "from validmind.tests import run_test" - ] - }, - { - "cell_type": "markdown", - "id": "bc431ee0", - "metadata": {}, - "source": [ - "<a id='toc3_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7e844028", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "0c0ee8b9", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Data Preparation" - ] - }, - { - "cell_type": "markdown", - "id": "5a4d2c36", - "metadata": {}, - "source": [ - "### Market Data Sources\n", - "\n", - "<a id='toc4_1_1__'></a>\n", - "\n", - "#### Helper functions\n", - "Let's define helper function retrieve to option data from Yahoo Finance." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b96a500f", - "metadata": {}, - "outputs": [], - "source": [ - "def get_market_data(ticker, expiration_date_str):\n", - " \"\"\"\n", - " Fetch option market data from Yahoo Finance for the given ticker and expiration date.\n", - " Returns a list of tuples: (strike, maturity, option_price).\n", - " \"\"\"\n", - " # Create a Ticker object for the specified stock\n", - " stock = yf.Ticker(ticker)\n", - "\n", - " # Get all available expiration dates for options\n", - " option_dates = stock.options\n", - "\n", - " # Check if the requested expiration date is available\n", - " if expiration_date_str not in option_dates:\n", - " raise ValueError(f\"Expiration date {expiration_date_str} not available for {ticker}. 
Available dates: {option_dates}\")\n", - "\n", - " # Get the option chain for the specified expiration date\n", - " option_chain = stock.option_chain(expiration_date_str)\n", - "\n", - " # Get call options (or you can use puts as well based on your requirement)\n", - " calls = option_chain.calls\n", - "\n", - " # Convert expiration_date_str to QuantLib Date\n", - " expiry_date_parts = list(map(int, expiration_date_str.split('-'))) # Split YYYY-MM-DD\n", - " maturity_date = ql.Date(expiry_date_parts[2], expiry_date_parts[1], expiry_date_parts[0]) # Convert to QuantLib Date\n", - "\n", - " # Create a list to store strike prices, maturity dates, and option prices\n", - " market_data = []\n", - " for index, row in calls.iterrows():\n", - " strike = row['strike']\n", - " option_price = row['lastPrice'] # You can also use 'bid', 'ask', 'mid', etc.\n", - " market_data.append((strike, maturity_date, option_price))\n", - " df = pd.DataFrame(market_data, columns = ['strike', 'maturity_date', 'option_price'])\n", - " return df" - ] - }, - { - "cell_type": "markdown", - "id": "c7769b73", - "metadata": {}, - "source": [ - "Let's define helper function retrieve to stock data from Yahoo Finance. This helper function to calculate spot price, dividend yield, volatility and risk free rate using the underline stock data." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dc44c448", - "metadata": {}, - "outputs": [], - "source": [ - "def get_option_parameters(ticker):\n", - " # Fetch historical data for the stock\n", - " stock_data = yf.Ticker(ticker)\n", - " \n", - " # Get the current spot price\n", - " spot_price = stock_data.history(period=\"1d\")['Close'].iloc[-1]\n", - " \n", - " # Get dividend yield\n", - " dividend_rate = stock_data.dividends.mean() / spot_price if not stock_data.dividends.empty else 0.0\n", - " \n", - " # Estimate volatility (standard deviation of log returns)\n", - " hist_data = stock_data.history(period=\"1y\")['Close']\n", - " log_returns = np.log(hist_data / hist_data.shift(1)).dropna()\n", - " volatility = np.std(log_returns) * np.sqrt(252) # Annualized volatility\n", - " \n", - " # Assume a risk-free rate from some known data (can be fetched from market data, here we use 0.001)\n", - " risk_free_rate = 0.001\n", - " \n", - " # Return the calculated parameters\n", - " return {\n", - " \"spot_price\": spot_price,\n", - " \"volatility\": volatility,\n", - " \"dividend_rate\": dividend_rate,\n", - " \"risk_free_rate\": risk_free_rate\n", - " }" - ] - }, - { - "cell_type": "markdown", - "id": "c7b739d3", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Market Data Quality and Availability\n", - "Next, let's specify ticker and expiration date to get market data." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "50225fde", - "metadata": {}, - "outputs": [], - "source": [ - "ticker = \"MSFT\"\n", - "expiration_date = \"2024-12-13\" # Example expiration date in 'YYYY-MM-DD' form\n", - "\n", - "market_data = get_market_data(ticker=ticker, expiration_date_str=expiration_date)" - ] - }, - { - "cell_type": "markdown", - "id": "c539b95e", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "113f9c17", - "metadata": {}, - "outputs": [], - "source": [ - "vm_market_data = vm.init_dataset(\n", - " dataset=market_data,\n", - " input_id=\"market_data\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "185beb24", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### Data Quality\n", - "Let's check quality of the data using outliers and missing data tests." - ] - }, - { - "cell_type": "markdown", - "id": "7f14464c", - "metadata": {}, - "source": [ - "<a id='toc4_4_1__'></a>\n", - "\n", - "#### Isolation Forest Outliers Test\n", - "Let's detects anomalies in the dataset using the Isolation Forest algorithm, visualized through scatter plots." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "56c919ec", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IsolationForestOutliers\",\n", - " inputs={\n", - " \"dataset\": vm_market_data,\n", - " },\n", - " title=\"Outliers detection using Isolation Forest\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "e4d0e5ca", - "metadata": {}, - "source": [ - "##### Missing Values Test\n", - "Let's evaluates dataset quality by ensuring the missing value ratio across all features does not exceed a set threshold." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e95c825f", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.MissingValues\",\n", - " inputs={\n", - " \"dataset\": vm_market_data,\n", - " },\n", - " title=\"Missing Values detection\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "829403a3", - "metadata": {}, - "source": [ - "<a id='toc4_4_2__'></a>\n", - "\n", - "#### Model parameters\n", - "Let's calculate the model parameters using from stock data " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "25936449", - "metadata": {}, - "outputs": [], - "source": [ - "option_params = get_option_parameters(ticker=ticker)" - ] - }, - { - "cell_type": "markdown", - "id": "0a0948b6", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Model development - Heston Option price" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e15b8221", - "metadata": {}, - "outputs": [], - "source": [ - "class HestonModel:\n", - "\n", - " def __init__(self, ticker, expiration_date_str, calculation_date, spot_price, dividend_rate, risk_free_rate):\n", - " self.ticker = ticker\n", - " self.expiration_date_str = expiration_date_str,\n", - " self.calculation_date = calculation_date\n", - " self.spot_price = spot_price\n", - " self.dividend_rate = 
dividend_rate\n", - " self.risk_free_rate = risk_free_rate\n", - " \n", - " def predict_option_price(self, strike, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", - " # Set the evaluation date\n", - " ql.Settings.instance().evaluationDate = self.calculation_date\n", - "\n", - " # Construct the European Option\n", - " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", - " exercise = ql.EuropeanExercise(maturity_date)\n", - " european_option = ql.VanillaOption(payoff, exercise)\n", - "\n", - " # Yield term structures for risk-free rate and dividend\n", - " riskFreeTS = ql.YieldTermStructureHandle(ql.FlatForward(calculation_date, self.risk_free_rate, ql.Actual365Fixed()))\n", - " dividendTS = ql.YieldTermStructureHandle(ql.FlatForward(calculation_date, self.dividend_rate, ql.Actual365Fixed()))\n", - "\n", - " # Initial stock price\n", - " initialValue = ql.QuoteHandle(ql.SimpleQuote(spot_price))\n", - "\n", - " # Heston process parameters\n", - " heston_process = ql.HestonProcess(riskFreeTS, dividendTS, initialValue, v0, kappa, theta, sigma, rho)\n", - " hestonModel = ql.HestonModel(heston_process)\n", - "\n", - " # Use the Heston analytic engine\n", - " engine = ql.AnalyticHestonEngine(hestonModel)\n", - " european_option.setPricingEngine(engine)\n", - "\n", - " # Calculate the Heston model price\n", - " h_price = european_option.NPV()\n", - "\n", - " return h_price\n", - "\n", - " def predict_american_option_price(self, strike, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", - " # Set the evaluation date\n", - " ql.Settings.instance().evaluationDate = self.calculation_date\n", - "\n", - " # Construct the American Option\n", - " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", - " exercise = ql.AmericanExercise(self.calculation_date, maturity_date)\n", - " american_option = ql.VanillaOption(payoff, exercise)\n", - "\n", - " # Yield term structures for risk-free rate and 
dividend\n", - " riskFreeTS = ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.risk_free_rate, ql.Actual365Fixed()))\n", - " dividendTS = ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.dividend_rate, ql.Actual365Fixed()))\n", - "\n", - " # Initial stock price\n", - " initialValue = ql.QuoteHandle(ql.SimpleQuote(spot_price))\n", - "\n", - " # Heston process parameters\n", - " heston_process = ql.HestonProcess(riskFreeTS, dividendTS, initialValue, v0, kappa, theta, sigma, rho)\n", - " heston_model = ql.HestonModel(heston_process)\n", - "\n", - "\n", - " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", - " exercise = ql.AmericanExercise(self.calculation_date, maturity_date)\n", - " american_option = ql.VanillaOption(payoff, exercise)\n", - " heston_fd_engine = ql.FdHestonVanillaEngine(heston_model)\n", - " american_option.setPricingEngine(heston_fd_engine)\n", - " option_price = american_option.NPV()\n", - "\n", - " return option_price\n", - "\n", - " def objective_function(self, params, market_data, spot_price, dividend_rate, risk_free_rate):\n", - " v0, theta, kappa, sigma, rho = params\n", - "\n", - " # Sum of squared differences between market prices and model prices\n", - " error = 0.0\n", - " for i, row in market_data.iterrows():\n", - " model_price = self.predict_option_price(row['strike'], row['maturity_date'], spot_price, \n", - " v0, theta, kappa, sigma, rho)\n", - " error += (model_price - row['option_price']) ** 2\n", - " \n", - " return error\n", - "\n", - " def calibrate_model(self, ticker, expiration_date_str):\n", - " # Get the option market data dynamically from Yahoo Finance\n", - " market_data = get_market_data(ticker, expiration_date_str)\n", - "\n", - " # Initial guesses for Heston parameters\n", - " initial_params = [0.04, 0.04, 0.1, 0.1, -0.75]\n", - "\n", - " # Bounds for the parameters to ensure realistic values\n", - " bounds = [(0.0001, 1.0), # v0\n", - " (0.0001, 1.0), # theta\n", - " 
(0.001, 2.0), # kappa\n", - " (0.001, 1.0), # sigma\n", - " (-0.75, 0.0)] # rho\n", - "\n", - " # Optimize the parameters to minimize the error between model and market prices\n", - " result = minimize(self.objective_function, initial_params, args=(market_data, self.spot_price, self.dividend_rate, self.risk_free_rate),\n", - " bounds=bounds, method='L-BFGS-B')\n", - "\n", - " # Optimized Heston parameters\n", - " v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt = result.x\n", - "\n", - " return v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt\n" - ] - }, - { - "cell_type": "markdown", - "id": "a941aa32", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Model Calibration\n", - "* The calibration process aims to optimize the Heston model parameters (v0, theta, kappa, sigma, rho) by minimizing the difference between model-predicted option prices and observed market prices.\n", - "* In this implementation, the model is calibrated to current market data, specifically using option prices from the selected ticker and expiration date.\n", - "\n", - "Let's specify `calculation_date` and `strike_price` as input parameters for the model to verify its functionality and confirm it operates as expected." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1d61dfca", - "metadata": {}, - "outputs": [], - "source": [ - "calculation_date = ql.Date(26, 11, 2024)\n", - "# Convert expiration date string to QuantLib.Date\n", - "expiry_date_parts = list(map(int, expiration_date.split('-')))\n", - "maturity_date = ql.Date(expiry_date_parts[2], expiry_date_parts[1], expiry_date_parts[0])\n", - "strike_price = 460.0\n", - "\n", - "hm = HestonModel(\n", - " ticker=ticker,\n", - " expiration_date_str= expiration_date,\n", - " calculation_date= calculation_date,\n", - " spot_price= option_params['spot_price'],\n", - " dividend_rate = option_params['dividend_rate'],\n", - " risk_free_rate = option_params['risk_free_rate']\n", - ")\n", - "\n", - "# Let's calibrate model\n", - "v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt = hm.calibrate_model(ticker, expiration_date)\n", - "print(f\"Optimized Heston parameters: v0={v0_opt}, theta={theta_opt}, kappa={kappa_opt}, sigma={sigma_opt}, rho={rho_opt}\")\n", - "\n", - "\n", - "# option price\n", - "h_price = hm.predict_option_price(strike_price, maturity_date, option_params['spot_price'], v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt)\n", - "print(\"The Heston model price for the option is:\", h_price)" - ] - }, - { - "cell_type": "markdown", - "id": "75313272", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Model Evaluation" - ] - }, - { - "cell_type": "markdown", - "id": "2e6471ef", - "metadata": {}, - "source": [ - "<a id='toc5_2_1__'></a>\n", - "\n", - "#### Benchmark Testing\n", - "The benchmark testing framework provides a robust way to validate the Heston model implementation and understand the relationships between European and American option prices under stochastic volatility conditions.\n", - "Let's compares European and American option prices using the Heston model." 
- ] - }, - { - "cell_type": "code", - "execution_count": 15, - "id": "810cf887", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.BenchmarkTest\")\n", - "def benchmark_test(hm_model, strikes, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", - " \"\"\"\n", - " Compares European and American option prices using the Heston model.\n", - "\n", - " This test evaluates the price differences between European and American options\n", - " across multiple strike prices while keeping other parameters constant. The comparison\n", - " helps understand the early exercise premium of American options over their European\n", - " counterparts under stochastic volatility conditions.\n", - "\n", - " Args:\n", - " hm_model: HestonModel instance for option pricing calculations\n", - " strikes (list[float]): List of strike prices to test\n", - " maturity_date (ql.Date): Option expiration date in QuantLib format\n", - " spot_price (float): Current price of the underlying asset\n", - " v0 (float, optional): Initial variance. Defaults to None.\n", - " theta (float, optional): Long-term variance. Defaults to None.\n", - " kappa (float, optional): Mean reversion rate. Defaults to None.\n", - " sigma (float, optional): Volatility of variance. Defaults to None.\n", - " rho (float, optional): Correlation between asset and variance. 
Defaults to None.\n", - "\n", - " Returns:\n", - " dict: Contains a DataFrame with the following columns:\n", - " - Strike: Strike prices tested\n", - " - Maturity date: Expiration date for all options\n", - " - Spot price: Current underlying price\n", - " - european model price: Prices for European options\n", - " - american model price: Prices for American options\n", - "\"\"\"\n", - " american_derived_prices = []\n", - " european_derived_prices = []\n", - " for K in strikes:\n", - " european_derived_prices.append(hm_model.predict_option_price(K, maturity_date, spot_price, v0, theta, kappa, sigma, rho))\n", - " american_derived_prices.append(hm_model.predict_american_option_price(K, maturity_date, spot_price, v0, theta, kappa, sigma, rho))\n", - "\n", - " data = {\n", - " \"Strike\": strikes,\n", - " \"Maturity date\": [maturity_date] * len(strikes),\n", - " \"Spot price\": [spot_price] * len(strikes),\n", - " \"european model price\": european_derived_prices,\n", - " \"american model price\": american_derived_prices,\n", - "\n", - " }\n", - " df1 = pd.DataFrame(data)\n", - " return {\"strikes variation benchmarking\": df1}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3fdd6705", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.BenchmarkTest\",\n", - " params={\n", - " \"hm_model\": hm,\n", - " \"strikes\": [400, 425, 460, 495, 520],\n", - " \"maturity_date\": maturity_date,\n", - " \"spot_price\": option_params['spot_price'],\n", - " \"v0\":v0_opt,\n", - " \"theta\": theta_opt,\n", - " \"kappa\":kappa_opt ,\n", - " \"sigma\": sigma_opt,\n", - " \"rho\":rho_opt\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "e359b503", - "metadata": {}, - "source": [ - "<a id='toc5_2_2__'></a>\n", - "\n", - "#### Sensitivity Testing\n", - "The sensitivity testing framework provides a systematic approach to understanding how the Heston model responds to parameter changes, which 
is crucial for both model validation and practical application in trading and risk management." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "51922313", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_test_provider.Sensitivity\")\n", - "def SensitivityTest(\n", - " model,\n", - " strike_price,\n", - " maturity_date,\n", - " spot_price,\n", - " v0_opt,\n", - " theta_opt,\n", - " kappa_opt,\n", - " sigma_opt,\n", - " rho_opt,\n", - "):\n", - " \"\"\"\n", - " Evaluates the sensitivity of American option prices to changes in model parameters.\n", - "\n", - " This test calculates option prices using the Heston model with optimized parameters.\n", - " It's designed to analyze how changes in various model inputs affect the option price,\n", - " which is crucial for understanding model behavior and risk management.\n", - "\n", - " Args:\n", - " model (HestonModel): Initialized Heston model instance wrapped in ValidMind model object\n", - " strike_price (float): Strike price of the option\n", - " maturity_date (ql.Date): Expiration date of the option in QuantLib format\n", - " spot_price (float): Current price of the underlying asset\n", - " v0_opt (float): Optimized initial variance parameter\n", - " theta_opt (float): Optimized long-term variance parameter\n", - " kappa_opt (float): Optimized mean reversion rate parameter\n", - " sigma_opt (float): Optimized volatility of variance parameter\n", - " rho_opt (float): Optimized correlation parameter between asset price and variance\n", - " \"\"\"\n", - " price = model.model.predict_american_option_price(\n", - " strike_price,\n", - " maturity_date,\n", - " spot_price,\n", - " v0_opt,\n", - " theta_opt,\n", - " kappa_opt,\n", - " sigma_opt,\n", - " rho_opt,\n", - " )\n", - "\n", - " return price\n" - ] - }, - { - "cell_type": "markdown", - "id": "408a05ef", - "metadata": {}, - "source": [ - "##### Common plot function" - ] - }, - { - "cell_type": "code", - "execution_count": null, - 
"id": "104ca6dd", - "metadata": {}, - "outputs": [], - "source": [ - "def plot_results(df, params: dict = None):\n", - " fig2 = plt.figure(figsize=(10, 6))\n", - " plt.plot(df[params[\"x\"]], df[params[\"y\"]], label=params[\"label\"])\n", - " plt.xlabel(params[\"xlabel\"])\n", - " plt.ylabel(params[\"ylabel\"])\n", - " \n", - " plt.title(params[\"title\"])\n", - " plt.legend()\n", - " plt.grid(True)\n", - " plt.show() # close the plot to avoid displaying it" - ] - }, - { - "cell_type": "markdown", - "id": "ca72b9e5", - "metadata": {}, - "source": [ - "Let's create ValidMind model object" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ae7093fa", - "metadata": {}, - "outputs": [], - "source": [ - "hm_model = vm.init_model(model=hm, input_id=\"HestonModel\")" - ] - }, - { - "cell_type": "markdown", - "id": "b2141640", - "metadata": {}, - "source": [ - "##### Strike sensitivity\n", - "Let's analyzes how option prices change as the strike price varies. We create a range of strike prices around the current strike (460) and observe the impact on option prices while keeping all other parameters constant." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ea7f1cbe", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_test_provider.Sensitivity:ToStrike\",\n", - " inputs = {\n", - " \"model\": hm_model\n", - " },\n", - " param_grid={\n", - " \"strike_price\": list(np.linspace(460-50, 460+50, 10)),\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": [theta_opt],\n", - " \"kappa_opt\": [kappa_opt],\n", - " \"sigma_opt\": [sigma_opt],\n", - " \"rho_opt\":[rho_opt]\n", - " },\n", - ")\n", - "result.log()\n", - "# Visualize how option prices change with different strike prices\n", - "plot_results(\n", - " pd.DataFrame(result.tables[0].data),\n", - " params={\n", - " \"x\": \"strike_price\",\n", - " \"y\":\"Value\",\n", - " \"label\":\"Strike price\",\n", - " \"xlabel\":\"Strike price\",\n", - " \"ylabel\":\"option price\",\n", - " \"title\":\"Heston option - Strike price Sensitivity\",\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "be143012", - "metadata": {}, - "source": [ - "<a id='toc5_2_3__'></a>\n", - "\n", - "#### Stress Testing\n", - "This stress testing framework provides a comprehensive view of how the Heston model behaves under different market conditions and helps identify potential risks in option pricing." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f2f01a40", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.Stressing\")\n", - "def StressTest(\n", - " model,\n", - " strike_price,\n", - " maturity_date,\n", - " spot_price,\n", - " v0_opt,\n", - " theta_opt,\n", - " kappa_opt,\n", - " sigma_opt,\n", - " rho_opt,\n", - "):\n", - " \"\"\"\n", - " Performs stress testing on Heston model parameters to evaluate option price sensitivity.\n", - "\n", - " This test evaluates how the American option price responds to stressed market conditions\n", - " by varying key model parameters. It's designed to:\n", - " 1. Identify potential model vulnerabilities\n", - " 2. Understand price behavior under extreme scenarios\n", - " 3. Support risk management decisions\n", - " 4. Validate model stability across parameter ranges\n", - "\n", - " Args:\n", - " model (HestonModel): Initialized Heston model instance wrapped in ValidMind model object\n", - " strike_price (float): Option strike price\n", - " maturity_date (ql.Date): Option expiration date in QuantLib format\n", - " spot_price (float): Current price of the underlying asset\n", - " v0_opt (float): Initial variance parameter under stress testing\n", - " theta_opt (float): Long-term variance parameter under stress testing\n", - " kappa_opt (float): Mean reversion rate parameter under stress testing\n", - " sigma_opt (float): Volatility of variance parameter under stress testing\n", - " rho_opt (float): Correlation parameter under stress testing\n", - " \"\"\"\n", - " price = model.model.predict_american_option_price(\n", - " strike_price,\n", - " maturity_date,\n", - " spot_price,\n", - " v0_opt,\n", - " theta_opt,\n", - " kappa_opt,\n", - " sigma_opt,\n", - " rho_opt,\n", - " )\n", - "\n", - " return price\n" - ] - }, - { - "cell_type": "markdown", - "id": "31fcbe9c", - "metadata": {}, - "source": [ - "##### Rho (correlation) and Theta (long term vol) stress test\n", - "Next, 
let's evaluates the sensitivity of a model's output to changes in the correlation parameter (rho) and the long-term variance parameter (theta) within a stochastic volatility framework." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6119b5d9", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheRhoAndThetaParameters\",\n", - " inputs = {\n", - " \"model\": hm_model,\n", - " },\n", - " param_grid={\n", - " \"strike_price\": [460],\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": list(np.linspace(0.1, theta_opt+0.4, 5)),\n", - " \"kappa_opt\": [kappa_opt],\n", - " \"sigma_opt\": [sigma_opt],\n", - " \"rho_opt\":list(np.linspace(rho_opt-0.2, rho_opt+0.2, 5))\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "be39cb3a", - "metadata": {}, - "source": [ - "##### Sigma stress test\n", - "Let's evaluates the sensitivity of a model's output to changes in the volatility parameter, sigma. This test is crucial for understanding how variations in market volatility impact the model's valuation of financial instruments, particularly options." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0dc189b7", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheSigmaParameter\",\n", - " inputs = {\n", - " \"model\": hm_model,\n", - " },\n", - " param_grid={\n", - " \"strike_price\": [460],\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": [theta_opt],\n", - " \"kappa_opt\": [kappa_opt],\n", - " \"sigma_opt\": list(np.linspace(0.1, sigma_opt+0.6, 5)),\n", - " \"rho_opt\": [rho_opt]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "173a5294", - "metadata": {}, - "source": [ - "##### Stress kappa\n", - "Let's evaluates the sensitivity of a model's output to changes in the kappa parameter, which is a mean reversion rate in stochastic volatility models." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dae9714f", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheKappaParameter\",\n", - " inputs = {\n", - " \"model\": hm_model,\n", - " },\n", - " param_grid={\n", - " \"strike_price\": [460],\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": [theta_opt],\n", - " \"kappa_opt\": list(np.linspace(kappa_opt, kappa_opt+0.2, 5)),\n", - " \"sigma_opt\": [sigma_opt],\n", - " \"rho_opt\": [rho_opt]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "b4d1d968", - "metadata": {}, - "source": [ - "##### Stress theta\n", - "Let's evaluates the sensitivity of a model's output to changes in the parameter theta, which represents the long-term variance in a stochastic volatility model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e68df3db", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheThetaParameter\",\n", - " inputs = {\n", - " \"model\": hm_model,\n", - " },\n", - " param_grid={\n", - " \"strike_price\": [460],\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": list(np.linspace(0.1, theta_opt+0.9, 5)),\n", - " \"kappa_opt\": [kappa_opt],\n", - " \"sigma_opt\": [sigma_opt],\n", - " \"rho_opt\": [rho_opt]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "32e70456", - "metadata": {}, - "source": [ - "##### Stress rho\n", - "Let's evaluates the sensitivity of a model's output to changes in the correlation parameter, rho, within a stochastic volatility (SV) model framework. This test is crucial for understanding how variations in rho, which represents the correlation between the asset price and its volatility, impact the model's valuation output." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b5ca3fc2", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheRhoParameter\",\n", - " inputs = {\n", - " \"model\": hm_model,\n", - " },\n", - " param_grid={\n", - " \"strike_price\": [460],\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": [theta_opt],\n", - " \"kappa_opt\": [kappa_opt],\n", - " \"sigma_opt\": [sigma_opt],\n", - " \"rho_opt\": list(np.linspace(rho_opt-0.2, rho_opt+0.2, 5))\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "892c5347", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc6_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc6_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-de5d1e182b09403abddabc2850f2dd05", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-1QuffXMV-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.14" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart for Heston option pricing model using QuantLib\n", + "\n", + "Welcome! 
Let's get you started with the basic process of documenting models with ValidMind.\n", + "\n", + "The Heston option pricing model is a popular stochastic volatility model used to price options. Developed by Steven Heston in 1993, the model assumes that the asset's volatility follows a mean-reverting square-root process, allowing it to capture the empirical observation of volatility \"clustering\" in financial markets. This model is particularly useful for assets where volatility is not constant, making it a favored approach in quantitative finance for pricing complex derivatives.\n", + "\n", + "Here’s an overview of the Heston model as implemented in QuantLib, a powerful library for quantitative finance:\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Model Assumptions and Characteristics\n", + "1. **Stochastic Volatility**: The volatility is modeled as a stochastic process, following a mean-reverting square-root process (Cox-Ingersoll-Ross process).\n", + "2. **Correlated Asset and Volatility Processes**: The asset price and volatility are assumed to be correlated, allowing the model to capture the \"smile\" effect observed in implied volatilities.\n", + "3. 
**Risk-Neutral Dynamics**: The Heston model is typically calibrated under a risk-neutral measure, which allows for direct application to pricing.\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### Heston Model Parameters\n", + "The model is governed by a set of key parameters:\n", + "- **S0**: Initial stock price\n", + "- **v0**: Initial variance of the asset price\n", + "- **kappa**: Speed of mean reversion of the variance\n", + "- **theta**: Long-term mean level of variance\n", + "- **sigma**: Volatility of volatility (vol of vol)\n", + "- **rho**: Correlation between the asset price and variance processes\n", + "\n", + "The dynamics of the asset price $S$ and the variance $v$ under the Heston model are given by:\n", + "\n", + "$$\n", + "dS_t = r S_t \\, dt + \\sqrt{v_t} S_t \\, dW^S_t\n", + "$$\n", + "\n", + "$$\n", + "dv_t = \\kappa (\\theta - v_t) \\, dt + \\sigma \\sqrt{v_t} \\, dW^v_t\n", + "$$\n", + "\n", + "where $W^S_t$ and $W^v_t$ are Wiener processes with correlation $\\rho$, and $r$ is the risk-free rate.\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Advantages and Limitations\n", + "- **Advantages**:\n", + " - Ability to capture volatility smiles and skews.\n", + " - More realistic pricing for options on assets with stochastic volatility.\n", + "- **Limitations**:\n", + " - Calibration can be complex due to the number of parameters.\n", + " - Computationally intensive compared to simpler models like Black-Scholes.\n", + "\n", + "This setup provides a robust framework for pricing and analyzing options with stochastic volatility dynamics. QuantLib’s implementation makes it easy to experiment with different parameter configurations and observe their effects on pricing.\n", + "\n", + "You will learn how to initialize the ValidMind Library, develop an option pricing model, and then write custom tests for sensitivity and stress testing to quickly generate documentation about the model." 
+ ], + "id": "1e2a4689" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + " - [Model Assumptions and Characteristics](#toc1_1__) \n", + " - [Heston Model Parameters](#toc1_2__) \n", + " - [Advantages and Limitations](#toc1_3__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Install the ValidMind Library](#toc3_1__) \n", + " - [Initialize the ValidMind Library](#toc3_2__) \n", + " - [Register sample model](#toc3_2_1__) \n", + " - [Apply documentation template](#toc3_2_2__) \n", + " - [Get your code snippet](#toc3_2_3__) \n", + " - [Initialize the Python environment](#toc3_3__) \n", + " - [Preview the documentation template](#toc3_4__) \n", + "- [Data Preparation](#toc4__) \n", + " - [Helper functions](#toc4_1_1__) \n", + " - [Market Data Quality and Availability](#toc4_2__) \n", + " - [Initialize the ValidMind datasets](#toc4_3__) \n", + " - [Data Quality](#toc4_4__) \n", + " - [Isolation Forest Outliers Test](#toc4_4_1__) \n", + " - [Model parameters](#toc4_4_2__) \n", + "- [Model development - Heston Option price](#toc5__) \n", + " - [Model Calibration](#toc5_1__) \n", + " - [Model Evaluation](#toc5_2__) \n", + " - [Benchmark Testing](#toc5_2_1__) \n", + " - [Sensitivity Testing](#toc5_2_2__) \n", + " - [Stress Testing](#toc5_2_3__) \n", + "- [Next steps](#toc6__) \n", + " - [Work with your model documentation](#toc6_1__) \n", + " - [Discover more learning resources](#toc6_2__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "69ec219a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "b9fb5d17" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ], + "id": "f2dccf35" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ], + "id": "5a5ce085" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "409352bf" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To install the QuantLib library:" + ], + "id": "65e870b2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q QuantLib" + ], + "execution_count": null, + "outputs": [], + "id": "3a34debf" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "fb30ae07" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." 
+ ], + "id": "c6f87017" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Capital Markets`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "cbb2e2c9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", + "<br></br>\n", + "Your organization administrators may need to add it to your template library:\n", + "<ul>\n", + "<li><a href=\"capital_markets_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", + "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", + "</ul>\n", + "</div>" + ], + "id": "41c4edca" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per 
document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "2012eb82" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")\n" + ], + "execution_count": null, + "outputs": [], + "id": "0cd3f67e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ], + "id": "6d944cc9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%matplotlib inline\n", + "\n", + "import pandas as pd\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "from scipy.optimize import minimize\n", + "import yfinance as yf\n", + "import QuantLib as ql\n", + "from validmind.tests import run_test" + ], + "execution_count": null, + "outputs": [], + "id": "f8cf2746" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_4__'></a>\n", + "\n", + 
"Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ], + "id": "bc431ee0" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "7e844028" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Data Preparation" + ], + "id": "0c0ee8b9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Market Data Sources\n", + "\n", + "<a id='toc4_1_1__'></a>\n", + "\n", + "#### Helper functions\n", + "Let's define a helper function to retrieve option data from Yahoo Finance." + ], + "id": "5a4d2c36" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def get_market_data(ticker, expiration_date_str):\n", + " \"\"\"\n", + " Fetch option market data from Yahoo Finance for the given ticker and expiration date.\n", + " Returns a DataFrame with columns: strike, maturity_date, option_price.\n", + " \"\"\"\n", + " # Create a Ticker object for the specified stock\n", + " stock = yf.Ticker(ticker)\n", + "\n", + " # Get all available expiration dates for options\n", + " option_dates = stock.options\n", + "\n", + " # Check if the requested expiration date is available\n", + " if expiration_date_str not in option_dates:\n", + " raise ValueError(f\"Expiration date {expiration_date_str} not available for {ticker}. 
Available dates: {option_dates}\")\n", + "\n", + " # Get the option chain for the specified expiration date\n", + " option_chain = stock.option_chain(expiration_date_str)\n", + "\n", + " # Get call options (or you can use puts as well based on your requirement)\n", + " calls = option_chain.calls\n", + "\n", + " # Convert expiration_date_str to QuantLib Date\n", + " expiry_date_parts = list(map(int, expiration_date_str.split('-'))) # Split YYYY-MM-DD\n", + " maturity_date = ql.Date(expiry_date_parts[2], expiry_date_parts[1], expiry_date_parts[0]) # Convert to QuantLib Date\n", + "\n", + " # Create a list to store strike prices, maturity dates, and option prices\n", + " market_data = []\n", + " for index, row in calls.iterrows():\n", + " strike = row['strike']\n", + " option_price = row['lastPrice'] # You can also use 'bid' or 'ask'; a mid price would have to be computed from them\n", + " market_data.append((strike, maturity_date, option_price))\n", + " df = pd.DataFrame(market_data, columns = ['strike', 'maturity_date', 'option_price'])\n", + " return df" + ], + "execution_count": null, + "outputs": [], + "id": "b96a500f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's define a helper function to retrieve stock data from Yahoo Finance. This helper function returns the spot price, dividend yield, volatility, and an assumed risk-free rate based on the underlying stock data." 
+ ], + "id": "c7769b73" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def get_option_parameters(ticker):\n", + " # Fetch historical data for the stock\n", + " stock_data = yf.Ticker(ticker)\n", + " \n", + " # Get the current spot price\n", + " spot_price = stock_data.history(period=\"1d\")['Close'].iloc[-1]\n", + " \n", + " # Get dividend yield\n", + " dividend_rate = stock_data.dividends.mean() / spot_price if not stock_data.dividends.empty else 0.0\n", + " \n", + " # Estimate volatility (standard deviation of log returns)\n", + " hist_data = stock_data.history(period=\"1y\")['Close']\n", + " log_returns = np.log(hist_data / hist_data.shift(1)).dropna()\n", + " volatility = np.std(log_returns) * np.sqrt(252) # Annualized volatility\n", + " \n", + " # Assume a risk-free rate from some known data (can be fetched from market data, here we use 0.001)\n", + " risk_free_rate = 0.001\n", + " \n", + " # Return the calculated parameters\n", + " return {\n", + " \"spot_price\": spot_price,\n", + " \"volatility\": volatility,\n", + " \"dividend_rate\": dividend_rate,\n", + " \"risk_free_rate\": risk_free_rate\n", + " }" + ], + "execution_count": null, + "outputs": [], + "id": "dc44c448" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Market Data Quality and Availability\n", + "Next, let's specify ticker and expiration date to get market data." 
+ ], + "id": "c7b739d3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "ticker = \"MSFT\"\n", + "expiration_date = \"2024-12-13\" # Example expiration date in 'YYYY-MM-DD' form\n", + "\n", + "market_data = get_market_data(ticker=ticker, expiration_date_str=expiration_date)" + ], + "execution_count": null, + "outputs": [], + "id": "50225fde" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module." + ], + "id": "c539b95e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_market_data = vm.init_dataset(\n", + " dataset=market_data,\n", + " input_id=\"market_data\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "113f9c17" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### Data Quality\n", + "Let's check the quality of the data using outlier and missing value tests." + ], + "id": "185beb24" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4_1__'></a>\n", + "\n", + "#### Isolation Forest Outliers Test\n", + "Let's detect anomalies in the dataset using the Isolation Forest algorithm, visualized through scatter plots." 
+ ], + "id": "7f14464c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IsolationForestOutliers\",\n", + " inputs={\n", + " \"dataset\": vm_market_data,\n", + " },\n", + " title=\"Outliers detection using Isolation Forest\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "56c919ec" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Missing Values Test\n", + "Let's evaluate dataset quality by checking that the missing value ratio across all features does not exceed a set threshold." + ], + "id": "e4d0e5ca" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.MissingValues\",\n", + " inputs={\n", + " \"dataset\": vm_market_data,\n", + " },\n", + " title=\"Missing Values detection\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "e95c825f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4_2__'></a>\n", + "\n", + "#### Model parameters\n", + "Let's calculate the model parameters from the stock data." + ], + "id": "829403a3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "option_params = get_option_parameters(ticker=ticker)" + ], + "execution_count": null, + "outputs": [], + "id": "25936449" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Model development - Heston Option price" + ], + "id": "0a0948b6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "class HestonModel:\n", + "\n", + " def __init__(self, ticker, expiration_date_str, calculation_date, spot_price, dividend_rate, risk_free_rate):\n", + " self.ticker = ticker\n", + " self.expiration_date_str = expiration_date_str\n", + " self.calculation_date = calculation_date\n", + " self.spot_price = spot_price\n", + " self.dividend_rate = dividend_rate\n", + " self.risk_free_rate = 
risk_free_rate\n", + " \n", + " def predict_option_price(self, strike, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", + " # Set the evaluation date\n", + " ql.Settings.instance().evaluationDate = self.calculation_date\n", + "\n", + " # Construct the European Option\n", + " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", + " exercise = ql.EuropeanExercise(maturity_date)\n", + " european_option = ql.VanillaOption(payoff, exercise)\n", + "\n", + " # Yield term structures for risk-free rate and dividend\n", + " riskFreeTS = ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.risk_free_rate, ql.Actual365Fixed()))\n", + " dividendTS = ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.dividend_rate, ql.Actual365Fixed()))\n", + "\n", + " # Initial stock price\n", + " initialValue = ql.QuoteHandle(ql.SimpleQuote(spot_price))\n", + "\n", + " # Heston process parameters\n", + " heston_process = ql.HestonProcess(riskFreeTS, dividendTS, initialValue, v0, kappa, theta, sigma, rho)\n", + " hestonModel = ql.HestonModel(heston_process)\n", + "\n", + " # Use the Heston analytic engine\n", + " engine = ql.AnalyticHestonEngine(hestonModel)\n", + " european_option.setPricingEngine(engine)\n", + "\n", + " # Calculate the Heston model price\n", + " h_price = european_option.NPV()\n", + "\n", + " return h_price\n", + "\n", + " def predict_american_option_price(self, strike, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", + " # Set the evaluation date\n", + " ql.Settings.instance().evaluationDate = self.calculation_date\n", + "\n", + " # Construct the American Option\n", + " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", + " exercise = ql.AmericanExercise(self.calculation_date, maturity_date)\n", + " american_option = ql.VanillaOption(payoff, exercise)\n", + "\n", + " # Yield term structures for risk-free rate and dividend\n", + " riskFreeTS = 
ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.risk_free_rate, ql.Actual365Fixed()))\n", + " dividendTS = ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.dividend_rate, ql.Actual365Fixed()))\n", + "\n", + " # Initial stock price\n", + " initialValue = ql.QuoteHandle(ql.SimpleQuote(spot_price))\n", + "\n", + " # Heston process parameters\n", + " heston_process = ql.HestonProcess(riskFreeTS, dividendTS, initialValue, v0, kappa, theta, sigma, rho)\n", + " heston_model = ql.HestonModel(heston_process)\n", + "\n", + "\n", + " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", + " exercise = ql.AmericanExercise(self.calculation_date, maturity_date)\n", + " american_option = ql.VanillaOption(payoff, exercise)\n", + " heston_fd_engine = ql.FdHestonVanillaEngine(heston_model)\n", + " american_option.setPricingEngine(heston_fd_engine)\n", + " option_price = american_option.NPV()\n", + "\n", + " return option_price\n", + "\n", + " def objective_function(self, params, market_data, spot_price, dividend_rate, risk_free_rate):\n", + " v0, theta, kappa, sigma, rho = params\n", + "\n", + " # Sum of squared differences between market prices and model prices\n", + " error = 0.0\n", + " for i, row in market_data.iterrows():\n", + " model_price = self.predict_option_price(row['strike'], row['maturity_date'], spot_price, \n", + " v0, theta, kappa, sigma, rho)\n", + " error += (model_price - row['option_price']) ** 2\n", + " \n", + " return error\n", + "\n", + " def calibrate_model(self, ticker, expiration_date_str):\n", + " # Get the option market data dynamically from Yahoo Finance\n", + " market_data = get_market_data(ticker, expiration_date_str)\n", + "\n", + " # Initial guesses for Heston parameters\n", + " initial_params = [0.04, 0.04, 0.1, 0.1, -0.75]\n", + "\n", + " # Bounds for the parameters to ensure realistic values\n", + " bounds = [(0.0001, 1.0), # v0\n", + " (0.0001, 1.0), # theta\n", + " (0.001, 2.0), # kappa\n", + " 
(0.001, 1.0), # sigma\n", + " (-0.75, 0.0)] # rho\n", + "\n", + " # Optimize the parameters to minimize the error between model and market prices\n", + " result = minimize(self.objective_function, initial_params, args=(market_data, self.spot_price, self.dividend_rate, self.risk_free_rate),\n", + " bounds=bounds, method='L-BFGS-B')\n", + "\n", + " # Optimized Heston parameters\n", + " v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt = result.x\n", + "\n", + " return v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt\n" + ], + "execution_count": null, + "outputs": [], + "id": "e15b8221" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Model Calibration\n", + "* The calibration process aims to optimize the Heston model parameters (v0, theta, kappa, sigma, rho) by minimizing the difference between model-predicted option prices and observed market prices.\n", + "* In this implementation, the model is calibrated to current market data, specifically using option prices from the selected ticker and expiration date.\n", + "\n", + "Let's specify `calculation_date` and `strike_price` as input parameters for the model to verify its functionality and confirm it operates as expected." 
+ ], + "id": "a941aa32" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "calculation_date = ql.Date(26, 11, 2024)\n", + "# Convert expiration date string to QuantLib.Date\n", + "expiry_date_parts = list(map(int, expiration_date.split('-')))\n", + "maturity_date = ql.Date(expiry_date_parts[2], expiry_date_parts[1], expiry_date_parts[0])\n", + "strike_price = 460.0\n", + "\n", + "hm = HestonModel(\n", + " ticker=ticker,\n", + " expiration_date_str=expiration_date,\n", + " calculation_date=calculation_date,\n", + " spot_price=option_params['spot_price'],\n", + " dividend_rate=option_params['dividend_rate'],\n", + " risk_free_rate=option_params['risk_free_rate']\n", + ")\n", + "\n", + "# Calibrate the model to the market data\n", + "v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt = hm.calibrate_model(ticker, expiration_date)\n", + "print(f\"Optimized Heston parameters: v0={v0_opt}, theta={theta_opt}, kappa={kappa_opt}, sigma={sigma_opt}, rho={rho_opt}\")\n", + "\n", + "\n", + "# Price the European option with the calibrated parameters\n", + "h_price = hm.predict_option_price(strike_price, maturity_date, option_params['spot_price'], v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt)\n", + "print(\"The Heston model price for the option is:\", h_price)" + ], + "execution_count": null, + "outputs": [], + "id": "1d61dfca" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Model Evaluation" + ], + "id": "75313272" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2_1__'></a>\n", + "\n", + "#### Benchmark Testing\n", + "The benchmark testing framework provides a robust way to validate the Heston model implementation and understand the relationships between European and American option prices under stochastic volatility conditions.\n", + "Let's compare European and American option prices using the Heston model." 
+ ], + "id": "2e6471ef" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.BenchmarkTest\")\n", + "def benchmark_test(hm_model, strikes, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", + " \"\"\"\n", + " Compares European and American option prices using the Heston model.\n", + "\n", + " This test evaluates the price differences between European and American options\n", + " across multiple strike prices while keeping other parameters constant. The comparison\n", + " helps understand the early exercise premium of American options over their European\n", + " counterparts under stochastic volatility conditions.\n", + "\n", + " Args:\n", + " hm_model: HestonModel instance for option pricing calculations\n", + " strikes (list[float]): List of strike prices to test\n", + " maturity_date (ql.Date): Option expiration date in QuantLib format\n", + " spot_price (float): Current price of the underlying asset\n", + " v0 (float, optional): Initial variance. Defaults to None.\n", + " theta (float, optional): Long-term variance. Defaults to None.\n", + " kappa (float, optional): Mean reversion rate. Defaults to None.\n", + " sigma (float, optional): Volatility of variance. Defaults to None.\n", + " rho (float, optional): Correlation between asset and variance. 
Defaults to None.\n", + "\n", + " Returns:\n", + " dict: Contains a DataFrame with the following columns:\n", + " - Strike: Strike prices tested\n", + " - Maturity date: Expiration date for all options\n", + " - Spot price: Current underlying price\n", + " - european model price: Prices for European options\n", + " - american model price: Prices for American options\n", + "\"\"\"\n", + " american_derived_prices = []\n", + " european_derived_prices = []\n", + " for K in strikes:\n", + " european_derived_prices.append(hm_model.predict_option_price(K, maturity_date, spot_price, v0, theta, kappa, sigma, rho))\n", + " american_derived_prices.append(hm_model.predict_american_option_price(K, maturity_date, spot_price, v0, theta, kappa, sigma, rho))\n", + "\n", + " data = {\n", + " \"Strike\": strikes,\n", + " \"Maturity date\": [maturity_date] * len(strikes),\n", + " \"Spot price\": [spot_price] * len(strikes),\n", + " \"european model price\": european_derived_prices,\n", + " \"american model price\": american_derived_prices,\n", + "\n", + " }\n", + " df1 = pd.DataFrame(data)\n", + " return {\"strikes variation benchmarking\": df1}" + ], + "execution_count": 15, + "outputs": [], + "id": "810cf887" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.BenchmarkTest\",\n", + " params={\n", + " \"hm_model\": hm,\n", + " \"strikes\": [400, 425, 460, 495, 520],\n", + " \"maturity_date\": maturity_date,\n", + " \"spot_price\": option_params['spot_price'],\n", + " \"v0\":v0_opt,\n", + " \"theta\": theta_opt,\n", + " \"kappa\":kappa_opt ,\n", + " \"sigma\": sigma_opt,\n", + " \"rho\":rho_opt\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "3fdd6705" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2_2__'></a>\n", + "\n", + "#### Sensitivity Testing\n", + "The sensitivity testing framework provides a systematic approach to understanding how the Heston 
model responds to parameter changes, which is crucial for both model validation and practical application in trading and risk management." + ], + "id": "e359b503" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_test_provider.Sensitivity\")\n", + "def SensitivityTest(\n", + " model,\n", + " strike_price,\n", + " maturity_date,\n", + " spot_price,\n", + " v0_opt,\n", + " theta_opt,\n", + " kappa_opt,\n", + " sigma_opt,\n", + " rho_opt,\n", + "):\n", + " \"\"\"\n", + " Evaluates the sensitivity of American option prices to changes in model parameters.\n", + "\n", + " This test calculates option prices using the Heston model with optimized parameters.\n", + " It's designed to analyze how changes in various model inputs affect the option price,\n", + " which is crucial for understanding model behavior and risk management.\n", + "\n", + " Args:\n", + " model (HestonModel): Initialized Heston model instance wrapped in ValidMind model object\n", + " strike_price (float): Strike price of the option\n", + " maturity_date (ql.Date): Expiration date of the option in QuantLib format\n", + " spot_price (float): Current price of the underlying asset\n", + " v0_opt (float): Optimized initial variance parameter\n", + " theta_opt (float): Optimized long-term variance parameter\n", + " kappa_opt (float): Optimized mean reversion rate parameter\n", + " sigma_opt (float): Optimized volatility of variance parameter\n", + " rho_opt (float): Optimized correlation parameter between asset price and variance\n", + " \"\"\"\n", + " price = model.model.predict_american_option_price(\n", + " strike_price,\n", + " maturity_date,\n", + " spot_price,\n", + " v0_opt,\n", + " theta_opt,\n", + " kappa_opt,\n", + " sigma_opt,\n", + " rho_opt,\n", + " )\n", + "\n", + " return price\n" + ], + "execution_count": null, + "outputs": [], + "id": "51922313" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Common plot function" + ], + "id": 
"408a05ef" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def plot_results(df, params: dict = None):\n", + " fig2 = plt.figure(figsize=(10, 6))\n", + " plt.plot(df[params[\"x\"]], df[params[\"y\"]], label=params[\"label\"])\n", + " plt.xlabel(params[\"xlabel\"])\n", + " plt.ylabel(params[\"ylabel\"])\n", + " \n", + " plt.title(params[\"title\"])\n", + " plt.legend()\n", + " plt.grid(True)\n", + " plt.show() # close the plot to avoid displaying it" + ], + "execution_count": null, + "outputs": [], + "id": "104ca6dd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's create ValidMind model object" + ], + "id": "ca72b9e5" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "hm_model = vm.init_model(model=hm, input_id=\"HestonModel\")" + ], + "execution_count": null, + "outputs": [], + "id": "ae7093fa" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Strike sensitivity\n", + "Let's analyzes how option prices change as the strike price varies. We create a range of strike prices around the current strike (460) and observe the impact on option prices while keeping all other parameters constant." 
+ ], + "id": "b2141640" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_test_provider.Sensitivity:ToStrike\",\n", + " inputs = {\n", + " \"model\": hm_model\n", + " },\n", + " param_grid={\n", + " \"strike_price\": list(np.linspace(460-50, 460+50, 10)),\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": [theta_opt],\n", + " \"kappa_opt\": [kappa_opt],\n", + " \"sigma_opt\": [sigma_opt],\n", + " \"rho_opt\":[rho_opt]\n", + " },\n", + ")\n", + "result.log()\n", + "# Visualize how option prices change with different strike prices\n", + "plot_results(\n", + " pd.DataFrame(result.tables[0].data),\n", + " params={\n", + " \"x\": \"strike_price\",\n", + " \"y\":\"Value\",\n", + " \"label\":\"Strike price\",\n", + " \"xlabel\":\"Strike price\",\n", + " \"ylabel\":\"option price\",\n", + " \"title\":\"Heston option - Strike price Sensitivity\",\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "ea7f1cbe" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2_3__'></a>\n", + "\n", + "#### Stress Testing\n", + "This stress testing framework provides a comprehensive view of how the Heston model behaves under different market conditions and helps identify potential risks in option pricing." + ], + "id": "be143012" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.Stressing\")\n", + "def StressTest(\n", + " model,\n", + " strike_price,\n", + " maturity_date,\n", + " spot_price,\n", + " v0_opt,\n", + " theta_opt,\n", + " kappa_opt,\n", + " sigma_opt,\n", + " rho_opt,\n", + "):\n", + " \"\"\"\n", + " Performs stress testing on Heston model parameters to evaluate option price sensitivity.\n", + "\n", + " This test evaluates how the American option price responds to stressed market conditions\n", + " by varying key model parameters. 
It's designed to:\n", + " 1. Identify potential model vulnerabilities\n", + " 2. Understand price behavior under extreme scenarios\n", + " 3. Support risk management decisions\n", + " 4. Validate model stability across parameter ranges\n", + "\n", + " Args:\n", + " model (HestonModel): Initialized Heston model instance wrapped in ValidMind model object\n", + " strike_price (float): Option strike price\n", + " maturity_date (ql.Date): Option expiration date in QuantLib format\n", + " spot_price (float): Current price of the underlying asset\n", + " v0_opt (float): Initial variance parameter under stress testing\n", + " theta_opt (float): Long-term variance parameter under stress testing\n", + " kappa_opt (float): Mean reversion rate parameter under stress testing\n", + " sigma_opt (float): Volatility of variance parameter under stress testing\n", + " rho_opt (float): Correlation parameter under stress testing\n", + " \"\"\"\n", + " price = model.model.predict_american_option_price(\n", + " strike_price,\n", + " maturity_date,\n", + " spot_price,\n", + " v0_opt,\n", + " theta_opt,\n", + " kappa_opt,\n", + " sigma_opt,\n", + " rho_opt,\n", + " )\n", + "\n", + " return price\n" + ], + "execution_count": null, + "outputs": [], + "id": "f2f01a40" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Rho (correlation) and Theta (long-term variance) stress test\n", + "Next, let's evaluate the sensitivity of the model's output to joint changes in the correlation parameter (rho) and the long-term variance parameter (theta) within a stochastic volatility framework." 
+ ], + "id": "31fcbe9c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheRhoAndThetaParameters\",\n", + " inputs = {\n", + " \"model\": hm_model,\n", + " },\n", + " param_grid={\n", + " \"strike_price\": [460],\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": list(np.linspace(0.1, theta_opt+0.4, 5)),\n", + " \"kappa_opt\": [kappa_opt],\n", + " \"sigma_opt\": [sigma_opt],\n", + " \"rho_opt\":list(np.linspace(rho_opt-0.2, rho_opt+0.2, 5))\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "6119b5d9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Sigma stress test\n", + "Let's evaluate the sensitivity of the model's output to changes in sigma, the volatility of variance parameter. This test shows how variations in the volatility of variance affect the model's valuation of financial instruments, particularly options." 
+ ], + "id": "be39cb3a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheSigmaParameter\",\n", + " inputs = {\n", + " \"model\": hm_model,\n", + " },\n", + " param_grid={\n", + " \"strike_price\": [460],\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": [theta_opt],\n", + " \"kappa_opt\": [kappa_opt],\n", + " \"sigma_opt\": list(np.linspace(0.1, sigma_opt+0.6, 5)),\n", + " \"rho_opt\": [rho_opt]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "0dc189b7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress kappa\n", + "Let's evaluate the sensitivity of the model's output to changes in the kappa parameter, which is the mean reversion rate of the variance in stochastic volatility models." 
+ ], + "id": "173a5294" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheKappaParameter\",\n", + " inputs = {\n", + " \"model\": hm_model,\n", + " },\n", + " param_grid={\n", + " \"strike_price\": [460],\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": [theta_opt],\n", + " \"kappa_opt\": list(np.linspace(kappa_opt, kappa_opt+0.2, 5)),\n", + " \"sigma_opt\": [sigma_opt],\n", + " \"rho_opt\": [rho_opt]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "dae9714f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress theta\n", + "Let's evaluate the sensitivity of the model's output to changes in the parameter theta, which represents the long-term variance in a stochastic volatility model." + ], + "id": "b4d1d968" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheThetaParameter\",\n", + " inputs = {\n", + " \"model\": hm_model,\n", + " },\n", + " param_grid={\n", + " \"strike_price\": [460],\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": list(np.linspace(0.1, theta_opt+0.9, 5)),\n", + " \"kappa_opt\": [kappa_opt],\n", + " \"sigma_opt\": [sigma_opt],\n", + " \"rho_opt\": [rho_opt]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "e68df3db" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress rho\n", + "Let's evaluate the sensitivity of the model's output to changes in the correlation parameter, rho, within a stochastic volatility (SV) model framework. 
This test is crucial for understanding how variations in rho, which represents the correlation between the asset price and its volatility, impact the model's valuation output." + ], + "id": "32e70456" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheRhoParameter\",\n", + " inputs = {\n", + " \"model\": hm_model,\n", + " },\n", + " param_grid={\n", + " \"strike_price\": [460],\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": [theta_opt],\n", + " \"kappa_opt\": [kappa_opt],\n", + " \"sigma_opt\": [sigma_opt],\n", + " \"rho_opt\": list(np.linspace(rho_opt-0.2, rho_opt+0.2, 5))\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "b5ca3fc2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc6_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc6_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ], + "id": "892c5347" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-de5d1e182b09403abddabc2850f2dd05" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-1QuffXMV-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.14" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb index 3da618e57..e5f487771 100644 --- a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb +++ 
b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb @@ -1,882 +1,888 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Quickstart for model code documentation\n", - "\n", - "Welcome! This notebook demonstrates how to use the ValidMind code explainer to automatically generate comprehensive documentation for your codebase. The code explainer analyzes your source code and provides detailed explanations across various aspects of your implementation.\n", - "\n", - "<a id='toc1__'></a>\n", - "\n", - "## About Code Explainer\n", - "The ValidMind code explainer is a powerful tool that automatically analyzes your source code and generates comprehensive documentation. It helps you:\n", - "\n", - "- Understand the structure and organization of your codebase\n", - "- Document dependencies and environment setup\n", - "- Explain data processing and model implementation details\n", - "- Document training, evaluation, and inference pipelines\n", - "- Track configuration, testing, and security measures\n", - "\n", - "This tool is particularly useful for:\n", - "- Onboarding new team members\n", - "- Maintaining up-to-date documentation\n", - "- Ensuring code quality and best practices\n", - "- Facilitating code reviews and audits" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About Code Explainer](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Install the ValidMind Library](#toc3_1__) \n", - " - [Initialize the ValidMind Library](#toc3_2__) \n", - " - [Register sample model](#toc3_2_1__) \n", - " - [Apply documentation template](#toc3_2_2__) \n", - " - [Get your code snippet](#toc3_2_3__) \n", - " - [Preview the documentation template](#toc3_3__) \n", - "- [Common 
function](#toc4__) \n", - "- [Default Behavior](#toc5__) \n", - "- [Codebase Overview](#toc6__) \n", - "- [Environment and Dependencies ('environment_setup')](#toc7__) \n", - "- [Data Ingestion and Preprocessing](#toc8__) \n", - "- [Model Implementation Details](#toc9__) \n", - "- [Model Training Pipeline](#toc10__) \n", - "- [Evaluation and Validation Code](#toc11__) \n", - "- [Inference and Scoring Logic](#toc12__) \n", - "- [Configuration and Parameters](#toc13__) \n", - "- [Unit and Integration Testing](#toc14__) \n", - "- [Logging and Monitoring Hooks](#toc15__) \n", - "- [Code and Model Versioning](#toc16__) \n", - "- [Security and Access Control](#toc17__) \n", - "- [Example Runs and Scripts](#toc18__) \n", - "- [Known Issues and Future Improvements](#toc19__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. 
\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. 
Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Model Source Code Documentation`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", - "<br></br>\n", - "Your organization administrators may need to add it to your template library:\n", - "<ul>\n", - "<li><a href=\"model_source_code_documentation_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", - "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", - "</ul>\n", - "</div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Common function\n", - "The code above defines two key functions:\n", - "1. A function to read source code from 'customer_churn_full_suite.py' file\n", - "2. An 'explain_code' function that uses ValidMind's experimental agents to analyze and explain code." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "source_code=\"\"\n", - "with open(\"customer_churn_full_suite.py\", \"r\") as f:\n", - " source_code = f.read()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The `vm.experimental.agents.run_task` function is used to execute AI agent tasks.\n", - "\n", - "It requires:\n", - "- task: The type of task to run (e.g. `code_explainer`)\n", - "- input: A dictionary containing task-specific parameters\n", - " - For `code_explainer`, this includes:\n", - " - **source_code** (str): The code to be analyzed\n", - " - **user_instructions** (str): Instructions for how to analyze the code" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def explain_code(content_id: str, user_instructions: str):\n", - " \"\"\"Run code explanation task and log the results.\n", - " By default, the code explainer includes sections for:\n", - " - Main Purpose and Overall Functionality\n", - " - Breakdown of Key Functions or Components\n", - " - Potential Risks or Failure Points \n", - " - Assumptions or Limitations\n", - " If you want default sections, specify user_instructions as an empty string.\n", - " \n", - " Args:\n", - " user_instructions (str): Instructions for how to analyze the code\n", - " content_id (str): ID to use when logging the results\n", - " \n", - " Returns:\n", - " The result object from running the code explanation task\n", - " \"\"\"\n", - " result = vm.experimental.agents.run_task(\n", - " task=\"code_explainer\",\n", - " input={\n", - " \"source_code\": source_code,\n", - " \"user_instructions\": user_instructions\n", - " }\n", - " )\n", - " result.log(content_id=content_id)\n", - " return result" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='defaultBehavior'></a>\n", - "\n", - "<a id='toc5__'></a>\n", - "\n", - "## Default Behavior" 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "By default, the code explainer includes sections for:\n", - "- Main Purpose and Overall Functionality\n", - "- Breakdown of Key Functions or Components\n", - "- Potential Risks or Failure Points \n", - "- Assumptions or Limitations\n", - "\n", - "If you want default sections, specify `user_instructions` as an empty string. For example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.experimental.agents.run_task(\n", - " task=\"code_explainer\",\n", - " input={\n", - " \"source_code\": source_code,\n", - " \"user_instructions\": \"\"\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='overview'></a>\n", - "\n", - "<a id='toc6__'></a>\n", - "\n", - "## Codebase Overview\n", - "\n", - "Let's analyze your codebase structure to understand the main modules, components, entry points and their relationships. We'll also examine the technology stack and frameworks that are being utilized in the implementation." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe the overall structure of the source code repository.\n", - " - Identify main modules, folders, and scripts.\n", - " - Highlight entry points for training, inference, and evaluation.\n", - " - State the main programming languages and frameworks used.\n", - " \"\"\",\n", - " content_id=\"code_structure_summary\"\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\",\n", - " content_id=\"code_structure_summary\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='environment'></a>\n", - "\n", - "<a id='toc7__'></a>\n", - "\n", - "## Environment and Dependencies ('environment_setup')\n", - "Let's document the technical requirements and setup needed to run your code, including Python packages, system dependencies, and environment configuration files. Understanding these requirements is essential for proper development environment setup and consistent deployments across different environments." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - List Python packages and system dependencies (OS, compilers, etc.).\n", - " - Reference environment files (requirements.txt, environment.yml, Dockerfile).\n", - " - Include setup instructions using Conda, virtualenv, or containers.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"setup_instructions\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='data'></a>\n", - "\n", - "<a id='toc8__'></a>\n", - "\n", - "## Data Ingestion and Preprocessing\n", - "Let's document how your code handles data, including data sources, validation procedures, and preprocessing steps. We'll examine the data pipeline architecture, covering everything from initial data loading through feature engineering and quality checks." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Specify data input formats and sources.\n", - " - Document ingestion, validation, and transformation logic.\n", - " - Explain how raw data is preprocessed and features are generated.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections. \"\"\",\n", - " content_id=\"data_handling_notes\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='model'> </a>\n", - "\n", - "<a id='toc9__'></a>\n", - "\n", - "## Model Implementation Details\n", - "Let's document the core implementation details of your model, including its architecture, components, and key algorithms. Understanding the technical implementation is crucial for maintenance, debugging, and future improvements to the codebase. We'll examine how theoretical concepts are translated into working code." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe the core model code structure (classes, functions).\n", - " - Link code to theoretical models or equations when applicable.\n", - " - Note custom components like loss functions or feature selectors.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"model_code_description\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='training'></a>\n", - "\n", - "<a id='toc10__'></a>\n", - "\n", - "## Model Training Pipeline\n", - "\n", - "Let's document the training pipeline implementation, including how models are trained, optimized and evaluated. We'll examine the training process workflow, hyperparameter tuning approach, and model checkpointing mechanisms. This section provides insights into how the model learns from data and achieves optimal performance." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Explain the training process, optimization strategy, and hyperparameters.\n", - " - Describe logging, checkpointing, and early stopping mechanisms.\n", - " - Include references to training config files or tuning logic.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"training_logic_details\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='evaluation'></a>\n", - "\n", - "<a id='toc11__'></a>\n", - "\n", - "## Evaluation and Validation Code\n", - "Let's examine how the model's validation and evaluation code is implemented, including the metrics calculation and validation processes. We'll explore the diagnostic tools and visualization methods used to assess model performance. This section will also cover how validation results are logged and stored for future reference." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe how validation is implemented and metrics are calculated.\n", - " - Include plots and diagnostic tools (e.g., ROC, SHAP, confusion matrix).\n", - " - State how outputs are logged and persisted.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"evaluation_logic_notes\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='inference'></a>\n", - "\n", - "<a id='toc12__'></a>\n", - "\n", - "## Inference and Scoring Logic\n", - "Let's examine how the model performs inference and scoring on new data. This section will cover the implementation details of loading trained models, making predictions, and any required pre/post-processing steps. We'll also look at the APIs and interfaces available for both real-time serving and batch scoring scenarios." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Detail how the trained model is loaded and used for predictions.\n", - " - Explain I/O formats and APIs for serving or batch scoring.\n", - " - Include any preprocessing/postprocessing logic required.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"inference_mechanism\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='config'></a>\n", - "\n", - "<a id='toc13__'></a>\n", - "\n", - "## Configuration and Parameters\n", - "Let's explore how configuration and parameters are managed in the codebase. We'll examine the configuration files, command-line arguments, environment variables, and other mechanisms used to control model behavior. This section will also cover parameter versioning and how different configurations are tracked across model iterations." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe configuration management (files, CLI args, env vars).\n", - " - Highlight default parameters and override mechanisms.\n", - " - Reference versioning practices for config files.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"config_control_notes\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='testing'></a>\n", - "\n", - "<a id='toc14__'></a>\n", - "\n", - "## Unit and Integration Testing\n", - "Let's examine the testing strategy and implementation in the codebase. We'll analyze the unit tests, integration tests, and testing frameworks used to ensure code quality and reliability. This section will also cover test coverage metrics and continuous integration practices." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - List unit and integration tests and what they cover.\n", - " - Mention testing frameworks and coverage tools used.\n", - " - Explain testing strategy for production-readiness.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"test_strategy_overview\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='logging'></a>\n", - "\n", - "<a id='toc15__'></a>\n", - "\n", - "## Logging and Monitoring Hooks\n", - "Let's analyze how logging and monitoring are implemented in the codebase. We'll examine the logging configuration, monitoring hooks, and key metrics being tracked. This section will also cover any real-time observability integrations and alerting mechanisms in place." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe logging configuration and structure.\n", - " - Highlight real-time monitoring or observability integrations.\n", - " - List key events, metrics, or alerts tracked.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"logging_monitoring_notes\"\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='versioning'></a>\n", - "\n", - "<a id='toc16__'></a>\n", - "\n", - "## Code and Model Versioning\n", - "Let's examine how code and model versioning is managed in the codebase. This section will cover version control practices, including Git workflows and model artifact versioning tools like DVC or MLflow. We'll also look at how versioning integrates with the CI/CD pipeline." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe Git usage, branching, tagging, and commit standards.\n", - " - Include model artifact versioning practices (e.g., DVC, MLflow).\n", - " - Reference any automation in CI/CD.\n", - " Please remove the following sections: \n", - " - Potential Risks or Failure Points\n", - " - Assumptions or Limitations\n", - " - Breakdown of Key Functions or Components\n", - " Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"version_tracking_description\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='security'></a>\n", - "\n", - "<a id='toc17__'></a>\n", - "\n", - "## Security and Access Control\n", - "Let's analyze the security and access control measures implemented in the codebase. We'll examine how sensitive data and code are protected through access controls, encryption, and compliance measures. Additionally, we'll review secure deployment practices and any specific handling of PII data." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Document access controls for source code and data.\n", - " - Include any encryption, PII handling, or compliance measures.\n", - " - Mention secure deployment practices.\n", - " Please remove the following sections: \n", - " - Potential Risks or Failure Points\n", - " - Assumptions or Limitations\n", - " - Breakdown of Key Functions or Components\n", - " Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"security_policies_notes\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='examples'></a>\n", - "\n", - "<a id='toc18__'></a>\n", - "\n", - "## Example Runs and Scripts\n", - "Let's explore example runs and scripts that demonstrate how to use this codebase in practice. We'll look at working examples, command-line usage, and sample notebooks that showcase the core functionality. This section will also point to demo datasets and test scenarios that can help new users get started quickly." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Provide working script examples.\n", - " - Include CLI usage instructions or sample notebooks.\n", - " - Link to demo datasets or test scenarios.\n", - " Please remove the following sections: \n", - " - Potential Risks or Failure Points\n", - " - Assumptions or Limitations\n", - " - Breakdown of Key Functions or Components\n", - " Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"runnable_examples\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='issues'></a>\n", - "\n", - "<a id='toc19__'></a>\n", - "\n", - "## Known Issues and Future Improvements\n", - "Let's examine the current limitations and areas for improvement in the codebase. This section will document known technical debt, bugs, and feature gaps that need to be addressed. We'll also outline proposed enhancements and reference any existing tickets or GitHub issues tracking these improvements." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - List current limitations or technical debt.\n", - " - Outline proposed enhancements or refactors.\n", - " - Reference relevant tickets, GitHub issues, or roadmap items.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"issues_and_improvements_log\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "copyright-72ed6e2a48984af3aca5888b96d1f6b6", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-1QuffXMV-py3.11", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.9" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart for model code documentation\n", + "\n", + "Welcome! This notebook demonstrates how to use the ValidMind code explainer to automatically generate comprehensive documentation for your codebase. The code explainer analyzes your source code and provides detailed explanations across various aspects of your implementation.\n", + "\n", + "<a id='toc1__'></a>\n", + "\n", + "## About Code Explainer\n", + "The ValidMind code explainer is a powerful tool that automatically analyzes your source code and generates comprehensive documentation. 
It helps you:\n", + "\n", + "- Understand the structure and organization of your codebase\n", + "- Document dependencies and environment setup\n", + "- Explain data processing and model implementation details\n", + "- Document training, evaluation, and inference pipelines\n", + "- Track configuration, testing, and security measures\n", + "\n", + "This tool is particularly useful for:\n", + "- Onboarding new team members\n", + "- Maintaining up-to-date documentation\n", + "- Ensuring code quality and best practices\n", + "- Facilitating code reviews and audits" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About Code Explainer](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Install the ValidMind Library](#toc3_1__) \n", + " - [Initialize the ValidMind Library](#toc3_2__) \n", + " - [Register sample model](#toc3_2_1__) \n", + " - [Apply documentation template](#toc3_2_2__) \n", + " - [Get your code snippet](#toc3_2_3__) \n", + " - [Preview the documentation template](#toc3_3__) \n", + "- [Common function](#toc4__) \n", + "- [Default Behavior](#toc5__) \n", + "- [Codebase Overview](#toc6__) \n", + "- [Environment and Dependencies ('environment_setup')](#toc7__) \n", + "- [Data Ingestion and Preprocessing](#toc8__) \n", + "- [Model Implementation Details](#toc9__) \n", + "- [Model Training Pipeline](#toc10__) \n", + "- [Evaluation and Validation Code](#toc11__) \n", + "- [Inference and Scoring Logic](#toc12__) \n", + "- [Configuration and Parameters](#toc13__) \n", + "- [Unit and Integration Testing](#toc14__) \n", + "- [Logging and Monitoring Hooks](#toc15__) \n", + "- [Code and Model Versioning](#toc16__) \n", + "- [Security and Access Control](#toc17__) \n", + "- [Example Runs and Scripts](#toc18__) \n", + 
"- [Known Issues and Future Improvements](#toc19__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Model Source Code Documentation`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", + "<br></br>\n", + "Your organization administrators may need to add it to your template library:\n", + "<ul>\n", + "<li><a href=\"model_source_code_documentation_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", + "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", + "</ul>\n", + "</div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Common function\n", + "The cells below do two things:\n", + "1. Read the source code from the `customer_churn_full_suite.py` file\n", + "2. Define an `explain_code` function that uses ValidMind's experimental agents to analyze and explain code." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "source_code=\"\"\n", + "with open(\"customer_churn_full_suite.py\", \"r\") as f:\n", + " source_code = f.read()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `vm.experimental.agents.run_task` function is used to execute AI agent tasks.\n", + "\n", + "It requires:\n", + "- task: The type of task to run (e.g. `code_explainer`)\n", + "- input: A dictionary containing task-specific parameters\n", + " - For `code_explainer`, this includes:\n", + " - **source_code** (str): The code to be analyzed\n", + " - **user_instructions** (str): Instructions for how to analyze the code" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def explain_code(content_id: str, user_instructions: str):\n", + " \"\"\"Run code explanation task and log the results.\n", + " By default, the code explainer includes sections for:\n", + " - Main Purpose and Overall Functionality\n", + " - Breakdown of Key Functions or Components\n", + " - Potential Risks or Failure Points \n", + " - Assumptions or Limitations\n", + " If you want default sections, specify user_instructions as an empty string.\n", + " \n", + " Args:\n", + " user_instructions (str): Instructions for how to analyze the code\n", + " content_id (str): ID to use when logging the results\n", + " \n", + " Returns:\n", + " The result object from running the code explanation task\n", + " \"\"\"\n", + " result = vm.experimental.agents.run_task(\n", + " task=\"code_explainer\",\n", + " input={\n", + " \"source_code\": source_code,\n", + " \"user_instructions\": user_instructions\n", + " }\n", + " )\n", + " result.log(content_id=content_id)\n", + " return result" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='defaultBehavior'></a>\n", + "\n", + "<a id='toc5__'></a>\n", + "\n", + "## Default Behavior" 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "By default, the code explainer includes sections for:\n", + "- Main Purpose and Overall Functionality\n", + "- Breakdown of Key Functions or Components\n", + "- Potential Risks or Failure Points \n", + "- Assumptions or Limitations\n", + "\n", + "If you want default sections, specify `user_instructions` as an empty string. For example:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.experimental.agents.run_task(\n", + " task=\"code_explainer\",\n", + " input={\n", + " \"source_code\": source_code,\n", + " \"user_instructions\": \"\"\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='overview'></a>\n", + "\n", + "<a id='toc6__'></a>\n", + "\n", + "## Codebase Overview\n", + "\n", + "Let's analyze your codebase structure to understand the main modules, components, entry points and their relationships. We'll also examine the technology stack and frameworks that are being utilized in the implementation." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe the overall structure of the source code repository.\n", + " - Identify main modules, folders, and scripts.\n", + " - Highlight entry points for training, inference, and evaluation.\n", + " - State the main programming languages and frameworks used.\n", + " \"\"\",\n", + " content_id=\"code_structure_summary\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\",\n", + " content_id=\"code_structure_summary\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='environment'></a>\n", + "\n", + "<a id='toc7__'></a>\n", + "\n", + "## Environment and Dependencies ('environment_setup')\n", + "Let's document the technical requirements and setup needed to run your code, including Python packages, system dependencies, and environment configuration files. Understanding these requirements is essential for proper development environment setup and consistent deployments across different environments." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - List Python packages and system dependencies (OS, compilers, etc.).\n", + " - Reference environment files (requirements.txt, environment.yml, Dockerfile).\n", + " - Include setup instructions using Conda, virtualenv, or containers.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"setup_instructions\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='data'></a>\n", + "\n", + "<a id='toc8__'></a>\n", + "\n", + "## Data Ingestion and Preprocessing\n", + "Let's document how your code handles data, including data sources, validation procedures, and preprocessing steps. We'll examine the data pipeline architecture, covering everything from initial data loading through feature engineering and quality checks." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Specify data input formats and sources.\n", + " - Document ingestion, validation, and transformation logic.\n", + " - Explain how raw data is preprocessed and features are generated.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections. \"\"\",\n", + " content_id=\"data_handling_notes\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='model'> </a>\n", + "\n", + "<a id='toc9__'></a>\n", + "\n", + "## Model Implementation Details\n", + "Let's document the core implementation details of your model, including its architecture, components, and key algorithms. Understanding the technical implementation is crucial for maintenance, debugging, and future improvements to the codebase. We'll examine how theoretical concepts are translated into working code." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe the core model code structure (classes, functions).\n", + " - Link code to theoretical models or equations when applicable.\n", + " - Note custom components like loss functions or feature selectors.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"model_code_description\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='training'></a>\n", + "\n", + "<a id='toc10__'></a>\n", + "\n", + "## Model Training Pipeline\n", + "\n", + "Let's document the training pipeline implementation, including how models are trained, optimized and evaluated. We'll examine the training process workflow, hyperparameter tuning approach, and model checkpointing mechanisms. This section provides insights into how the model learns from data and achieves optimal performance." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Explain the training process, optimization strategy, and hyperparameters.\n", + " - Describe logging, checkpointing, and early stopping mechanisms.\n", + " - Include references to training config files or tuning logic.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"training_logic_details\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='evaluation'></a>\n", + "\n", + "<a id='toc11__'></a>\n", + "\n", + "## Evaluation and Validation Code\n", + "Let's examine how the model's validation and evaluation code is implemented, including the metrics calculation and validation processes. We'll explore the diagnostic tools and visualization methods used to assess model performance. This section will also cover how validation results are logged and stored for future reference." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe how validation is implemented and metrics are calculated.\n", + " - Include plots and diagnostic tools (e.g., ROC, SHAP, confusion matrix).\n", + " - State how outputs are logged and persisted.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"evaluation_logic_notes\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='inference'></a>\n", + "\n", + "<a id='toc12__'></a>\n", + "\n", + "## Inference and Scoring Logic\n", + "Let's examine how the model performs inference and scoring on new data. This section will cover the implementation details of loading trained models, making predictions, and any required pre/post-processing steps. We'll also look at the APIs and interfaces available for both real-time serving and batch scoring scenarios." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Detail how the trained model is loaded and used for predictions.\n", + " - Explain I/O formats and APIs for serving or batch scoring.\n", + " - Include any preprocessing/postprocessing logic required.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"inference_mechanism\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='config'></a>\n", + "\n", + "<a id='toc13__'></a>\n", + "\n", + "## Configuration and Parameters\n", + "Let's explore how configuration and parameters are managed in the codebase. We'll examine the configuration files, command-line arguments, environment variables, and other mechanisms used to control model behavior. This section will also cover parameter versioning and how different configurations are tracked across model iterations." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe configuration management (files, CLI args, env vars).\n", + " - Highlight default parameters and override mechanisms.\n", + " - Reference versioning practices for config files.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"config_control_notes\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='testing'></a>\n", + "\n", + "<a id='toc14__'></a>\n", + "\n", + "## Unit and Integration Testing\n", + "Let's examine the testing strategy and implementation in the codebase. We'll analyze the unit tests, integration tests, and testing frameworks used to ensure code quality and reliability. This section will also cover test coverage metrics and continuous integration practices." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - List unit and integration tests and what they cover.\n", + " - Mention testing frameworks and coverage tools used.\n", + " - Explain testing strategy for production-readiness.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"test_strategy_overview\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='logging'></a>\n", + "\n", + "<a id='toc15__'></a>\n", + "\n", + "## Logging and Monitoring Hooks\n", + "Let's analyze how logging and monitoring are implemented in the codebase. We'll examine the logging configuration, monitoring hooks, and key metrics being tracked. This section will also cover any real-time observability integrations and alerting mechanisms in place." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe logging configuration and structure.\n", + " - Highlight real-time monitoring or observability integrations.\n", + " - List key events, metrics, or alerts tracked.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"logging_monitoring_notes\"\n", + ")\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='versioning'></a>\n", + "\n", + "<a id='toc16__'></a>\n", + "\n", + "## Code and Model Versioning\n", + "Let's examine how code and model versioning is managed in the codebase. This section will cover version control practices, including Git workflows and model artifact versioning tools like DVC or MLflow. We'll also look at how versioning integrates with the CI/CD pipeline." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe Git usage, branching, tagging, and commit standards.\n", + " - Include model artifact versioning practices (e.g., DVC, MLflow).\n", + " - Reference any automation in CI/CD.\n", + " Please remove the following sections: \n", + " - Potential Risks or Failure Points\n", + " - Assumptions or Limitations\n", + " - Breakdown of Key Functions or Components\n", + " Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"version_tracking_description\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='security'></a>\n", + "\n", + "<a id='toc17__'></a>\n", + "\n", + "## Security and Access Control\n", + "Let's analyze the security and access control measures implemented in the codebase. We'll examine how sensitive data and code are protected through access controls, encryption, and compliance measures. Additionally, we'll review secure deployment practices and any specific handling of PII data." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Document access controls for source code and data.\n", + " - Include any encryption, PII handling, or compliance measures.\n", + " - Mention secure deployment practices.\n", + " Please remove the following sections: \n", + " - Potential Risks or Failure Points\n", + " - Assumptions or Limitations\n", + " - Breakdown of Key Functions or Components\n", + " Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"security_policies_notes\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='examples'></a>\n", + "\n", + "<a id='toc18__'></a>\n", + "\n", + "## Example Runs and Scripts\n", + "Let's explore example runs and scripts that demonstrate how to use this codebase in practice. We'll look at working examples, command-line usage, and sample notebooks that showcase the core functionality. This section will also point to demo datasets and test scenarios that can help new users get started quickly." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Provide working script examples.\n", + " - Include CLI usage instructions or sample notebooks.\n", + " - Link to demo datasets or test scenarios.\n", + " Please remove the following sections: \n", + " - Potential Risks or Failure Points\n", + " - Assumptions or Limitations\n", + " - Breakdown of Key Functions or Components\n", + " Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"runnable_examples\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='issues'></a>\n", + "\n", + "<a id='toc19__'></a>\n", + "\n", + "## Known Issues and Future Improvements\n", + "Let's examine the current limitations and areas for improvement in the codebase. This section will document known technical debt, bugs, and feature gaps that need to be addressed. We'll also outline proposed enhancements and reference any existing tickets or GitHub issues tracking these improvements." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - List current limitations or technical debt.\n", + " - Outline proposed enhancements or refactors.\n", + " - Reference relevant tickets, GitHub issues, or roadmap items.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"issues_and_improvements_log\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-72ed6e2a48984af3aca5888b96d1f6b6" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-1QuffXMV-py3.11", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 4 } diff --git a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb index 8ef764021..6584ceb06 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb @@ -1,391 +1,397 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document an application scorecard model\n", - "\n", - "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", - "\n", - "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various 
characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", - "\n", - "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. \n", - "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", - "\n", - "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document an application scorecard model\n", + "\n", + "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", + "\n", + "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", + "\n", + "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. 
\n", + "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", + "\n", + "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + "- [Document the model](#toc3__) \n", + "- [Next steps](#toc4__) \n", + " - [Work with your model documentation](#toc4_1__) \n", + " - [Discover more learning resources](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records, usually used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets, usually used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test to supply extra information, customize the test's behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **[template]{.smallcaps}**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host = \"...\",\n", + " # api_key = \"...\",\n", + " # api_secret = \"...\",\n", + " # model = \"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Document the model" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.credit_risk import lending_club\n", + "from validmind.utils import preview_test_config\n", + "\n", + "scorecard = lending_club.load_scorecard()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "lending_club.init_vm_objects(scorecard)" + ], + "execution_count": 4, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_config = lending_club.load_test_config(scorecard)\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.run_documentation_tests(config=test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc4_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. 
From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the following sections and take a look around:\n", + "\n", + " - **2. Data Preparation**\n", + " - **3. Model Development**\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc4_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade command for the changes to take effect." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-382e83e3fe1d4928ae90c3917480d27d" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - "- [Document the model](#toc3__) \n", - "- [Next steps](#toc4__) \n", - " - [Work with your model documentation](#toc4_1__) \n", - " - [Discover more learning resources](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. 
Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **[template]{.smallcaps}**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host = \"...\",\n", - " # api_key = \"...\",\n", - " # api_secret = \"...\",\n", - " # model = \"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Document the model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.credit_risk import lending_club\n", - "from validmind.utils import preview_test_config\n", - "\n", - "scorecard = lending_club.load_scorecard()" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "lending_club.init_vm_objects(scorecard)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_config = lending_club.load_test_config(scorecard)\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc4_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. 
From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the following sections and take a look around:\n", - "\n", - " - **2. Data Preparation**\n", - " - **3. Model Development**\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc4_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-382e83e3fe1d4928ae90c3917480d27d", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb index 73e1726f6..bd4ade621 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb @@ -1,916 +1,922 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document an application scorecard model\n", - "\n", - "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", - "\n", - "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", - "\n", - "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. 
\n", - "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", - "\n", - "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Load the sample dataset](#toc3__) \n", - " - [Prepocess the dataset](#toc3_1__) \n", - " - [Feature engineering](#toc3_2__) \n", - "- [Train the model](#toc4__) \n", - " - [Compute probabilities](#toc4_1__) \n", - " - [Compute binary predictions](#toc4_2__) \n", - "- [Document the model](#toc5__) \n", - " - [Initialize the ValidMind datasets](#toc5_1__) \n", - " - [Initialize ValidMind models](#toc5_2__) \n", - " - [Assign prediction values and probabilities to the datasets](#toc5_3__) \n", - " - [Compute credit risk scores](#toc5_4__) \n", - " - [Adding custom context to the LLM descriptions](#toc5_5__) \n", - " - [Run the full suite of tests](#toc5_6__) \n", - "- [Next steps](#toc6__) \n", - " - [Work with your documentation](#toc6_1__) \n", - " - [Discover more learning resources](#toc6_2__) \n", - "- [Upgrade 
ValidMind](#toc7__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. 
Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host = \"...\",\n", - " # api_key = \"...\",\n", - " # api_secret = \"...\",\n", - " # model = \"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "from sklearn.ensemble import RandomForestClassifier\n", - "\n", - "from validmind.datasets.credit_risk import lending_club\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. To be able to use it, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df = lending_club.load_data(source=\"offline\")\n", - "\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Prepocess the dataset\n", - "\n", - "In the preprocessing step we perform a number of operations to get ready for building our application scorecard. \n", - "\n", - "We use the `lending_club.preprocess` to simplify preprocessing. 
This function performs the following operations: \n", - "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", - "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", - "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", - "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = lending_club.preprocess(df)\n", - "preprocess_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Feature engineering\n", - "\n", - "In the feature engineering phase, we apply specific transformations to optimize the dataset for predictive modeling in our application scorecard. \n", - "\n", - "Using the `ending_club.feature_engineering()` function, we conduct the following operations:\n", - "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It calculates the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", - "- **Integration of WoE bins**: Ensures that the WoE transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation. This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fe_df = lending_club.feature_engineering(preprocess_df)\n", - "fe_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Train the model\n", - "\n", - "In this section, we focus on constructing and refining our predictive model. \n", - "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", - "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the data\n", - "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", - "\n", - "x_train = train_df.drop(lending_club.target_column, axis=1)\n", - "y_train = train_df[lending_club.target_column]\n", - "\n", - "x_test = test_df.drop(lending_club.target_column, axis=1)\n", - "y_test = test_df[lending_club.target_column]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define the XGBoost model\n", - "xgb_model = xgb.XGBClassifier(\n", - " n_estimators=50, \n", - " random_state=42, \n", - " early_stopping_rounds=10\n", - ")\n", - "xgb_model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "\n", - "# Fit the model\n", - "xgb_model.fit(\n", - " x_train, \n", - " y_train,\n", - " eval_set=[(x_test, y_test)],\n", - " verbose=False\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define the Random Forest model\n", - "rf_model = RandomForestClassifier(\n", - " n_estimators=50, \n", - " random_state=42,\n", - ")\n", - "\n", - "# Fit the model\n", - 
"rf_model.fit(x_train, y_train)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Compute probabilities" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", - "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", - "\n", - "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", - "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Compute binary predictions" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cut_off_threshold = 0.3\n", - "\n", - "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", - "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", - "\n", - "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", - "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "To document the model with the ValidMind Library, you'll need to:\n", - "1. Preprocess the raw dataset\n", - "2. Initialize some training and test datasets\n", - "3. Initialize a model object you can use for testing\n", - "4. 
Run the full suite of tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset`: The dataset that you want to provide as input to tests.\n", - "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "\n", - "With all datasets ready, you can now initialize the raw, processed, training and test datasets (`raw_df`, `preprocessed_df`, `fe_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_preprocess_dataset = vm.init_dataset(\n", - " dataset=preprocess_df,\n", - " input_id=\"preprocess_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_fe_dataset = vm.init_dataset(\n", - " dataset=fe_df,\n", - " input_id=\"fe_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " 
input_id=\"test_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Initialize ValidMind models\n", - "\n", - "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other functions for analysis and tests on the data for our modelS.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model objects with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_xgb_model = vm.init_model(\n", - " xgb_model,\n", - " input_id=\"xgb_model\",\n", - ")\n", - "\n", - "vm_rf_model = vm.init_model(\n", - " rf_model,\n", - " input_id=\"rf_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_3__'></a>\n", - "\n", - "### Assign prediction values and probabilities to the datasets\n", - "\n", - "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. 
\n", - "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", - "- This method links the model's class prediction values and probabilities to our VM train and test datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# XGBoost\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=train_xgb_binary_predictions,\n", - " prediction_probabilities=train_xgb_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=test_xgb_binary_predictions,\n", - " prediction_probabilities=test_xgb_prob,\n", - ")\n", - "\n", - "# Random Forest\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=train_rf_binary_predictions,\n", - " prediction_probabilities=train_rf_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=test_rf_binary_predictions,\n", - " prediction_probabilities=test_rf_prob,\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_4__'></a>\n", - "\n", - "### Compute credit risk scores\n", - "\n", - "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", - "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", - "\n", - "# Assign scores to the datasets\n", - "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", - "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_5__'></a>\n", - "\n", - "### Adding custom context to the LLM descriptions\n", - "\n", - "To enable the LLM descriptions context, you need to set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`. This will enable the LLM descriptions context, which will be used to provide additional context to the LLM descriptions. This is a global setting that will affect all tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", - "\n", - "context = \"\"\"\n", - "FORMAT FOR THE LLM DESCRIPTIONS: \n", - " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", - " extracted from the test description>.\n", - "\n", - " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", - " Include any relevant formulas or methodologies mentioned in the test description.>\n", - "\n", - " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", - " highlighting what makes it particularly useful for specific scenarios.>\n", - "\n", - " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", - " Include both technical limitations and interpretation challenges. 
\n", - " If the test description includes specific signs of high risk, incorporate these here.>\n", - "\n", - " **Key Insights:**\n", - "\n", - " The test results reveal:\n", - "\n", - " - **<insight title>**: <comprehensive description of one aspect of the results>\n", - " - **<insight title>**: <comprehensive description of another aspect>\n", - " ...\n", - "\n", - " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", - " purpose and provides any final recommendations or considerations.>\n", - "\n", - "ADDITIONAL INSTRUCTIONS:\n", - " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", - "\n", - " For each metric in the test results, include in the test overview:\n", - " - The metric's purpose and what it measures\n", - " - Its mathematical formula\n", - " - The range of possible values\n", - " - What constitutes good/bad performance\n", - " - How to interpret different values\n", - "\n", - " Each insight should progressively cover:\n", - " 1. Overall scope and distribution\n", - " 2. Complete breakdown of all elements with specific values\n", - " 3. Natural groupings and patterns\n", - " 4. Comparative analysis between datasets/categories\n", - " 5. Stability and variations\n", - " 6. 
Notable relationships or dependencies\n", - "\n", - " Remember:\n", - " - Keep all insights at the same level (no sub-bullets or nested structures)\n", - " - Make each insight complete and self-contained\n", - " - Include specific numerical values and ranges\n", - " - Cover all elements in the results comprehensively\n", - " - Maintain clear, concise language\n", - " - Use only \"- **Title**: Description\" format for insights\n", - " - Progress naturally from general to specific observations\n", - "\n", - "\"\"\".strip()\n", - "\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_6__'></a>\n", - "\n", - "### Run the full suite of tests\n", - "\n", - "This is where it all comes together: you are now ready to run the documentation tests for the model as defined by the documentation template you looked at earlier.\n", - "\n", - "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the documentation and test artifacts that get generated to the ValidMind Platform.\n", - "\n", - "The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. 
The `config` parameter is a dictionary with the following structure:\n", - "\n", - "```python\n", - "config = {\n", - " \"<test-id>\": {\n", - " \"params\": {\n", - " \"param1\": \"value1\",\n", - " \"param2\": \"value2\",\n", - " ...\n", - " },\n", - " \"inputs\": {\n", - " \"input1\": \"value1\",\n", - " \"input2\": \"value2\",\n", - " ...\n", - " }\n", - " },\n", - " ...\n", - "}\n", - "```\n", - "\n", - "Each `<test-id>` above corresponds to the test driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The method `get_demo_test_config()` below constructs the default input configuration for our demo." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = lending_club.get_demo_test_config(x_test, y_test)\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests. The variable `full_suite` then holds the result of these tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc6_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. 
(Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the following sections and take a look around:\n", - "\n", - " - **2. Data Preparation**\n", - " - **3. Model Development**\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc6_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-a658e3f1bece47cabc255c03460e255f", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document an application scorecard model\n", + "\n", + "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", + "\n", + "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", + "\n", + "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. \n", + "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", + "\n", + "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the documentation template](#toc2_4__) \n", + "- [Load the sample dataset](#toc3__) \n", + " - [Preprocess the dataset](#toc3_1__) \n", + " - [Feature engineering](#toc3_2__) \n", + "- [Train the model](#toc4__) \n", + " - [Compute probabilities](#toc4_1__) \n", + " - [Compute binary predictions](#toc4_2__) \n", + "- [Document the model](#toc5__) \n", + " - [Initialize the ValidMind datasets](#toc5_1__) \n", + " - [Initialize ValidMind models](#toc5_2__) \n", + " - [Assign prediction values and probabilities to the datasets](#toc5_3__) \n", + " - [Compute credit risk scores](#toc5_4__) \n", + " - [Adding custom context to the LLM descriptions](#toc5_5__) \n", + " - [Run the full suite of tests](#toc5_6__) \n", + "- [Next steps](#toc6__) \n", + " - [Work with your documentation](#toc6_1__) \n", + " - [Discover more learning resources](#toc6_2__) \n", + "- [Upgrade ValidMind](#toc7__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host = \"...\",\n", + " # api_key = \"...\",\n", + " # api_secret = \"...\",\n", + " # model = \"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "from sklearn.ensemble import RandomForestClassifier\n", + "\n", + "from validmind.datasets.credit_risk import lending_club\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind Library. To be able to use it, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "df = lending_club.load_data(source=\"offline\")\n", + "\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Preprocess the dataset\n", + "\n", + "In the preprocessing step, we perform a number of operations to prepare the data for building our application scorecard. \n", + "\n", + "We use the `lending_club.preprocess()` function to simplify preprocessing. 
This function performs the following operations: \n", + "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", + "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", + "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", + "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = lending_club.preprocess(df)\n", + "preprocess_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Feature engineering\n", + "\n", + "In the feature engineering phase, we apply specific transformations to optimize the dataset for predictive modeling in our application scorecard. \n", + "\n", + "Using the `lending_club.feature_engineering()` function, we conduct the following operations:\n", + "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It calculates the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", + "- **Integration of WoE bins**: Ensures that the WoE transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation. This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "fe_df = lending_club.feature_engineering(preprocess_df)\n", + "fe_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Train the model\n", + "\n", + "In this section, we focus on constructing and refining our predictive model. \n", + "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", + "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the data\n", + "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", + "\n", + "x_train = train_df.drop(lending_club.target_column, axis=1)\n", + "y_train = train_df[lending_club.target_column]\n", + "\n", + "x_test = test_df.drop(lending_club.target_column, axis=1)\n", + "y_test = test_df[lending_club.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define the XGBoost model\n", + "xgb_model = xgb.XGBClassifier(\n", + " n_estimators=50, \n", + " random_state=42, \n", + " early_stopping_rounds=10\n", + ")\n", + "xgb_model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "\n", + "# Fit the model\n", + "xgb_model.fit(\n", + " x_train, \n", + " y_train,\n", + " eval_set=[(x_test, y_test)],\n", + " verbose=False\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define the Random Forest model\n", + "rf_model = RandomForestClassifier(\n", + " n_estimators=50, \n", + " random_state=42,\n", + ")\n", + "\n", + "# Fit the model\n", + "rf_model.fit(x_train, y_train)" + ], + "execution_count": 
null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Compute probabilities" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", + "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", + "\n", + "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", + "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Compute binary predictions" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "cut_off_threshold = 0.3\n", + "\n", + "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", + "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", + "\n", + "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", + "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "To document the model with the ValidMind Library, you'll need to:\n", + "1. Preprocess the raw dataset\n", + "2. Initialize some training and test datasets\n", + "3. Initialize a model object you can use for testing\n", + "4. 
Run the full suite of tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset`: The dataset that you want to provide as input to tests.\n", + "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "\n", + "With all datasets ready, you can now initialize the raw, preprocessed, feature-engineered, training, and test datasets (`df`, `preprocess_df`, `fe_df`, `train_df`, and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_preprocess_dataset = vm.init_dataset(\n", + " dataset=preprocess_df,\n", + " input_id=\"preprocess_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_fe_dataset = vm.init_dataset(\n", + " dataset=fe_df,\n", + " input_id=\"fe_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " 
target_column=lending_club.target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Initialize ValidMind models\n", + "\n", + "You'll also need to initialize ValidMind model objects (`vm_xgb_model` and `vm_rf_model`) that can be passed to other functions for analysis and tests on the data for our models.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model objects with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_xgb_model = vm.init_model(\n", + " xgb_model,\n", + " input_id=\"xgb_model\",\n", + ")\n", + "\n", + "vm_rf_model = vm.init_model(\n", + " rf_model,\n", + " input_id=\"rf_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_3__'></a>\n", + "\n", + "### Assign prediction values and probabilities to the datasets\n", + "\n", + "With our models now trained, we'll move on to assigning both the predicted probabilities coming directly from each model and the binary predictions obtained after applying the cutoff threshold described in the previous steps. 
\n", + "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", + "- This method links the model's class prediction values and probabilities to our VM train and test datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# XGBoost\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=train_xgb_binary_predictions,\n", + " prediction_probabilities=train_xgb_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=test_xgb_binary_predictions,\n", + " prediction_probabilities=test_xgb_prob,\n", + ")\n", + "\n", + "# Random Forest\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=train_rf_binary_predictions,\n", + " prediction_probabilities=train_rf_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=test_rf_binary_predictions,\n", + " prediction_probabilities=test_rf_prob,\n", + ")\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_4__'></a>\n", + "\n", + "### Compute credit risk scores\n", + "\n", + "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", + "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", + "\n", + "# Assign scores to the datasets\n", + "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", + "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_5__'></a>\n", + "\n", + "### Adding custom context to the LLM descriptions\n", + "\n", + "To add custom context to the LLM-generated test descriptions, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1` and store the context text in the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT` environment variable. This is a global setting that affects all tests." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", + "\n", + "context = \"\"\"\n", + "FORMAT FOR THE LLM DESCRIPTIONS: \n", + " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", + " extracted from the test description>.\n", + "\n", + " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", + " Include any relevant formulas or methodologies mentioned in the test description.>\n", + "\n", + " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", + " highlighting what makes it particularly useful for specific scenarios.>\n", + "\n", + " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", + " Include both technical limitations and interpretation challenges. 
\n", + " If the test description includes specific signs of high risk, incorporate these here.>\n", + "\n", + " **Key Insights:**\n", + "\n", + " The test results reveal:\n", + "\n", + " - **<insight title>**: <comprehensive description of one aspect of the results>\n", + " - **<insight title>**: <comprehensive description of another aspect>\n", + " ...\n", + "\n", + " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", + " purpose and provides any final recommendations or considerations.>\n", + "\n", + "ADDITIONAL INSTRUCTIONS:\n", + " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", + "\n", + " For each metric in the test results, include in the test overview:\n", + " - The metric's purpose and what it measures\n", + " - Its mathematical formula\n", + " - The range of possible values\n", + " - What constitutes good/bad performance\n", + " - How to interpret different values\n", + "\n", + " Each insight should progressively cover:\n", + " 1. Overall scope and distribution\n", + " 2. Complete breakdown of all elements with specific values\n", + " 3. Natural groupings and patterns\n", + " 4. Comparative analysis between datasets/categories\n", + " 5. Stability and variations\n", + " 6. 
Notable relationships or dependencies\n", + "\n", + " Remember:\n", + " - Keep all insights at the same level (no sub-bullets or nested structures)\n", + " - Make each insight complete and self-contained\n", + " - Include specific numerical values and ranges\n", + " - Cover all elements in the results comprehensively\n", + " - Maintain clear, concise language\n", + " - Use only \"- **Title**: Description\" format for insights\n", + " - Progress naturally from general to specific observations\n", + "\n", + "\"\"\".strip()\n", + "\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_6__'></a>\n", + "\n", + "### Run the full suite of tests\n", + "\n", + "This is where it all comes together: you are now ready to run the documentation tests for the model as defined by the documentation template you looked at earlier.\n", + "\n", + "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the documentation and test artifacts that get generated to the ValidMind Platform.\n", + "\n", + "The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. 
The `config` parameter is a dictionary with the following structure:\n", + "\n", + "```python\n", + "config = {\n", + " \"<test-id>\": {\n", + " \"params\": {\n", + " \"param1\": \"value1\",\n", + " \"param2\": \"value2\",\n", + " ...\n", + " },\n", + " \"inputs\": {\n", + " \"input1\": \"value1\",\n", + " \"input2\": \"value2\",\n", + " ...\n", + " }\n", + " },\n", + " ...\n", + "}\n", + "```\n", + "\n", + "Each `<test-id>` above corresponds to one of the test-driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The method `get_demo_test_config()` below constructs the default input configuration for our demo." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = lending_club.get_demo_test_config(x_test, y_test)\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests. The variable `full_suite` then holds the results of these tests." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.run_documentation_tests(config=test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way: use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc6_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. 
(Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the following sections and take a look around:\n", + "\n", + " - **2. Data Preparation**\n", + " - **3. Model Development**\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc6_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-a658e3f1bece47cabc255c03460e255f" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb index 4824a1144..75b2d030a 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb @@ -1,1561 +1,1567 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document a credit risk model\n", - "\n", - "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", - "\n", - "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", - "\n", - "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. 
\n", - "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", - "\n", - "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Load the sample dataset](#toc3__) \n", - " - [Prepocess the dataset](#toc3_1__) \n", - "- [Train the model](#toc4__) \n", - " - [Compute probabilities](#toc4_1__) \n", - " - [Compute binary predictions](#toc4_2__) \n", - "- [Postprocess the dataset](#toc5__) \n", - "- [Document the model](#toc6__) \n", - " - [Initialize the ValidMind datasets](#toc6_1__) \n", - " - [Initialize the ValidMind model](#toc6_2__) \n", - " - [Assign predictions](#toc6_3__) \n", - " - [Run tests](#toc6_4__) \n", - " - [Data description](#toc6_4_1__) \n", - " - [Data quality](#toc6_4_2__) \n", - " - [Correlations](#toc6_4_3__) \n", - " - [Model training](#toc6_4_4__) \n", - " - [Model validation](#toc6_4_5__) \n", - " - [Model explainability](#toc6_4_6__) \n", - " - [Bias and fairness](#toc6_4_7__) \n", - "- [Next steps](#toc7__) \n", - " - 
[Work with your documentation](#toc7_1__) \n", - " - [Discover more learning resources](#toc7_2__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip -q install aequitas fairlearn vl-convert-python" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "from sklearn.ensemble import RandomForestClassifier\n", - "from sklearn.preprocessing import OneHotEncoder, StandardScaler\n", - "from sklearn.pipeline import Pipeline\n", - "from sklearn.impute import SimpleImputer\n", - "from sklearn.compose import ColumnTransformer\n", - "from sklearn.compose import make_column_selector as selector\n", - "\n", - "from validmind.tests import run_test\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is 
selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. To be able to use it, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.credit_risk import lending_club_bias as demo_dataset\n", - "\n", - "df = demo_dataset.load_data()\n", - "\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Prepocess the dataset\n", - "\n", - "In the preprocessing step we perform a number of operations to get ready for building our credit decision model. \n", - "\n", - "We will in this example, create new feature, fill missing values and encode categorical variables." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = demo_dataset.preprocess(df)\n", - "preprocess_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Train the model\n", - "\n", - "In this section, we focus on constructing and refining our predictive model. \n", - "- We begin by dividing our data into training and testing sets (`train_df`, `test_df`). \n", - "- We employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the data into training and testing sets\n", - "train_df, test_df = demo_dataset.split(preprocess_df)\n", - "\n", - "X_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "X_test = test_df.drop(demo_dataset.target_column, axis=1)\n", - "y_test = test_df[demo_dataset.target_column]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Train a Random Forest Classifier\n", - "model = RandomForestClassifier(n_estimators=50, random_state=42)\n", - "model.fit(X_train, y_train)\n", - "\n", - "# Print feature importances\n", - "feature_importances = pd.DataFrame({\n", - " 'feature': X_train.columns,\n", - " 'importance': model.feature_importances_\n", - "}).sort_values('importance', ascending=False)\n", - "\n", - "print(\"Feature Importances:\")\n", - "print(feature_importances)\n", - "\n", - "# Print model parameters\n", - "print(\"\\nModel Parameters:\")\n", - "print(model.get_params())\n", - "\n", - "# Print basic model information\n", - "print(f\"\\nNumber of trees: {model.n_estimators}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### 
Compute probabilities" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_probabilities = model.predict_proba(X_train)[:,1]\n", - "test_probabilities = model.predict_proba(X_test)[:,1]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Compute binary predictions" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cut_off_threshold = 0.5\n", - "train_binary_predictions = (train_probabilities > cut_off_threshold).astype(int)\n", - "test_binary_predictions = (test_probabilities > cut_off_threshold).astype(int)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Postprocess the dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Save the original labels for the protected classes for visualizations and investigation of biased outcomes\n", - "protected_classes_df = df[demo_dataset.protected_classes]\n", - "\n", - "train_df = train_df.merge(\n", - " protected_classes_df,\n", - " left_index=True,\n", - " right_index=True,\n", - ")\n", - "\n", - "test_df = test_df.merge(\n", - " protected_classes_df,\n", - " left_index=True,\n", - " right_index=True,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "To document the model with the ValidMind Library, you'll need to:\n", - "1. Preprocess the raw dataset\n", - "2. Initialize some training and test datasets\n", - "3. Initialize a ValidMind model object for use with testing\n", - "4. 
Run the full suite of tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset`: The dataset that you want to provide as input to tests.\n", - "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "\n", - "With all datasets ready, you can now initialize the raw, training and test datasets created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Extract feature columns\n", - "feature_columns = train_df.drop(\n", - " columns=[demo_dataset.target_column] + demo_dataset.protected_classes\n", - ").columns.tolist()\n", - "feature_columns" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_ds= vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " 
feature_columns=feature_columns\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "You will also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model = vm.init_model(\n", - " model,\n", - " input_id=\"random_forest_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. \n", - "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", - "- This method links the model's class prediction values and probabilities to our VM train and test datasets." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - " prediction_values=train_binary_predictions,\n", - " prediction_probabilities=train_probabilities,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model,\n", - " prediction_values=test_binary_predictions,\n", - " prediction_probabilities=test_probabilities,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4__'></a>\n", - "\n", - "### Run tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_1__'></a>\n", - "\n", - "#### Data description" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.DatasetDescription\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TabularNumericalHistograms\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\"\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TargetRateBarPlots\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\"\n", - " },\n", - " params={\n", - " \"default_column\": demo_dataset.target_column,\n", - " \"columns\": None,\n", - " 
},\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_2__'></a>\n", - "\n", - "#### Data quality" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.ClassImbalance\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 10\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.Duplicates\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"min_threshold\": 1\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.HighCardinality\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"num_threshold\": 100,\n", - " \"percent_threshold\": 0.1,\n", - " \"threshold_type\": \"percent\"\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.MissingValues\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"min_percentage_threshold\": 1,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.Skewness\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"max_threshold\": 1,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - 
"cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.UniqueRows\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 1,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TooManyZeroValues\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"max_percent_threshold\": 0.03,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.IQROutliersTable\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"threshold\": 1.5,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.IQROutliersBarPlot\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"threshold\": 1.5,\n", - " \"fig_width\": 800,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_3__'></a>\n", - "\n", - "#### Correlations" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.PearsonCorrelationMatrix\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " 
\"validmind.data_validation.HighPearsonCorrelation\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"max_threshold\": 0.3\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_4__'></a>\n", - "\n", - "#### Model training" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.ModelMetadata\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.DatasetSplit\",\n", - " inputs={\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_5__'></a>\n", - "\n", - "#### Model validation" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - " params={\n", - " \"num_bins\": 10,\n", - " \"mode\": \"fixed\"\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - 
"test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:in_sample\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"train_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.PrecisionRecallCurve\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ROCCurve\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.TrainingTestDegradation\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - " params={\n", - " \"metrics\": [\"accuracy\", \"precision\", \"recall\", \"f1\"],\n", - " \"max_threshold\": 0.1\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " 
\"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.7\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumF1Score\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.7\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumROCAUCScore\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.5\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.statsmodels.GINITable\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_model],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_model],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.statsmodels.CumulativePredictionProbabilities\",\n", - 
" input_grid={\n", - " \"model\": [vm_model],\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_6__'></a>\n", - "\n", - "#### Model explainability" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - " params={\n", - " \"fontsize\": None,\n", - " \"figure_height\": 1000\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - "\"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", - "inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"train_dataset\",\n", - " },\n", - " params={\n", - " \"kernel_explainer_samples\": 10,\n", - " \"tree_or_linear_explainer_samples\": 200\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - " params={\n", - " \"features_columns\": None,\n", - " \"thresholds\": {\n", - " \"accuracy\": 0.75,\n", - " \"precision\": 0.5,\n", - " \"recall\": 0.5,\n", - " \"f1\": 0.7\n", - " }\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", - " inputs={\n", - " \"model\": 
\"random_forest_model\",\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - " params={\n", - " \"metric\": None,\n", - " \"cut_off_threshold\": 0.04\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - " params={\n", - " \"metric\": None,\n", - " \"scaling_factor_std_dev_list\": [0.1, 0.2, 0.3, 0.4, 0.5],\n", - " \"performance_decay_threshold\": 0.05\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_7__'></a>\n", - "\n", - "#### Bias and fairness" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = run_test(\n", - " \"validmind.data_validation.ProtectedClassesDescription\",\n", - " inputs={\n", - " \"dataset\": \"test_dataset\"\n", - " },\n", - " params={\n", - " 'protected_classes': demo_dataset.protected_classes\n", - " })\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we are going to focus our analysis on the fairness metric(s) of interest in this case study: FNR/FPR across different groups. The `aequitas` plot module exposes the `disparities_metrics()` plot, which displays both the disparities and the group-wise metric results side by side." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = run_test(\n", - " \"validmind.data_validation.ProtectedClassesDisparity\",\n", - " inputs={\n", - " \"dataset\": \"test_dataset\",\n", - " \"model\": \"random_forest_model\"\n", - " },\n", - " params={\n", - " \"protected_classes\": demo_dataset.protected_classes,\n", - " \"disparity_tolerance\": 1.25,\n", - " \"metrics\": [\"fnr\", \"fpr\", \"tpr\"]\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.ProtectedClassesCombination\",\n", - " inputs={\n", - " \"dataset\": \"test_dataset\",\n", - " \"model\": \"random_forest_model\"\n", - " },\n", - " params={\n", - " \"protected_classes\": demo_dataset.protected_classes\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The following code defines a preprocessing `Pipeline` that handles both numeric and categorical features. Numeric data is imputed and scaled, while categorical data is imputed with the most frequent value and one-hot encoded. The pipelines are then combined using a `ColumnTransformer` and integrated with a classifier." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define a pipeline for numeric features\n", - "numeric_transformer = Pipeline(\n", - " steps=[\n", - " (\"impute\", SimpleImputer()), # Impute missing values\n", - " (\"scaler\", StandardScaler()), # Scale numeric features\n", - " ]\n", - ")\n", - "\n", - "# Define a pipeline for categorical features\n", - "categorical_transformer = Pipeline(\n", - " [\n", - " (\"impute\", SimpleImputer(strategy=\"most_frequent\")), # Impute missing values with most frequent\n", - " (\"ohe\", OneHotEncoder(handle_unknown=\"ignore\")), # One-hot encode categorical features\n", - " ]\n", - ")\n", - "\n", - "# Combine numeric and categorical pipelines\n", - "preprocessor = ColumnTransformer(\n", - " transformers=[\n", - " (\"num\", numeric_transformer, selector(dtype_exclude=\"category\")), # Apply numeric transformer to non-categorical columns\n", - " (\"cat\", categorical_transformer, selector(dtype_include=\"category\")), # Apply categorical transformer to categorical columns\n", - " ]\n", - ")\n", - "\n", - "# Create the full pipeline including preprocessing and classification\n", - "pipeline = Pipeline(\n", - " steps=[\n", - " (\"preprocessor\", preprocessor), # Apply the preprocessor\n", - " (\n", - " \"classifier\",\n", - " model, # Use the previously defined model for classification\n", - " ),\n", - " ]\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "sensitive_features = ['Gender_encoded','Race_encoded','Marital_Status_encoded']\n", - "\n", - "run_test(\n", - " \"validmind.data_validation.ProtectedClassesThresholdOptimizer\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds\n", - " },\n", - " params={\n", - " \"pipeline\":pipeline,\n", - " \"protected_classes\": sensitive_features,\n", - " \"X_train\":X_train,\n", - " \"y_train\":y_train,\n", - " },\n", - ").log()" - ] - }, - { - 
"cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the following sections and take a look around:\n", - "\n", - " - **2. Data Preparation**\n", - " - **3. Model Development**\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc7_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade 
package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-6a74bc76beda4633a0cfff2eaa20949e", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document a credit risk model\n", + "\n", + "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", + "\n", + "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", + "\n", + "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. 
\n", + "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", + "\n", + "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the documentation template](#toc2_4__) \n", + "- [Load the sample dataset](#toc3__) \n", + " - [Prepocess the dataset](#toc3_1__) \n", + "- [Train the model](#toc4__) \n", + " - [Compute probabilities](#toc4_1__) \n", + " - [Compute binary predictions](#toc4_2__) \n", + "- [Postprocess the dataset](#toc5__) \n", + "- [Document the model](#toc6__) \n", + " - [Initialize the ValidMind datasets](#toc6_1__) \n", + " - [Initialize the ValidMind model](#toc6_2__) \n", + " - [Assign predictions](#toc6_3__) \n", + " - [Run tests](#toc6_4__) \n", + " - [Data description](#toc6_4_1__) \n", + " - [Data quality](#toc6_4_2__) \n", + " - [Correlations](#toc6_4_3__) \n", + " - [Model training](#toc6_4_4__) \n", + " - [Model validation](#toc6_4_5__) \n", + " - [Model explainability](#toc6_4_6__) \n", + " - [Bias and fairness](#toc6_4_7__) \n", + "- [Next steps](#toc7__) \n", + " - 
[Work with your documentation](#toc7_1__) \n", + " - [Discover more learning resources](#toc7_2__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip -q install aequitas fairlearn vl-convert-python" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "from sklearn.ensemble import RandomForestClassifier\n", + "from sklearn.preprocessing import OneHotEncoder, StandardScaler\n", + "from sklearn.pipeline import Pipeline\n", + "from sklearn.impute import SimpleImputer\n", + "from sklearn.compose import ColumnTransformer\n", + "from sklearn.compose import make_column_selector as selector\n", + "\n", + "from validmind.tests import run_test\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is 
selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind Library. To be able to use it, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.credit_risk import lending_club_bias as demo_dataset\n", + "\n", + "df = demo_dataset.load_data()\n", + "\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Preprocess the dataset\n", + "\n", + "In the preprocessing step, we perform a number of operations to get ready for building our credit decision model.\n", + "\n", + "In this example, we create new features, fill missing values, and encode categorical variables." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = demo_dataset.preprocess(df)\n", + "preprocess_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Train the model\n", + "\n", + "In this section, we focus on constructing and refining our predictive model. \n", + "- We begin by dividing our data into training and testing sets (`train_df`, `test_df`). \n", + "- We employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the data into training and testing sets\n", + "train_df, test_df = demo_dataset.split(preprocess_df)\n", + "\n", + "X_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "X_test = test_df.drop(demo_dataset.target_column, axis=1)\n", + "y_test = test_df[demo_dataset.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Train a Random Forest Classifier\n", + "model = RandomForestClassifier(n_estimators=50, random_state=42)\n", + "model.fit(X_train, y_train)\n", + "\n", + "# Print feature importances\n", + "feature_importances = pd.DataFrame({\n", + " 'feature': X_train.columns,\n", + " 'importance': model.feature_importances_\n", + "}).sort_values('importance', ascending=False)\n", + "\n", + "print(\"Feature Importances:\")\n", + "print(feature_importances)\n", + "\n", + "# Print model parameters\n", + "print(\"\\nModel Parameters:\")\n", + "print(model.get_params())\n", + "\n", + "# Print basic model information\n", + "print(f\"\\nNumber of trees: {model.n_estimators}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### 
Compute probabilities" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_probabilities = model.predict_proba(X_train)[:,1]\n", + "test_probabilities = model.predict_proba(X_test)[:,1]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Compute binary predictions" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "cut_off_threshold = 0.5\n", + "train_binary_predictions = (train_probabilities > cut_off_threshold).astype(int)\n", + "test_binary_predictions = (test_probabilities > cut_off_threshold).astype(int)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Postprocess the dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Save the original labels for the protected classes for visualizations and investigation of biased outcomes\n", + "protected_classes_df = df[demo_dataset.protected_classes]\n", + "\n", + "train_df = train_df.merge(\n", + " protected_classes_df,\n", + " left_index=True,\n", + " right_index=True,\n", + ")\n", + "\n", + "test_df = test_df.merge(\n", + " protected_classes_df,\n", + " left_index=True,\n", + " right_index=True,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "To document the model with the ValidMind Library, you'll need to:\n", + "1. Preprocess the raw dataset\n", + "2. Initialize some training and test datasets\n", + "3. Initialize a ValidMind model object for use with testing\n", + "4. 
Run the full suite of tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset`: The dataset that you want to provide as input to tests.\n", + "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "\n", + "With all datasets ready, you can now initialize the raw, training, and test datasets created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Extract feature columns\n", + "feature_columns = train_df.drop(\n", + " columns=[demo_dataset.target_column] + demo_dataset.protected_classes\n", + ").columns.tolist()\n", + "feature_columns" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_ds = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns\n", + ")" + ], + 
"execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "You will also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model = vm.init_model(\n", + " model,\n", + " input_id=\"random_forest_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. \n", + "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", + "- This method links the model's class prediction values and probabilities to our VM train and test datasets." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + " prediction_values=train_binary_predictions,\n", + " prediction_probabilities=train_probabilities,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model,\n", + " prediction_values=test_binary_predictions,\n", + " prediction_probabilities=test_probabilities,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4__'></a>\n", + "\n", + "### Run tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_1__'></a>\n", + "\n", + "#### Data description" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.DatasetDescription\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TabularNumericalHistograms\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\"\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TargetRateBarPlots\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\"\n", + " },\n", + " params={\n", + " \"default_column\": demo_dataset.target_column,\n", + " \"columns\": None,\n", + " },\n", + ")\n", + "test.log()" + ], + 
"execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_2__'></a>\n", + "\n", + "#### Data quality" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.ClassImbalance\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 10\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.Duplicates\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"min_threshold\": 1\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.HighCardinality\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"num_threshold\": 100,\n", + " \"percent_threshold\": 0.1,\n", + " \"threshold_type\": \"percent\"\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.MissingValues\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"min_percentage_threshold\": 1,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.Skewness\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"max_threshold\": 1,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { 
+ "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.UniqueRows\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 1,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TooManyZeroValues\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"max_percent_threshold\": 0.03,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.IQROutliersTable\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"threshold\": 1.5,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.IQROutliersBarPlot\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"threshold\": 1.5,\n", + " \"fig_width\": 800,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_3__'></a>\n", + "\n", + "#### Correlations" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.PearsonCorrelationMatrix\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " 
\"validmind.data_validation.HighPearsonCorrelation\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"max_threshold\": 0.3\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_4__'></a>\n", + "\n", + "#### Model training" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.ModelMetadata\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.DatasetSplit\",\n", + " inputs={\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_5__'></a>\n", + "\n", + "#### Model validation" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + " params={\n", + " \"num_bins\": 10,\n", + " \"mode\": \"fixed\"\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + 
"test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:in_sample\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"train_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.PrecisionRecallCurve\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ROCCurve\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.TrainingTestDegradation\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + " params={\n", + " \"metrics\": [\"accuracy\", \"precision\", \"recall\", \"f1\"],\n", + " \"max_threshold\": 0.1\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " 
\"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.7\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumF1Score\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.7\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumROCAUCScore\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.5\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.statsmodels.GINITable\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_model],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_model],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.statsmodels.CumulativePredictionProbabilities\",\n", + 
" input_grid={\n", + " \"model\": [vm_model],\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_6__'></a>\n", + "\n", + "#### Model explainability" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + " params={\n", + " \"fontsize\": None,\n", + " \"figure_height\": 1000\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + "\"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", + "inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"train_dataset\",\n", + " },\n", + " params={\n", + " \"kernel_explainer_samples\": 10,\n", + " \"tree_or_linear_explainer_samples\": 200\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + " params={\n", + " \"features_columns\": None,\n", + " \"thresholds\": {\n", + " \"accuracy\": 0.75,\n", + " \"precision\": 0.5,\n", + " \"recall\": 0.5,\n", + " \"f1\": 0.7\n", + " }\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", + " inputs={\n", + " \"model\": 
\"random_forest_model\",\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + " params={\n", + " \"metric\": None,\n", + " \"cut_off_threshold\": 0.04\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + " params={\n", + " \"metric\": None,\n", + " \"scaling_factor_std_dev_list\": [0.1, 0.2, 0.3, 0.4, 0.5],\n", + " \"performance_decay_threshold\": 0.05\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_7__'></a>\n", + "\n", + "#### Bias and fairness" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = run_test(\n", + " \"validmind.data_validation.ProtectedClassesDescription\",\n", + " inputs={\n", + " \"dataset\": \"test_dataset\"\n", + " },\n", + " params={\n", + " 'protected_classes': demo_dataset.protected_classes\n", + " })\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we are going to focus our analysis on the fairness metric(s) of interest in this case study: FNR/FPR across different groups. The `aequitas` plot module exposes the `disparities_metrics()` plot, which displays both the disparities and the group-wise metric results side by side." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = run_test(\n", + " \"validmind.data_validation.ProtectedClassesDisparity\",\n", + " inputs={\n", + " \"dataset\": \"test_dataset\",\n", + " \"model\": \"random_forest_model\"\n", + " },\n", + " params={\n", + " \"protected_classes\": demo_dataset.protected_classes,\n", + " \"disparity_tolerance\": 1.25,\n", + " \"metrics\": [\"fnr\", \"fpr\", \"tpr\"]\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.ProtectedClassesCombination\",\n", + " inputs={\n", + " \"dataset\": \"test_dataset\",\n", + " \"model\": \"random_forest_model\"\n", + " },\n", + " params={\n", + " \"protected_classes\": demo_dataset.protected_classes\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The following code defines a preprocessing `Pipeline` that handles both numeric and categorical features. Numeric data is imputed and scaled, while categorical data is imputed with the most frequent value and one-hot encoded. The pipelines are then combined using a `ColumnTransformer` and integrated with a classifier." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define a pipeline for numeric features\n", + "numeric_transformer = Pipeline(\n", + " steps=[\n", + " (\"impute\", SimpleImputer()), # Impute missing values\n", + " (\"scaler\", StandardScaler()), # Scale numeric features\n", + " ]\n", + ")\n", + "\n", + "# Define a pipeline for categorical features\n", + "categorical_transformer = Pipeline(\n", + " [\n", + " (\"impute\", SimpleImputer(strategy=\"most_frequent\")), # Impute missing values with most frequent\n", + " (\"ohe\", OneHotEncoder(handle_unknown=\"ignore\")), # One-hot encode categorical features\n", + " ]\n", + ")\n", + "\n", + "# Combine numeric and categorical pipelines\n", + "preprocessor = ColumnTransformer(\n", + " transformers=[\n", + " (\"num\", numeric_transformer, selector(dtype_exclude=\"category\")), # Apply numeric transformer to non-categorical columns\n", + " (\"cat\", categorical_transformer, selector(dtype_include=\"category\")), # Apply categorical transformer to categorical columns\n", + " ]\n", + ")\n", + "\n", + "# Create the full pipeline including preprocessing and classification\n", + "pipeline = Pipeline(\n", + " steps=[\n", + " (\"preprocessor\", preprocessor), # Apply the preprocessor\n", + " (\n", + " \"classifier\",\n", + " model, # Use the previously defined model for classification\n", + " ),\n", + " ]\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "sensitive_features = ['Gender_encoded', 'Race_encoded', 'Marital_Status_encoded']\n", + "\n", + "run_test(\n", + " \"validmind.data_validation.ProtectedClassesThresholdOptimizer\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds\n", + " },\n", + " params={\n", + " \"pipeline\": pipeline,\n", + " \"protected_classes\": sensitive_features,\n", + " \"X_train\": X_train,\n", + " \"y_train\": y_train,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + 
"cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the following sections and take a look around:\n", + "\n", + " - **2. Data Preparation**\n", + " - **3. Model Development**\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc7_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade 
package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-6a74bc76beda4633a0cfff2eaa20949e" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb index ab7c3243a..909cb355e 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb @@ -1,2011 +1,2017 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document an application scorecard model\n", - "\n", - "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", - "\n", - "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. 
\n", - "\n", - "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. \n", - "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", - "\n", - "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Load the sample dataset](#toc3__) \n", - " - [Prepocess the dataset](#toc3_1__) \n", - " - [Feature engineering](#toc3_2__) \n", - "- [Train the model](#toc4__) \n", - " - [Compute probabilities](#toc4_1__) \n", - " - [Compute binary predictions](#toc4_2__) \n", - "- [Document the model](#toc5__) \n", - " - [Initialize the ValidMind datasets](#toc5_1__) \n", - " - [Initialize the ValidMind models](#toc5_2__) \n", - " - [Assign prediction values and probabilities to the datasets](#toc5_3__) \n", - " - [Compute credit risk scores](#toc5_4__) \n", - " - [Adding custom context to the LLM descriptions](#toc5_5__) \n", - " - 
[Raw data](#toc5_6__) \n", - " - [Pre-processed data](#toc5_7__) \n", - " - [Development data](#toc5_8__) \n", - " - [Feature selection](#toc5_9__) \n", - " - [Model training](#toc5_10__) \n", - " - [Model selection](#toc5_11__) \n", - " - [Class discrimination](#toc5_12__) \n", - " - [Classification accuracy](#toc5_13__) \n", - " - [Model diagnosis](#toc5_14__) \n", - " - [Model explainability](#toc5_15__) \n", - " - [Scoring evaluation](#toc5_16__) \n", - "- [Custom tests](#toc6__) \n", - " - [In-line custom tests](#toc6_1__) \n", - " - [Local test provider](#toc6_2__) \n", - "- [Next steps](#toc7__) \n", - " - [Work with your documentation](#toc7_1__) \n", - " - [Discover more learning resources](#toc7_2__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. 
If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. 
See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host = \"...\",\n", - " # api_key = \"...\",\n", - " # api_secret = \"...\",\n", - " # model = \"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "from sklearn.ensemble import RandomForestClassifier\n", - "\n", - "from validmind.tests import run_test\n", - "from validmind.datasets.credit_risk import lending_club\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. To be able to use it, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df = lending_club.load_data(source=\"offline\")\n", - "\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Prepocess the dataset\n", - "\n", - "In the preprocessing step we perform a number of operations to get ready for building our application scorecard. \n", - "\n", - "We use the `lending_club.preprocess` to simplify preprocessing. 
This function performs the following operations: \n", - "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", - "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", - "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", - "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = lending_club.preprocess(df)\n", - "preprocess_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Feature engineering\n", - "\n", - "In the feature engineering phase, we apply specific transformations to optimize the dataset for predictive modeling in our application scorecard. \n", - "\n", - "Using the `ending_club.feature_engineering()` function, we conduct the following operations:\n", - "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It calculates the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", - "- **Integration of WoE bins**: Ensures that the WoE transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation. This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fe_df = lending_club.feature_engineering(preprocess_df)\n", - "fe_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Train the model\n", - "\n", - "In this section, we focus on constructing and refining our predictive model. \n", - "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", - "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the data\n", - "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", - "\n", - "x_train = train_df.drop(lending_club.target_column, axis=1)\n", - "y_train = train_df[lending_club.target_column]\n", - "\n", - "x_test = test_df.drop(lending_club.target_column, axis=1)\n", - "y_test = test_df[lending_club.target_column]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define the XGBoost model\n", - "xgb_model = xgb.XGBClassifier(\n", - " n_estimators=50, \n", - " random_state=42, \n", - " early_stopping_rounds=10\n", - ")\n", - "xgb_model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "\n", - "# Fit the model\n", - "xgb_model.fit(\n", - " x_train, \n", - " y_train,\n", - " eval_set=[(x_test, y_test)],\n", - " verbose=False\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define the Random Forest model\n", - "rf_model = RandomForestClassifier(\n", - " n_estimators=50, \n", - " random_state=42,\n", - ")\n", - "\n", - "# Fit the model\n", - 
"rf_model.fit(x_train, y_train)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Compute probabilities" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", - "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", - "\n", - "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", - "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Compute binary predictions" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cut_off_threshold = 0.3\n", - "\n", - "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", - "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", - "\n", - "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", - "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "To document the model with the ValidMind Library, you'll need to:\n", - "1. Preprocess the raw dataset\n", - "2. Initialize some training and test datasets\n", - "3. Initialize a ValidMind model object for use with testing\n", - "4. 
Run the full suite of tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset`: The dataset that you want to provide as input to tests.\n", - "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "\n", - "With all datasets ready, you can now initialize the raw, processed, training and test datasets (`raw_df`, `preprocessed_df`, `fe_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_preprocess_dataset = vm.init_dataset(\n", - " dataset=preprocess_df,\n", - " input_id=\"preprocess_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_fe_dataset = vm.init_dataset(\n", - " dataset=fe_df,\n", - " input_id=\"fe_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " 
input_id=\"test_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Initialize the ValidMind models\n", - "\n", - "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other functions for analysis and tests on the data for our modelS.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_xgb_model = vm.init_model(\n", - " xgb_model,\n", - " input_id=\"xgb_model\",\n", - ")\n", - "\n", - "vm_rf_model = vm.init_model(\n", - " rf_model,\n", - " input_id=\"rf_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_3__'></a>\n", - "\n", - "### Assign prediction values and probabilities to the datasets\n", - "\n", - "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. 
\n", - "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", - "- This method links the model's class prediction values and probabilities to our VM train and test datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# XGBoost\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=train_xgb_binary_predictions,\n", - " prediction_probabilities=train_xgb_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=test_xgb_binary_predictions,\n", - " prediction_probabilities=test_xgb_prob,\n", - ")\n", - "\n", - "# Random Forest\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=train_rf_binary_predictions,\n", - " prediction_probabilities=train_rf_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=test_rf_binary_predictions,\n", - " prediction_probabilities=test_rf_prob,\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_4__'></a>\n", - "\n", - "### Compute credit risk scores\n", - "\n", - "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", - "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", - "\n", - "# Assign scores to the datasets\n", - "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", - "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_5__'></a>\n", - "\n", - "### Adding custom context to the LLM descriptions\n", - "\n", - "To enable the LLM descriptions context, you need to set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`. This will enable the LLM descriptions context, which will be used to provide additional context to the LLM descriptions. This is a global setting that will affect all tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", - "\n", - "context = \"\"\"\n", - "FORMAT FOR THE LLM DESCRIPTIONS: \n", - " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", - " extracted from the test description>.\n", - "\n", - " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", - " Include any relevant formulas or methodologies mentioned in the test description.>\n", - "\n", - " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", - " highlighting what makes it particularly useful for specific scenarios.>\n", - "\n", - " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", - " Include both technical limitations and interpretation challenges. 
\n", - " If the test description includes specific signs of high risk, incorporate these here.>\n", - "\n", - " **Key Insights:**\n", - "\n", - " The test results reveal:\n", - "\n", - " - **<insight title>**: <comprehensive description of one aspect of the results>\n", - " - **<insight title>**: <comprehensive description of another aspect>\n", - " ...\n", - "\n", - " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", - " purpose and provides any final recommendations or considerations.>\n", - "\n", - "ADDITIONAL INSTRUCTIONS:\n", - " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", - "\n", - " For each metric in the test results, include in the test overview:\n", - " - The metric's purpose and what it measures\n", - " - Its mathematical formula\n", - " - The range of possible values\n", - " - What constitutes good/bad performance\n", - " - How to interpret different values\n", - "\n", - " Each insight should progressively cover:\n", - " 1. Overall scope and distribution\n", - " 2. Complete breakdown of all elements with specific values\n", - " 3. Natural groupings and patterns\n", - " 4. Comparative analysis between datasets/categories\n", - " 5. Stability and variations\n", - " 6. 
Notable relationships or dependencies\n", - "\n", - " Remember:\n", - " - Keep all insights at the same level (no sub-bullets or nested structures)\n", - " - Make each insight complete and self-contained\n", - " - Include specific numerical values and ranges\n", - " - Cover all elements in the results comprehensively\n", - " - Maintain clear, concise language\n", - " - Use only \"- **Title**: Description\" format for insights\n", - " - Progress naturally from general to specific observations\n", - "\n", - "\"\"\".strip()\n", - "\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_6__'></a>\n", - "\n", - "### Raw data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DatasetDescription:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.MissingValues:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"min_percentage_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.ClassImbalance:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 10\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - 
"execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.Duplicates:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"min_percentage_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.HighCardinality:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"num_threshold\": 100,\n", - " \"percent_threshold\": 0.1,\n", - " \"threshold_type\": \"percent\"\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.Skewness:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"max_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.UniqueRows:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TooManyZeroValues:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"max_percent_threshold\": 0.03\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.IQROutliersTable:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"threshold\": 5\n", - " }\n", - 
").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_7__'></a>\n", - "\n", - "### Pre-processed data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularDescriptionTables:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.MissingValues:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset,\n", - " },\n", - " params={\n", - " \"min_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularNumericalHistograms:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TargetRateBarPlots:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " },\n", - " params={\n", - " \"default_column\": 
lending_club.target_column,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_8__'></a>\n", - "\n", - "### Development data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularDescriptionTables:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.ClassImbalance:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 10\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.UniqueRows:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularNumericalHistograms:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_9__'></a>\n", - "\n", - "### Feature selection" - ] - }, - { - "cell_type": "code", - "execution_count": null, 
- "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.MutualInformation:development_data\",\n", - " input_grid ={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.01,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.PearsonCorrelationMatrix:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.HighPearsonCorrelation:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"max_threshold\": 0.3,\n", - " \"top_n_correlations\": 10\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.WOEBinTable\",\n", - " input_grid={\n", - " \"dataset\": [vm_preprocess_dataset]\n", - " },\n", - " params={\n", - " \"breaks_adj\": lending_club.breaks_adj,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.WOEBinPlots\",\n", - " input_grid={\n", - " \"dataset\": [vm_preprocess_dataset]\n", - " },\n", - " params={\n", - " \"breaks_adj\": lending_club.breaks_adj,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_10__'></a>\n", - "\n", - "### Model training" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DatasetSplit\",\n", - " 
inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.ModelMetadata\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model, vm_rf_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ModelParameters\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model, vm_rf_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_11__'></a>\n", - "\n", - "### Model selection" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.statsmodels.GINITable\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model, vm_rf_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model, vm_rf_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.TrainingTestDegradation:XGBoost\",\n", - " inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"max_threshold\": 0.1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " 
\"validmind.model_validation.sklearn.TrainingTestDegradation:RandomForest\",\n", - " inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " \"model\": vm_rf_model,\n", - " },\n", - " params={\n", - " \"max_threshold\": 0.1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.HyperParametersTuning\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_train_ds,\n", - " },\n", - " params={\n", - " \"param_grid\": {'n_estimators': [50, 100]},\n", - " \"scoring\": ['roc_auc', 'recall'],\n", - " \"fit_params\": {'eval_set': [(x_test, y_test)], 'verbose': False},\n", - " \"thresholds\": [0.3, 0.5],\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_12__'></a>\n", - "\n", - "### Class discrimination" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ROCCurve\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.MinimumROCAUCScore\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.5\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - ").log()" - ] - }, - { - 
"cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.statsmodels.CumulativePredictionProbabilities\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model],\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", - " inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"num_bins\": 10,\n", - " \"mode\": \"fixed\"\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_13__'></a>\n", - "\n", - "### Classification accuracy" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierThresholdOptimization\",\n", - " inputs={\n", - " \"dataset\": vm_train_ds,\n", - " \"model\": vm_xgb_model\n", - " },\n", - " params={\n", - " \"target_recall\": 0.8 # Find a threshold that achieves a recall of 80%\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.CalibrationCurve\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, 
- "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.7\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.MinimumF1Score\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.5\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.PrecisionRecallCurve\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_14__'></a>\n", - "\n", - "### Model diagnosis" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", - " inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"cut_off_threshold\": 0.04\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " 
\"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", - " inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"scaling_factor_std_dev_list\": [\n", - " 0.1,\n", - " 0.2,\n", - " 0.3,\n", - " 0.4,\n", - " 0.5\n", - " ],\n", - " \"performance_decay_threshold\": 0.05\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_15__'></a>\n", - "\n", - "### Model explainability" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model]\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.FeaturesAUC\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model],\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model],\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"kernel_explainer_samples\": 10,\n", - " \"tree_or_linear_explainer_samples\": 200,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_16__'></a>\n", - "\n", - "### Scoring evaluation" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.statsmodels.ScorecardHistogram\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, 
vm_test_ds],\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.ScoreBandDefaultRates\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - " params = {\n", - " \"score_column\": \"xgb_scores\",\n", - " \"score_bands\": [500, 540, 570]\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ScoreProbabilityAlignment\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Custom tests\n", - "\n", - "Custom tests extend the functionality of ValidMind, allowing you to document any model or use case with added flexibility.\n", - "\n", - "ValidMind provides a comprehensive set of tests out-of-the-box to evaluate and document your models and datasets. We recognize there will be cases where the default tests do not support a model or dataset, or specific documentation is needed. In these cases, you can create and use your own custom code to accomplish what you need. To streamline custom code integration, we support the creation of custom test functions." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### In-line custom tests\n", - "\n", - "The `@vm.test` decorator is doing the work of creating a wrapper around the function that will allow it to be run by the ValidMind Library. It also registers the test so it can be found by the ID `my_custom_tests.ScoreToOdds\"`. 
The function `score_to_odds_analysis` takes three arguments `dataset`, `score_column`, and `score_bands`. This is a `VMDataset` and the rest are parameters that can be passed in." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "import plotly.graph_objects as go\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.ScoreToOdds\")\n", - "def score_to_odds_analysis(dataset, score_column='score', score_bands=[410, 440, 470]):\n", - " \"\"\"\n", - " Analyzes the relationship between score bands and odds (good:bad ratio).\n", - " Good odds = (1 - default_rate) / default_rate\n", - " \n", - " Higher scores should correspond to higher odds of being good.\n", - " \"\"\"\n", - " df = dataset.df\n", - " \n", - " # Create score bands\n", - " df['score_band'] = pd.cut(\n", - " df[score_column],\n", - " bins=[-np.inf] + score_bands + [np.inf],\n", - " labels=[f'<{score_bands[0]}'] + \n", - " [f'{score_bands[i]}-{score_bands[i+1]}' for i in range(len(score_bands)-1)] +\n", - " [f'>{score_bands[-1]}']\n", - " )\n", - " \n", - " # Calculate metrics per band\n", - " results = df.groupby('score_band').agg({\n", - " dataset.target_column: ['mean', 'count']\n", - " })\n", - " \n", - " results.columns = ['Default Rate', 'Total']\n", - " results['Good Count'] = results['Total'] - (results['Default Rate'] * results['Total'])\n", - " results['Bad Count'] = results['Default Rate'] * results['Total']\n", - " results['Odds'] = results['Good Count'] / results['Bad Count']\n", - " \n", - " # Create visualization\n", - " fig = go.Figure()\n", - " \n", - " # Add odds bars\n", - " fig.add_trace(go.Bar(\n", - " name='Odds (Good:Bad)',\n", - " x=results.index,\n", - " y=results['Odds'],\n", - " marker_color='blue'\n", - " ))\n", - " \n", - " fig.update_layout(\n", - " title='Score-to-Odds Analysis',\n", - " yaxis=dict(title='Odds Ratio (Good:Bad)'),\n", - " showlegend=False\n", - " 
)\n", - " \n", - " return fig" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"my_custom_tests.ScoreToOdds\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " \"score_bands\": [500, 540, 570],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Local test provider\n", - "\n", - "The ValidMind Library offers the ability to extend the built-in library of tests with custom tests. A test \"Provider\" is a Python class that gets registered with the ValidMind Library and loads tests based on a test ID, for example `my_test_provider.my_test_id`. The built-in suite of tests that ValidMind offers is technically its own test provider. You can use one the built-in test provider offered by ValidMind (`validmind.tests.test_providers.LocalTestProvider`) or you can create your own. More than likely, you'll want to use the `LocalTestProvider` to add a directory of custom tests but there's flexibility to be able to load tests from any source." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import LocalTestProvider\n", - "\n", - "# Define the folder where your tests are located\n", - "tests_folder = \"custom_tests\"\n", - "\n", - "# initialize the test provider with the tests folder we created earlier\n", - "my_test_provider = LocalTestProvider(tests_folder)\n", - "\n", - "vm.tests.register_test_provider(\n", - " namespace=\"my_test_provider\",\n", - " test_provider=my_test_provider,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now that we have our test provider set up, we can run any test that's located in our tests folder by using the `run_test()` method. 
This function is your entry point to running single tests in the ValidMind Library. It takes a test ID and runs the test associated with that ID. For our custom tests, the test ID will be the `namespace` specified when registering the provider, followed by the path to the test file relative to the tests folder. For example, the Confusion Matrix test we created earlier will have the test ID `my_test_provider.ConfusionMatrix`. You could organize the tests in subfolders, say `classification` and `regression`, and the test ID for the Confusion Matrix test would then be `my_test_provider.classification.ConfusionMatrix`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"my_test_provider.ScoreBandDiscriminationMetrics\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " \"score_bands\": [500, 540, 570],\n", - " }\n", - ").log(section_id=\"interpretability_insights\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the following sections and take a look around:\n", - "\n", - " - **2. Data Preparation**\n", - " - **3. 
Model Development**\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc7_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-245d3f2bfcad480aa6baa2bde87c76e6", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document an application scorecard model\n", + "\n", + "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", + "\n", + "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", + "\n", + "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. \n", + "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", + "\n", + "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents    \n", + "- [About ValidMind](#toc1__)    \n", + " - [Before you begin](#toc1_1__)    \n", + " - [New to ValidMind?](#toc1_2__)    \n", + " - [Key concepts](#toc1_3__)    \n", + "- [Setting up](#toc2__)    \n", + " - [Install the ValidMind Library](#toc2_1__)    \n", + " - [Initialize the ValidMind Library](#toc2_2__)    \n", + " - [Register sample model](#toc2_2_1__)    \n", + " - [Apply documentation template](#toc2_2_2__)    \n", + " - [Get your code snippet](#toc2_2_3__)    \n", + " - [Initialize the Python environment](#toc2_3__)    \n", + " - [Preview the documentation template](#toc2_4__)    \n", + "- [Load the sample dataset](#toc3__)    \n", + " - [Preprocess the dataset](#toc3_1__)    \n", + " - [Feature engineering](#toc3_2__)    \n", + "- [Train the model](#toc4__)    \n", + " - [Compute probabilities](#toc4_1__)    \n", + " - [Compute binary predictions](#toc4_2__)    \n", + "- [Document the model](#toc5__)    \n", + " - [Initialize the ValidMind datasets](#toc5_1__)    \n", + " - [Initialize the ValidMind models](#toc5_2__)    \n", + " - [Assign prediction values and probabilities to the datasets](#toc5_3__)    \n", + " - [Compute credit risk scores](#toc5_4__)    \n", + " - [Adding custom context to the LLM descriptions](#toc5_5__)    \n", + " - [Raw data](#toc5_6__)    \n", + " - [Pre-processed data](#toc5_7__)    \n", + " - [Development data](#toc5_8__)    \n", + " - [Feature selection](#toc5_9__)    \n", + " - [Model training](#toc5_10__)    \n", + " - [Model selection](#toc5_11__)    \n", + " - [Class discrimination](#toc5_12__)    \n", + " - [Classification accuracy](#toc5_13__)    \n", + " - [Model diagnosis](#toc5_14__)    \n", + " - [Model explainability](#toc5_15__)    \n", + " - [Scoring evaluation](#toc5_16__)    \n", + "- [Custom tests](#toc6__)    \n", + " - [In-line custom tests](#toc6_1__)    \n", + " - [Local test provider](#toc6_2__)    \n", + "- [Next steps](#toc7__)    \n", + " - [Work with your 
documentation](#toc7_1__) \n", + " - [Discover more learning resources](#toc7_2__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite that specifies which tests should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host = \"...\",\n", + " # api_key = \"...\",\n", + " # api_secret = \"...\",\n", + " # model = \"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "from sklearn.ensemble import RandomForestClassifier\n", + "\n", + "from validmind.tests import run_test\n", + "from validmind.datasets.credit_risk import lending_club\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind Library. To be able to use it, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "df = lending_club.load_data(source=\"offline\")\n", + "\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Preprocess the dataset\n", + "\n", + "In the preprocessing step, we perform a number of operations to get ready for building our application scorecard. \n", + "\n", + "We use the `lending_club.preprocess()` function to simplify preprocessing. 
This function performs the following operations: \n", + "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", + "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", + "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", + "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = lending_club.preprocess(df)\n", + "preprocess_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Feature engineering\n", + "\n", + "In the feature engineering phase, we apply specific transformations to optimize the dataset for predictive modeling in our application scorecard. \n", + "\n", + "Using the `lending_club.feature_engineering()` function, we conduct the following operations:\n", + "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It is calculated as the natural logarithm of the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", + "- **Integration of WoE bins**: Ensures that the WoE transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation. This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "fe_df = lending_club.feature_engineering(preprocess_df)\n", + "fe_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Train the model\n", + "\n", + "In this section, we focus on constructing and refining our predictive model. \n", + "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", + "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the data\n", + "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", + "\n", + "x_train = train_df.drop(lending_club.target_column, axis=1)\n", + "y_train = train_df[lending_club.target_column]\n", + "\n", + "x_test = test_df.drop(lending_club.target_column, axis=1)\n", + "y_test = test_df[lending_club.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define the XGBoost model\n", + "xgb_model = xgb.XGBClassifier(\n", + " n_estimators=50, \n", + " random_state=42, \n", + " early_stopping_rounds=10\n", + ")\n", + "xgb_model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "\n", + "# Fit the model\n", + "xgb_model.fit(\n", + " x_train, \n", + " y_train,\n", + " eval_set=[(x_test, y_test)],\n", + " verbose=False\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define the Random Forest model\n", + "rf_model = RandomForestClassifier(\n", + " n_estimators=50, \n", + " random_state=42,\n", + ")\n", + "\n", + "# Fit the model\n", + "rf_model.fit(x_train, y_train)" + ], + "execution_count": 
null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Compute probabilities" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", + "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", + "\n", + "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", + "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Compute binary predictions" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "cut_off_threshold = 0.3\n", + "\n", + "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", + "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", + "\n", + "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", + "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "To document the model with the ValidMind Library, you'll need to:\n", + "1. Preprocess the raw dataset\n", + "2. Initialize some training and test datasets\n", + "3. Initialize a ValidMind model object for use with testing\n", + "4. 
Run the full suite of tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset`: The dataset that you want to provide as input to tests.\n", + "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "\n", + "With all datasets ready, you can now initialize the raw, preprocessed, feature-engineered, training, and test datasets (`df`, `preprocess_df`, `fe_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_preprocess_dataset = vm.init_dataset(\n", + " dataset=preprocess_df,\n", + " input_id=\"preprocess_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_fe_dataset = vm.init_dataset(\n", + " dataset=fe_df,\n", + " input_id=\"fe_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " 
target_column=lending_club.target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Initialize the ValidMind models\n", + "\n", + "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other functions for analysis and tests on the data for our models.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model objects with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_xgb_model = vm.init_model(\n", + " xgb_model,\n", + " input_id=\"xgb_model\",\n", + ")\n", + "\n", + "vm_rf_model = vm.init_model(\n", + " rf_model,\n", + " input_id=\"rf_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_3__'></a>\n", + "\n", + "### Assign prediction values and probabilities to the datasets\n", + "\n", + "With our models now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. 
\n", + "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", + "- This method links the model's class prediction values and probabilities to our VM train and test datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# XGBoost\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=train_xgb_binary_predictions,\n", + " prediction_probabilities=train_xgb_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=test_xgb_binary_predictions,\n", + " prediction_probabilities=test_xgb_prob,\n", + ")\n", + "\n", + "# Random Forest\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=train_rf_binary_predictions,\n", + " prediction_probabilities=train_rf_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=test_rf_binary_predictions,\n", + " prediction_probabilities=test_rf_prob,\n", + ")\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_4__'></a>\n", + "\n", + "### Compute credit risk scores\n", + "\n", + "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", + "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", + "\n", + "# Assign scores to the datasets\n", + "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", + "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_5__'></a>\n", + "\n", + "### Adding custom context to the LLM descriptions\n", + "\n", + "To enable the LLM descriptions context, you need to set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`. This will enable the LLM descriptions context, which will be used to provide additional context to the LLM descriptions. This is a global setting that will affect all tests." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", + "\n", + "context = \"\"\"\n", + "FORMAT FOR THE LLM DESCRIPTIONS: \n", + " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", + " extracted from the test description>.\n", + "\n", + " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", + " Include any relevant formulas or methodologies mentioned in the test description.>\n", + "\n", + " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", + " highlighting what makes it particularly useful for specific scenarios.>\n", + "\n", + " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", + " Include both technical limitations and interpretation challenges. 
\n", + " If the test description includes specific signs of high risk, incorporate these here.>\n", + "\n", + " **Key Insights:**\n", + "\n", + " The test results reveal:\n", + "\n", + " - **<insight title>**: <comprehensive description of one aspect of the results>\n", + " - **<insight title>**: <comprehensive description of another aspect>\n", + " ...\n", + "\n", + " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", + " purpose and provides any final recommendations or considerations.>\n", + "\n", + "ADDITIONAL INSTRUCTIONS:\n", + " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", + "\n", + " For each metric in the test results, include in the test overview:\n", + " - The metric's purpose and what it measures\n", + " - Its mathematical formula\n", + " - The range of possible values\n", + " - What constitutes good/bad performance\n", + " - How to interpret different values\n", + "\n", + " Each insight should progressively cover:\n", + " 1. Overall scope and distribution\n", + " 2. Complete breakdown of all elements with specific values\n", + " 3. Natural groupings and patterns\n", + " 4. Comparative analysis between datasets/categories\n", + " 5. Stability and variations\n", + " 6. 
Notable relationships or dependencies\n", + "\n", + " Remember:\n", + " - Keep all insights at the same level (no sub-bullets or nested structures)\n", + " - Make each insight complete and self-contained\n", + " - Include specific numerical values and ranges\n", + " - Cover all elements in the results comprehensively\n", + " - Maintain clear, concise language\n", + " - Use only \"- **Title**: Description\" format for insights\n", + " - Progress naturally from general to specific observations\n", + "\n", + "\"\"\".strip()\n", + "\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_6__'></a>\n", + "\n", + "### Raw data" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DatasetDescription:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.MissingValues:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"min_percentage_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.ClassImbalance:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 10\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, 
+ { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.Duplicates:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"min_percentage_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.HighCardinality:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"num_threshold\": 100,\n", + " \"percent_threshold\": 0.1,\n", + " \"threshold_type\": \"percent\"\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.Skewness:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"max_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.UniqueRows:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TooManyZeroValues:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"max_percent_threshold\": 0.03\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.IQROutliersTable:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"threshold\": 5\n", + " }\n", + ").log()" + ], 
+ "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_7__'></a>\n", + "\n", + "### Pre-processed data" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularDescriptionTables:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.MissingValues:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset,\n", + " },\n", + " params={\n", + " \"min_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularNumericalHistograms:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TargetRateBarPlots:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " },\n", + " params={\n", + " \"default_column\": lending_club.target_column,\n", + 
" },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_8__'></a>\n", + "\n", + "### Development data" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularDescriptionTables:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.ClassImbalance:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 10\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.UniqueRows:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularNumericalHistograms:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_9__'></a>\n", + "\n", + "### Feature selection" + ] + }, + { + "cell_type": "code", + "metadata": 
{}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.MutualInformation:development_data\",\n", + " input_grid ={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.01,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.PearsonCorrelationMatrix:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.HighPearsonCorrelation:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"max_threshold\": 0.3,\n", + " \"top_n_correlations\": 10\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.WOEBinTable\",\n", + " input_grid={\n", + " \"dataset\": [vm_preprocess_dataset]\n", + " },\n", + " params={\n", + " \"breaks_adj\": lending_club.breaks_adj,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.WOEBinPlots\",\n", + " input_grid={\n", + " \"dataset\": [vm_preprocess_dataset]\n", + " },\n", + " params={\n", + " \"breaks_adj\": lending_club.breaks_adj,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_10__'></a>\n", + "\n", + "### Model training" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DatasetSplit\",\n", + " inputs={\n", + " \"datasets\": 
[vm_train_ds, vm_test_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.ModelMetadata\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model, vm_rf_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ModelParameters\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model, vm_rf_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_11__'></a>\n", + "\n", + "### Model selection" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.statsmodels.GINITable\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model, vm_rf_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model, vm_rf_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.TrainingTestDegradation:XGBoost\",\n", + " inputs={\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"max_threshold\": 0.1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " 
\"validmind.model_validation.sklearn.TrainingTestDegradation:RandomForest\",\n", + " inputs={\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " \"model\": vm_rf_model,\n", + " },\n", + " params={\n", + " \"max_threshold\": 0.1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.HyperParametersTuning\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_train_ds,\n", + " },\n", + " params={\n", + " \"param_grid\": {'n_estimators': [50, 100]},\n", + " \"scoring\": ['roc_auc', 'recall'],\n", + " \"fit_params\": {'eval_set': [(x_test, y_test)], 'verbose': False},\n", + " \"thresholds\": [0.3, 0.5],\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_12__'></a>\n", + "\n", + "### Class discrimination" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ROCCurve\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.MinimumROCAUCScore\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.5\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + ").log()" + ], + 
"execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.statsmodels.CumulativePredictionProbabilities\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model],\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", + " inputs={\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"num_bins\": 10,\n", + " \"mode\": \"fixed\"\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_13__'></a>\n", + "\n", + "### Classification accuracy" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierThresholdOptimization\",\n", + " inputs={\n", + " \"dataset\": vm_train_ds,\n", + " \"model\": vm_xgb_model\n", + " },\n", + " params={\n", + " \"target_recall\": 0.8 # Find a threshold that achieves a recall of 80%\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.CalibrationCurve\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + 
{ + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.7\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.MinimumF1Score\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.5\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.PrecisionRecallCurve\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_14__'></a>\n", + "\n", + "### Model diagnosis" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", + " inputs={\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"cut_off_threshold\": 0.04\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " 
\"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", + " inputs={\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"scaling_factor_std_dev_list\": [\n", + " 0.1,\n", + " 0.2,\n", + " 0.3,\n", + " 0.4,\n", + " 0.5\n", + " ],\n", + " \"performance_decay_threshold\": 0.05\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_15__'></a>\n", + "\n", + "### Model explainability" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model]\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.FeaturesAUC\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model],\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model],\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"kernel_explainer_samples\": 10,\n", + " \"tree_or_linear_explainer_samples\": 200,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_16__'></a>\n", + "\n", + "### Scoring evaluation" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.statsmodels.ScorecardHistogram\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, 
vm_test_ds],\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.ScoreBandDefaultRates\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + " params = {\n", + " \"score_column\": \"xgb_scores\",\n", + " \"score_bands\": [500, 540, 570]\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ScoreProbabilityAlignment\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Custom tests\n", + "\n", + "Custom tests extend the functionality of ValidMind, allowing you to document any model or use case with added flexibility.\n", + "\n", + "ValidMind provides a comprehensive set of tests out-of-the-box to evaluate and document your models and datasets. We recognize there will be cases where the default tests do not support a model or dataset, or specific documentation is needed. In these cases, you can create and use your own custom code to accomplish what you need. To streamline custom code integration, we support the creation of custom test functions." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### In-line custom tests\n", + "\n", + "The `@vm.test` decorator is doing the work of creating a wrapper around the function that will allow it to be run by the ValidMind Library. 
It also registers the test so it can be found by the ID `my_custom_tests.ScoreToOdds`. The function `score_to_odds_analysis` takes three arguments: `dataset`, `score_column`, and `score_bands`. The first is a `VMDataset` input, and the other two are parameters that can be passed in." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "import plotly.graph_objects as go\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.ScoreToOdds\")\n", + "def score_to_odds_analysis(dataset, score_column='score', score_bands=[410, 440, 470]):\n", + " \"\"\"\n", + " Analyzes the relationship between score bands and odds (good:bad ratio).\n", + " Good odds = (1 - default_rate) / default_rate\n", + " \n", + " Higher scores should correspond to higher odds of being good.\n", + " \"\"\"\n", + " df = dataset.df\n", + " \n", + " # Create score bands\n", + " df['score_band'] = pd.cut(\n", + " df[score_column],\n", + " bins=[-np.inf] + score_bands + [np.inf],\n", + " labels=[f'<{score_bands[0]}'] + \n", + " [f'{score_bands[i]}-{score_bands[i+1]}' for i in range(len(score_bands)-1)] +\n", + " [f'>{score_bands[-1]}']\n", + " )\n", + " \n", + " # Calculate metrics per band\n", + " results = df.groupby('score_band').agg({\n", + " dataset.target_column: ['mean', 'count']\n", + " })\n", + " \n", + " results.columns = ['Default Rate', 'Total']\n", + " results['Good Count'] = results['Total'] - (results['Default Rate'] * results['Total'])\n", + " results['Bad Count'] = results['Default Rate'] * results['Total']\n", + " results['Odds'] = results['Good Count'] / results['Bad Count']\n", + " \n", + " # Create visualization\n", + " fig = go.Figure()\n", + " \n", + " # Add odds bars\n", + " fig.add_trace(go.Bar(\n", + " name='Odds (Good:Bad)',\n", + " x=results.index,\n", + " y=results['Odds'],\n", + " marker_color='blue'\n", + " ))\n", + " \n", + " fig.update_layout(\n", + " title='Score-to-Odds Analysis',\n", + " yaxis=dict(title='Odds Ratio 
(Good:Bad)'),\n", + " showlegend=False\n", + " )\n", + " \n", + " return fig" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"my_custom_tests.ScoreToOdds\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " \"score_bands\": [500, 540, 570],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Local test provider\n", + "\n", + "The ValidMind Library offers the ability to extend the built-in library of tests with custom tests. A test \"Provider\" is a Python class that gets registered with the ValidMind Library and loads tests based on a test ID, for example `my_test_provider.my_test_id`. The built-in suite of tests that ValidMind offers is technically its own test provider. You can use the built-in test provider offered by ValidMind (`validmind.tests.test_providers.LocalTestProvider`) or you can create your own. More than likely, you'll want to use the `LocalTestProvider` to add a directory of custom tests, but you have the flexibility to load tests from any source." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.tests import LocalTestProvider\n", + "\n", + "# Define the folder where your tests are located\n", + "tests_folder = \"custom_tests\"\n", + "\n", + "# initialize the test provider with the tests folder we created earlier\n", + "my_test_provider = LocalTestProvider(tests_folder)\n", + "\n", + "vm.tests.register_test_provider(\n", + " namespace=\"my_test_provider\",\n", + " test_provider=my_test_provider,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we have our test provider set up, we can run any test that's located in our tests folder by using the `run_test()` method. This function is your entry point to running single tests in the ValidMind Library. It takes a test ID and runs the test associated with that ID. For our custom tests, the test ID will be the `namespace` specified when registering the provider, followed by the path to the test file relative to the tests folder. For example, the Confusion Matrix test we created earlier will have the test ID `my_test_provider.ConfusionMatrix`. You could organize the tests in subfolders, say `classification` and `regression`, and the test ID for the Confusion Matrix test would then be `my_test_provider.classification.ConfusionMatrix`." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"my_test_provider.ScoreBandDiscriminationMetrics\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " \"score_bands\": [500, 540, 570],\n", + " }\n", + ").log(section_id=\"interpretability_insights\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the following sections and take a look around:\n", + "\n", + " - **2. Data Preparation**\n", + " - **3. Model Development**\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc7_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade 
package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-245d3f2bfcad480aa6baa2bde87c76e6" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb index 8949b55a5..c8854999b 100644 --- a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb +++ b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb @@ -1,1016 +1,1022 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document an Excel-based application scorecard model\n", - "\n", - "Build and document an Excel-based application scorecard model with the ValidMind Library. 
Learn how to load an Excel-based model, prepare your datasets and model for testing, run tests and log those test results to the ValidMind Platform.\n", - "\n", - "An *application scorecard model* is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant such as credit history, income, employment status, and other relevant financial data.\n", - "\n", - " - This score assists lenders in making informed decisions about whether to approve or reject loan applications, as well as in determining the terms of the loan, including interest rates and credit limits.\n", - " - Effective validation of application scorecard models ensures that lenders can manage risk efficiently while maintaining a fast and transparent loan application process for applicants." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Loading the sample datasets](#toc3__) \n", - " - [Load the raw dataset](#toc3_1__) \n", - " - [Load the preprocessed dataset](#toc3_2__) \n", - " - [Load the training and test datasets](#toc3_3__) \n", - "- [Initialize the ValidMind datasets](#toc4__) \n", - "- [Initialize the ValidMind model](#toc5__) \n", - " - [Link predictions](#toc5_1__) \n", - "- [Running tests](#toc6__) \n", - " - [Enable 
custom context for test descriptions](#toc6_1__) \n", - " - [Define tests to run](#toc6_2__) \n", - " - [Run defined tests](#toc6_3__) \n", - "- [Next steps](#toc7__) \n", - " - [Work with your documentation](#toc7_1__) \n", - " - [Add individual test results to documentation](#toc7_1_1__) \n", - " - [Discover more learning resources](#toc7_2__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. 
For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. 
Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2); border-radius: 5px;\">\n", - " <span style=\"color: #083E44;\"><b>Recommended Python versions</b></span><br />\n", - " Python 3.8 ≤ x ≤ 3.11\n", - "</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", - "\n", - "- Install **OpenPyPL** (openpyxl) which will allow us to read and write `.xlsx` files.\n", - "- Import `pandas`, a Python library for data manipulation and analytics, as an alias.\n", - "- Enable `matplotlib`, a plotting library used for visualizing data. Ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install openpyxl\n", - "\n", - "import pandas as pd\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Loading the sample datasets\n", - "\n", - "Let's import our sample dataset in the form of an Excel workbook ([CreditRiskData.xlsx](CreditRiskData.xlsx)) with five sheets indexed 0 to 3, each representing a different stage of data preparation:\n", - "\n", - "0. **Raw Data** – The original unprocessed dataset.\n", - "1. **Preprocessed Data** – A cleaned and prepared version of the raw data.\n", - "2. **Train Data** – A training subset used to fit your model.\n", - "3. **Test Data** – A testing subset used to evaluate model performance." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Load the raw dataset\n", - "\n", - "We'll start by loading the **Raw Data** sheet (index `0`) into a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df = pd.read_excel('CreditRiskData.xlsx', sheet_name=0,engine='openpyxl')\n", - "\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Load the preprocessed dataset\n", - "\n", - "Next, load the **Preprocessed Data** sheet (index `1`), containing cleaned inputs ready for scoring:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=1,engine='openpyxl')\n", - "preprocess_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Load the training and test datasets\n", - "\n", - "Finally, load the split training (**Train Data**, index `2`) and testing (**Test Data**, index `3`) sets:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=2,engine='openpyxl')\n", - "test_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=3,engine='openpyxl')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests with your loaded datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module. 
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`:** The input DataFrame to test.\n", - "- **`input_id`:** A unique identifier for tracking test inputs.\n", - "- **`target_column`:** Required for tests that compare predictions to actual outcomes; specify the name of the column with the true values." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the raw dataset\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column='loan_status',\n", - ")\n", - "\n", - "# Initialize the preprocessed dataset\n", - "vm_preprocess_dataset = vm.init_dataset(\n", - " dataset=preprocess_df,\n", - " input_id=\"preprocess_dataset\",\n", - " target_column='loan_status',\n", - ")\n", - "\n", - "# Initialize the training dataset\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column='loan_status',\n", - ")\n", - "\n", - "# Initialize the testing dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column='loan_status',\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Initialize the ValidMind model\n", - "\n", - "In this Excel-based use case, predictions are precomputed and included in the Excel file. 
While there's no model logic to run, a ValidMind model object (`vm_model`) is still required for passing to other functions for analysis and tests on the data.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Prediction logic placeholder\n", - "def dummy(X, **kwargs):\n", - " return None\n", - "\n", - "xgb_model = vm.init_model(\n", - " input_id=\"xgb_model\",\n", - " predict_fn=dummy\n", - " )" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Link predictions\n", - "\n", - "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", - "\n", - "Use the [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object to link the prediction values and probabilities from the relevant columns on our Excel spreadsheet to the training and testing datasets:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=xgb_model, prediction_column=\"xgb_model_prediction\",probability_column='xgb_model_probabilities')\n", - "vm_test_ds.assign_predictions(model=xgb_model, 
prediction_column=\"xgb_model_prediction\",probability_column='xgb_model_probabilities')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Running tests\n", - "\n", - "This is where it all comes together — we'll use our previously initialized datasets as inputs to run tests, then log the results to the ValidMind Platform.\n", - "\n", - "We'll run some tests that are defined out-of-the-box by the template we previewed earlier in this notebook, as well as some additional tests for more evidence. For the example in this section, we've selected and defined the tests for you.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about navigating ValidMind tests?</b></span>\n", - "<br></br>\n", - "Refer to our notebook outlining the utilities available for viewing and understanding available ValidMind tests: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Enable custom context for test descriptions\n", - "\n", - "When you run ValidMind tests, test descriptions are automatically generated with LLM using the test results, the test name, and the static test definitions provided in the test’s docstring. 
While this metadata offers valuable high-level overviews of tests, insights produced by the LLM-based descriptions may not always align with your specific use cases or incorporate organizational policy requirements.\n", - "\n", - "Before we run our tests, we'll include some custom use case context to improve the clarity, structure, and interpretability of the test descriptions returned. By default, custom context for LLM-generated descriptions is disabled, meaning that the output will not include any additional context. To enable custom use case context, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`.\n", - "\n", - "This is a global setting that will affect all tests for your linked model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", - "\n", - "context = \"\"\"\n", - "FORMAT FOR THE LLM DESCRIPTIONS: \n", - " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", - " extracted from the test description>.\n", - "\n", - " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", - " Include any relevant formulas or methodologies mentioned in the test description.>\n", - "\n", - " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", - " highlighting what makes it particularly useful for specific scenarios.>\n", - "\n", - " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", - " Include both technical limitations and interpretation challenges. 
\n", - " If the test description includes specific signs of high risk, incorporate these here.>\n", - "\n", - " **Key Insights:**\n", - "\n", - " The test results reveal:\n", - "\n", - " - **<insight title>**: <comprehensive description of one aspect of the results>\n", - " - **<insight title>**: <comprehensive description of another aspect>\n", - " ...\n", - "\n", - " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", - " purpose and provides any final recommendations or considerations.>\n", - "\n", - "ADDITIONAL INSTRUCTIONS:\n", - " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", - "\n", - " For each metric in the test results, include in the test overview:\n", - " - The metric's purpose and what it measures\n", - " - Its mathematical formula\n", - " - The range of possible values\n", - " - What constitutes good/bad performance\n", - " - How to interpret different values\n", - "\n", - " Each insight should progressively cover:\n", - " 1. Overall scope and distribution\n", - " 2. Complete breakdown of all elements with specific values\n", - " 3. Natural groupings and patterns\n", - " 4. Comparative analysis between datasets/categories\n", - " 5. Stability and variations\n", - " 6. 
Notable relationships or dependencies\n", - "\n", - " Remember:\n", - " - Keep all insights at the same level (no sub-bullets or nested structures)\n", - " - Make each insight complete and self-contained\n", - " - Include specific numerical values and ranges\n", - " - Cover all elements in the results comprehensively\n", - " - Maintain clear, concise language\n", - " - Use only \"- **Title**: Description\" format for insights\n", - " - Progress naturally from general to specific observations\n", - "\n", - "\"\"\".strip()\n", - "\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Define tests to run\n", - "\n", - "First, we'll specify all the tests we'd like to independently run in a dictionary called `test_config`, including information about the `params` and `inputs` that each test requires.\n", - "\n", - "- Note here that `inputs` and `input_grid` expect the `input_id` of the dataset or model as the value rather than the variable name we specified**.\n", - "- When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
(Example: `:raw_data` for tests run with our raw dataset.)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_config = {\n", - "\n", - " # Data validation tests run with raw dataset\n", - " 'validmind.data_validation.DatasetDescription:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'}\n", - " },\n", - " 'validmind.data_validation.DescriptiveStatistics:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'}\n", - " },\n", - " 'validmind.data_validation.MissingValues:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percentage_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.ClassImbalance:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percent_threshold': 10}\n", - " },\n", - " 'validmind.data_validation.Duplicates:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.HighCardinality:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {\n", - " 'num_threshold': 100,\n", - " 'percent_threshold': 0.1,\n", - " 'threshold_type': 'percent'\n", - " }\n", - " },\n", - " 'validmind.data_validation.Skewness:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'max_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.UniqueRows:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percent_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.TooManyZeroValues:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'max_percent_threshold': 0.03}\n", - " },\n", - " 'validmind.data_validation.IQROutliersTable:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'threshold': 5}\n", - " },\n", - "\n", - " # Data validation tests run with preprocessed dataset\n", - " 
'validmind.data_validation.DescriptiveStatistics:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TabularDescriptionTables:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.MissingValues:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'},\n", - " 'params': {'min_percentage_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.TabularNumericalHistograms:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TargetRateBarPlots:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'},\n", - " 'params': {'default_column': 'loan_status'}\n", - " },\n", - "\n", - " 'validmind.data_validation.WOEBinTable': {\n", - " 'input_grid': {'dataset': ['preprocess_dataset']},\n", - " 'params': {\n", - " 'breaks_adj': {\n", - " 'loan_amnt': [5000, 10000, 15000, 20000, 25000],\n", - " 'int_rate': [10, 15, 20],\n", - " 'annual_inc': [50000, 100000, 150000]\n", - " }\n", - " }\n", - " },\n", - " 'validmind.data_validation.WOEBinPlots': {\n", - " 'input_grid': {'dataset': ['preprocess_dataset']},\n", - " 'params': {\n", - " 'breaks_adj': {\n", - " 'loan_amnt': [5000, 10000, 15000, 20000, 25000],\n", - " 'int_rate': [10, 15, 20],\n", - " 'annual_inc': [50000, 100000, 150000]\n", - " }\n", - " }\n", - " },\n", - "\n", - " # Data validation tests run with training & testing datasets\n", - " 'validmind.data_validation.DescriptiveStatistics:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.TabularDescriptionTables:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " 
},\n", - " 'validmind.data_validation.ClassImbalance:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_percent_threshold': 10}\n", - " },\n", - " 'validmind.data_validation.UniqueRows:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_percent_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.TabularNumericalHistograms:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.MutualInformation:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_threshold': 0.01}\n", - " },\n", - " 'validmind.data_validation.PearsonCorrelationMatrix:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.HighPearsonCorrelation:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'max_threshold': 0.3, 'top_n_correlations': 10}\n", - " },\n", - " 'validmind.data_validation.ScoreBandDefaultRates:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset'], 'model': ['xgb_model']},\n", - " 'params': {'score_column': 'xgb_scores', 'score_bands': [504, 537, 570]}\n", - " },\n", - " 'validmind.data_validation.DatasetSplit:development_data': {\n", - " 'inputs': {'datasets': ['train_dataset', 'test_dataset']}\n", - " },\n", - "\n", - " # Model validation tests\n", - " 'validmind.model_validation.statsmodels.GINITable': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.ClassifierPerformance': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.TrainingTestDegradation:XGBoost': 
{\n", - " 'inputs': {\n", - " 'datasets': ['train_dataset', 'test_dataset'],\n", - " 'model': 'xgb_model'\n", - " },\n", - " 'params': {'max_threshold': 0.1}\n", - " },\n", - " 'validmind.model_validation.sklearn.ROCCurve': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.MinimumROCAUCScore': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", - " 'params': {'min_threshold': 0.5}\n", - " },\n", - " 'validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.statsmodels.CumulativePredictionProbabilities': {\n", - " 'input_grid': {'model': ['xgb_model'], 'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.model_validation.sklearn.PopulationStabilityIndex': {\n", - " 'inputs': {\n", - " 'datasets': ['train_dataset', 'test_dataset'],\n", - " 'model': 'xgb_model'\n", - " },\n", - " 'params': {'num_bins': 10, 'mode': 'fixed'}\n", - " },\n", - " 'validmind.model_validation.sklearn.ConfusionMatrix': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.MinimumAccuracy': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", - " 'params': {'min_threshold': 0.7}\n", - " },\n", - " 'validmind.model_validation.sklearn.MinimumF1Score': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", - " 'params': {'min_threshold': 0.5}\n", - " },\n", - " 'validmind.model_validation.sklearn.PrecisionRecallCurve': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.CalibrationCurve': 
{\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.ClassifierThresholdOptimization': {\n", - " 'inputs': {'dataset': 'train_dataset', 'model': 'xgb_model'},\n", - " 'params': {'target_recall': 0.8}\n", - " },\n", - " 'validmind.model_validation.statsmodels.ScorecardHistogram': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'score_column': 'xgb_scores'}\n", - " },\n", - " 'validmind.model_validation.sklearn.ScoreProbabilityAlignment': {\n", - " 'input_grid': {'dataset': ['train_dataset'], 'model': ['xgb_model']},\n", - " 'params': {'score_column': 'xgb_scores'}\n", - " },\n", - " 'validmind.model_validation.sklearn.WeakspotsDiagnosis': {\n", - " 'inputs': {'datasets': ['train_dataset', 'test_dataset'], 'model': 'xgb_model'}\n", - " },\n", - " 'validmind.model_validation.sklearn.OverfitDiagnosis': {\n", - " 'inputs': {'model': 'xgb_model', 'datasets': ['train_dataset', 'test_dataset']},\n", - " 'params': {'cut_off_threshold': 0.04}\n", - " },\n", - " 'validmind.model_validation.sklearn.RobustnessDiagnosis': {\n", - " 'inputs': {'datasets': ['train_dataset', 'test_dataset'], 'model': 'xgb_model'},\n", - " 'params': {\n", - " 'scaling_factor_std_dev_list': [0.1, 0.2, 0.3, 0.4, 0.5],\n", - " 'performance_decay_threshold': 0.05\n", - " }\n", - " },\n", - " 'validmind.model_validation.FeaturesAUC': {\n", - " 'input_grid': {'model': ['xgb_model'], 'dataset': ['train_dataset', 'test_dataset']}\n", - " }\n", - "}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Run defined tests\n", - "\n", - "Then, we'll define a utility wrapper around [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module in a function called `run_doc_tests`.\n", - "\n", - "- Every test result returned by the `run_test()` 
function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", - "- Our function requires information about the inputs to use on every test — which is why we specified these inputs above in `test_config`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def run_doc_tests(test_config):\n", - " for test_name, test_cfg in test_config.items():\n", - " print(test_name)\n", - " try:\n", - " # Collect available keyword arguments\n", - " kwargs = {\n", - " key: test_cfg[key]\n", - " for key in (\"params\", \"input_grid\", \"inputs\")\n", - " if key in test_cfg\n", - " }\n", - " kwargs[\"show\"] = False\n", - "\n", - " # Execute the test and log the results\n", - " vm.tests.run_test(test_name, **kwargs).log()\n", - "\n", - " except Exception as e:\n", - " print(f\"Error running test {test_name}: {e}\")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally, we can pass the input configuration to `run_doc_tests` and run the full suite of tests!\n", - "\n", - "The variable `full_suite` then holds the result of these tests:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = run_doc_tests(test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the outputs returned indicating that certain test-driven blocks don't currently exist in your documentation for this particular test ID. 
</b></span>\n", - "<br></br>\n", - "That's expected, as when we run individual tests not defined by the documentation template out-of-the-box, the results logged need to be manually added to your documentation within the ValidMind Platform.</div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way, use the ValidMind Platform to work with your documentation.\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - " What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready.\n", - "\n", - "3. Expand the following section to review tests automatically inserted into your documentation template: **2.3. Feature Selection and Engineering**\n", - "\n", - "<a id='toc7_1_1__'></a>\n", - "\n", - "#### Add individual test results to documentation\n", - "\n", - "Let's also add our additional test results into the documentation. These were results sent by individual tests not defined out-of-the-box by our template. For example (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)):\n", - "\n", - "1. Locate the Data Preparation section of your documentation and click on **2.2. 
Correlations and Interactions** to expand that section.\n", - "\n", - "4. Hover under the Pearson Correlation Matrix content block until a horizontal dashed line with a **+** button appears, indicating that you can insert a new block.\n", - "\n", - " <img src= \"../../tutorials/development/add-content-block.gif\" alt=\"Screenshot showing insert block button in model documentation\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", - " <br><br>\n", - "\n", - "5. Click **+** and then select **Test-Driven Block** under FROM LIBRARY:\n", - "\n", - " - Click on **VM Library** under TEST-DRIVEN in the left sidebar.\n", - " - In the search bar, type in `HighPearsonCorrelation`.\n", - " - Select `HighPearsonCorrelation:development_data` as the test.\n", - "\n", - "6. Finally, click **Insert 1 Test Result to Document** to add the test result to the documentation.\n", - "\n", - " Confirm that the individual results for the high correlation test has been correctly inserted into section **2.3. Correlations and Interactions** of the documentation.\n", - "\n", - "<a id='toc7_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-9a4dd2ee254f496292698e9be3d8f799", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document an Excel-based application scorecard model\n", + "\n", + "Build and document an Excel-based application scorecard model with the ValidMind Library. Learn how to load an Excel-based model, prepare your datasets and model for testing, run tests and log those test results to the ValidMind Platform.\n", + "\n", + "An *application scorecard model* is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant such as credit history, income, employment status, and other relevant financial data.\n", + "\n", + " - This score assists lenders in making informed decisions about whether to approve or reject loan applications, as well as in determining the terms of the loan, including interest rates and credit limits.\n", + " - Effective validation of application scorecard models ensures that lenders can manage risk efficiently while maintaining a fast and transparent loan application process for applicants." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the documentation template](#toc2_4__) \n", + "- [Loading the sample datasets](#toc3__) \n", + " - [Load the raw dataset](#toc3_1__) \n", + " - [Load the preprocessed dataset](#toc3_2__) \n", + " - [Load the training and test datasets](#toc3_3__) \n", + "- [Initialize the ValidMind datasets](#toc4__) \n", + "- [Initialize the ValidMind model](#toc5__) \n", + " - [Link predictions](#toc5_1__) \n", + "- [Running tests](#toc6__) \n", + " - [Enable custom context for test descriptions](#toc6_1__) \n", + " - [Define tests to run](#toc6_2__) \n", + " - [Run defined tests](#toc6_3__) \n", + "- [Next steps](#toc7__) \n", + " - [Work with your documentation](#toc7_1__) \n", + " - [Add individual test results to documentation](#toc7_1_1__) \n", + " - [Discover more learning resources](#toc7_2__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses.
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into sections and sub-sections, and functions as a test suite specifying which tests should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds.
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2); border-radius: 5px;\">\n", + " <span style=\"color: #083E44;\"><b>Recommended Python versions</b></span><br />\n", + " Python 3.8 ≤ x ≤ 3.11\n", + "</div>\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", + "\n", + "- Install **OpenPyXL** (`openpyxl`), which allows us to read and write `.xlsx` files.\n", + "- Import `pandas`, a Python library for data manipulation and analysis, under the alias `pd`.\n", + "- Enable `matplotlib`, a plotting library used for visualizing data, so that any plots you generate render inline in the notebook output rather than opening in a separate window." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install openpyxl\n", + "\n", + "import pandas as pd\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on.
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Loading the sample datasets\n", + "\n", + "Let's import our sample dataset in the form of an Excel workbook ([CreditRiskData.xlsx](CreditRiskData.xlsx)) with four sheets indexed 0 to 3, each representing a different stage of data preparation:\n", + "\n", + "0. **Raw Data** – The original unprocessed dataset.\n", + "1. **Preprocessed Data** – A cleaned and prepared version of the raw data.\n", + "2. **Train Data** – A training subset used to fit your model.\n", + "3. **Test Data** – A testing subset used to evaluate model performance."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Load the raw dataset\n", + "\n", + "We'll start by loading the **Raw Data** sheet (index `0`) into a [pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "df = pd.read_excel('CreditRiskData.xlsx', sheet_name=0, engine='openpyxl')\n", + "\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Load the preprocessed dataset\n", + "\n", + "Next, load the **Preprocessed Data** sheet (index `1`), containing cleaned inputs ready for scoring:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=1, engine='openpyxl')\n", + "preprocess_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Load the training and test datasets\n", + "\n", + "Finally, load the split training (**Train Data**, index `2`) and testing (**Test Data**, index `3`) sets:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=2, engine='openpyxl')\n", + "test_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=3, engine='openpyxl')" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests with your loaded datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`:** The input DataFrame to test.\n", + "- **`input_id`:** A unique identifier for tracking test inputs.\n", + "- **`target_column`:** Required for tests that compare predictions to actual outcomes; specify the name of the column with the true values." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the raw dataset\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column='loan_status',\n", + ")\n", + "\n", + "# Initialize the preprocessed dataset\n", + "vm_preprocess_dataset = vm.init_dataset(\n", + " dataset=preprocess_df,\n", + " input_id=\"preprocess_dataset\",\n", + " target_column='loan_status',\n", + ")\n", + "\n", + "# Initialize the training dataset\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column='loan_status',\n", + ")\n", + "\n", + "# Initialize the testing dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column='loan_status',\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Initialize the ValidMind model\n", + "\n", + "In this Excel-based use case, predictions are precomputed and included in the Excel file. 
While there's no model logic to run, a ValidMind model object (`xgb_model`) is still required for passing to other functions for analysis and tests on the data.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Prediction logic placeholder\n", + "def dummy(X, **kwargs):\n", + " return None\n", + "\n", + "xgb_model = vm.init_model(\n", + " input_id=\"xgb_model\",\n", + " predict_fn=dummy\n", + " )" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Link predictions\n", + "\n", + "Once the model object has been initialized, you can assign model predictions to the training and testing datasets.\n", + "\n", + "Use the [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object to link the prediction values and probabilities from the relevant columns in our Excel workbook to the training and testing datasets:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=xgb_model, prediction_column=\"xgb_model_prediction\", probability_column='xgb_model_probabilities')\n", + "vm_test_ds.assign_predictions(model=xgb_model, 
prediction_column=\"xgb_model_prediction\", probability_column='xgb_model_probabilities')" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Running tests\n", + "\n", + "This is where it all comes together: we'll use our previously initialized datasets as inputs to run tests, then log the results to the ValidMind Platform.\n", + "\n", + "We'll run some tests that are defined out-of-the-box by the template we previewed earlier in this notebook, as well as some additional tests for more evidence. For the example in this section, we've selected and defined the tests for you.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about navigating ValidMind tests?</b></span>\n", + "<br></br>\n", + "Refer to our notebook outlining the utilities available for viewing and understanding available ValidMind tests: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Enable custom context for test descriptions\n", + "\n", + "When you run ValidMind tests, test descriptions are automatically generated by an LLM using the test results, the test name, and the static test definitions provided in the test’s docstring. 
While this metadata offers valuable high-level overviews of tests, insights produced by the LLM-based descriptions may not always align with your specific use cases or incorporate organizational policy requirements.\n", + "\n", + "Before we run our tests, we'll include some custom use case context to improve the clarity, structure, and interpretability of the test descriptions returned. By default, custom context for LLM-generated descriptions is disabled, meaning that the output will not include any additional context. To enable custom use case context, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`.\n", + "\n", + "This is a global setting that will affect all tests for your linked model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", + "\n", + "context = \"\"\"\n", + "FORMAT FOR THE LLM DESCRIPTIONS: \n", + " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", + " extracted from the test description>.\n", + "\n", + " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", + " Include any relevant formulas or methodologies mentioned in the test description.>\n", + "\n", + " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", + " highlighting what makes it particularly useful for specific scenarios.>\n", + "\n", + " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", + " Include both technical limitations and interpretation challenges. 
\n", + " If the test description includes specific signs of high risk, incorporate these here.>\n", + "\n", + " **Key Insights:**\n", + "\n", + " The test results reveal:\n", + "\n", + " - **<insight title>**: <comprehensive description of one aspect of the results>\n", + " - **<insight title>**: <comprehensive description of another aspect>\n", + " ...\n", + "\n", + " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", + " purpose and provides any final recommendations or considerations.>\n", + "\n", + "ADDITIONAL INSTRUCTIONS:\n", + " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", + "\n", + " For each metric in the test results, include in the test overview:\n", + " - The metric's purpose and what it measures\n", + " - Its mathematical formula\n", + " - The range of possible values\n", + " - What constitutes good/bad performance\n", + " - How to interpret different values\n", + "\n", + " Each insight should progressively cover:\n", + " 1. Overall scope and distribution\n", + " 2. Complete breakdown of all elements with specific values\n", + " 3. Natural groupings and patterns\n", + " 4. Comparative analysis between datasets/categories\n", + " 5. Stability and variations\n", + " 6. 
Notable relationships or dependencies\n", + "\n", + " Remember:\n", + " - Keep all insights at the same level (no sub-bullets or nested structures)\n", + " - Make each insight complete and self-contained\n", + " - Include specific numerical values and ranges\n", + " - Cover all elements in the results comprehensively\n", + " - Maintain clear, concise language\n", + " - Use only \"- **Title**: Description\" format for insights\n", + " - Progress naturally from general to specific observations\n", + "\n", + "\"\"\".strip()\n", + "\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Define tests to run\n", + "\n", + "First, we'll specify all the tests we'd like to independently run in a dictionary called `test_config`, including information about the `params` and `inputs` that each test requires.\n", + "\n", + "- Note here that `inputs` and `input_grid` expect the `input_id` of the dataset or model as the value rather than the variable name we specified.\n", + "- When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
(Example: `:raw_data` for tests run with our raw dataset.)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_config = {\n", + "\n", + " # Data validation tests run with raw dataset\n", + " 'validmind.data_validation.DatasetDescription:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'}\n", + " },\n", + " 'validmind.data_validation.DescriptiveStatistics:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'}\n", + " },\n", + " 'validmind.data_validation.MissingValues:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percentage_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.ClassImbalance:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percent_threshold': 10}\n", + " },\n", + " 'validmind.data_validation.Duplicates:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.HighCardinality:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {\n", + " 'num_threshold': 100,\n", + " 'percent_threshold': 0.1,\n", + " 'threshold_type': 'percent'\n", + " }\n", + " },\n", + " 'validmind.data_validation.Skewness:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'max_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.UniqueRows:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percent_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TooManyZeroValues:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'max_percent_threshold': 0.03}\n", + " },\n", + " 'validmind.data_validation.IQROutliersTable:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'threshold': 5}\n", + " },\n", + "\n", + " # Data validation tests run with preprocessed dataset\n", + " 'validmind.data_validation.DescriptiveStatistics:preprocessed_data': {\n", 
+ " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TabularDescriptionTables:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.MissingValues:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'},\n", + " 'params': {'min_percentage_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TabularNumericalHistograms:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TargetRateBarPlots:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'},\n", + " 'params': {'default_column': 'loan_status'}\n", + " },\n", + "\n", + " 'validmind.data_validation.WOEBinTable': {\n", + " 'input_grid': {'dataset': ['preprocess_dataset']},\n", + " 'params': {\n", + " 'breaks_adj': {\n", + " 'loan_amnt': [5000, 10000, 15000, 20000, 25000],\n", + " 'int_rate': [10, 15, 20],\n", + " 'annual_inc': [50000, 100000, 150000]\n", + " }\n", + " }\n", + " },\n", + " 'validmind.data_validation.WOEBinPlots': {\n", + " 'input_grid': {'dataset': ['preprocess_dataset']},\n", + " 'params': {\n", + " 'breaks_adj': {\n", + " 'loan_amnt': [5000, 10000, 15000, 20000, 25000],\n", + " 'int_rate': [10, 15, 20],\n", + " 'annual_inc': [50000, 100000, 150000]\n", + " }\n", + " }\n", + " },\n", + "\n", + " # Data validation tests run with training & testing datasets\n", + " 'validmind.data_validation.DescriptiveStatistics:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.TabularDescriptionTables:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.ClassImbalance:development_data': {\n", + 
" 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_percent_threshold': 10}\n", + " },\n", + " 'validmind.data_validation.UniqueRows:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_percent_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TabularNumericalHistograms:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.MutualInformation:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_threshold': 0.01}\n", + " },\n", + " 'validmind.data_validation.PearsonCorrelationMatrix:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.HighPearsonCorrelation:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'max_threshold': 0.3, 'top_n_correlations': 10}\n", + " },\n", + " 'validmind.data_validation.ScoreBandDefaultRates:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset'], 'model': ['xgb_model']},\n", + " 'params': {'score_column': 'xgb_scores', 'score_bands': [504, 537, 570]}\n", + " },\n", + " 'validmind.data_validation.DatasetSplit:development_data': {\n", + " 'inputs': {'datasets': ['train_dataset', 'test_dataset']}\n", + " },\n", + "\n", + " # Model validation tests\n", + " 'validmind.model_validation.statsmodels.GINITable': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.ClassifierPerformance': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.TrainingTestDegradation:XGBoost': {\n", + " 'inputs': {\n", + " 'datasets': ['train_dataset', 'test_dataset'],\n", 
+ " 'model': 'xgb_model'\n", + " },\n", + " 'params': {'max_threshold': 0.1}\n", + " },\n", + " 'validmind.model_validation.sklearn.ROCCurve': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.MinimumROCAUCScore': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", + " 'params': {'min_threshold': 0.5}\n", + " },\n", + " 'validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.statsmodels.CumulativePredictionProbabilities': {\n", + " 'input_grid': {'model': ['xgb_model'], 'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.model_validation.sklearn.PopulationStabilityIndex': {\n", + " 'inputs': {\n", + " 'datasets': ['train_dataset', 'test_dataset'],\n", + " 'model': 'xgb_model'\n", + " },\n", + " 'params': {'num_bins': 10, 'mode': 'fixed'}\n", + " },\n", + " 'validmind.model_validation.sklearn.ConfusionMatrix': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.MinimumAccuracy': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", + " 'params': {'min_threshold': 0.7}\n", + " },\n", + " 'validmind.model_validation.sklearn.MinimumF1Score': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", + " 'params': {'min_threshold': 0.5}\n", + " },\n", + " 'validmind.model_validation.sklearn.PrecisionRecallCurve': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.CalibrationCurve': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': 
['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.ClassifierThresholdOptimization': {\n", + " 'inputs': {'dataset': 'train_dataset', 'model': 'xgb_model'},\n", + " 'params': {'target_recall': 0.8}\n", + " },\n", + " 'validmind.model_validation.statsmodels.ScorecardHistogram': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'score_column': 'xgb_scores'}\n", + " },\n", + " 'validmind.model_validation.sklearn.ScoreProbabilityAlignment': {\n", + " 'input_grid': {'dataset': ['train_dataset'], 'model': ['xgb_model']},\n", + " 'params': {'score_column': 'xgb_scores'}\n", + " },\n", + " 'validmind.model_validation.sklearn.WeakspotsDiagnosis': {\n", + " 'inputs': {'datasets': ['train_dataset', 'test_dataset'], 'model': 'xgb_model'}\n", + " },\n", + " 'validmind.model_validation.sklearn.OverfitDiagnosis': {\n", + " 'inputs': {'model': 'xgb_model', 'datasets': ['train_dataset', 'test_dataset']},\n", + " 'params': {'cut_off_threshold': 0.04}\n", + " },\n", + " 'validmind.model_validation.sklearn.RobustnessDiagnosis': {\n", + " 'inputs': {'datasets': ['train_dataset', 'test_dataset'], 'model': 'xgb_model'},\n", + " 'params': {\n", + " 'scaling_factor_std_dev_list': [0.1, 0.2, 0.3, 0.4, 0.5],\n", + " 'performance_decay_threshold': 0.05\n", + " }\n", + " },\n", + " 'validmind.model_validation.FeaturesAUC': {\n", + " 'input_grid': {'model': ['xgb_model'], 'dataset': ['train_dataset', 'test_dataset']}\n", + " }\n", + "}" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Run defined tests\n", + "\n", + "Then, we'll define a utility wrapper around [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module in a function called `run_doc_tests`.\n", + "\n", + "- Every test result returned by the `run_test()` function has a [`.log()` 
method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", + "- Our function requires information about the inputs to use on every test, which is why we specified these inputs above in `test_config`." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def run_doc_tests(test_config):\n", + " for test_name, test_cfg in test_config.items():\n", + " print(test_name)\n", + " try:\n", + " # Collect available keyword arguments\n", + " kwargs = {\n", + " key: test_cfg[key]\n", + " for key in (\"params\", \"input_grid\", \"inputs\")\n", + " if key in test_cfg\n", + " }\n", + " kwargs[\"show\"] = False\n", + "\n", + " # Execute the test and log the results\n", + " vm.tests.run_test(test_name, **kwargs).log()\n", + "\n", + " except Exception as e:\n", + " print(f\"Error running test {test_name}: {e}\")\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, we can pass the input configuration to `run_doc_tests` and run the full suite of tests!\n", + "\n", + "Each result is logged to the ValidMind Platform as its test completes; `run_doc_tests` itself returns nothing:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_doc_tests(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the outputs returned indicating that certain test-driven blocks don't currently exist in your documentation for this particular test ID. 
</b></span>\n", + "<br></br>\n", + "That's expected: when we run individual tests that the documentation template doesn't define out-of-the-box, the logged results need to be added manually to your documentation within the ValidMind Platform.</div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way: use the ValidMind Platform to work with your documentation.\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + " What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready.\n", + "\n", + "3. Expand the following section to review tests automatically inserted into your documentation template: **2.3. Feature Selection and Engineering**\n", + "\n", + "<a id='toc7_1_1__'></a>\n", + "\n", + "#### Add individual test results to documentation\n", + "\n", + "Let's also add our additional test results into the documentation. These are the results logged by individual tests not defined out-of-the-box by our template. For example (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)):\n", + "\n", + "1. Locate the Data Preparation section of your documentation and click on **2.2. 
Correlations and Interactions** to expand that section.\n", + "\n", + "2. Hover under the Pearson Correlation Matrix content block until a horizontal dashed line with a **+** button appears, indicating that you can insert a new block.\n", + "\n", + " <img src=\"../../tutorials/development/add-content-block.gif\" alt=\"Screenshot showing insert block button in model documentation\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", + " <br><br>\n", + "\n", + "3. Click **+** and then select **Test-Driven Block** under FROM LIBRARY:\n", + "\n", + " - Click on **VM Library** under TEST-DRIVEN in the left sidebar.\n", + " - In the search bar, type in `HighPearsonCorrelation`.\n", + " - Select `HighPearsonCorrelation:development_data` as the test.\n", + "\n", + "4. Finally, click **Insert 1 Test Result to Document** to add the test result to the documentation.\n", + "\n", + " Confirm that the individual results for the high correlation test have been correctly inserted into section **2.2. Correlations and Interactions** of the documentation.\n", + "\n", + "<a id='toc7_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for the changes to take effect." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-9a4dd2ee254f496292698e9be3d8f799" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 4 } diff --git a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb index 7e2fe0741..a7f7f2256 100644 --- a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb +++ b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb @@ -1,560 +1,566 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Prompt validation for large language models (LLMs)\n", - "\n", - "Run and document prompt validation tests for a large language model (LLM) specialized in sentiment analysis for financial news. \n", - "\n", - "This interactive notebook shows you how to set up the ValidMind Library, initialize the library, and use a specific prompt template for analyzing the sentiment of given sentences. Prompt validation covers the initialization of a test dataset and the creation of a foundational model using the ValidMind Library, followed by the execution of a test suite specifically designed for prompt validation. The notebook also includes example data to test the model's ability to correctly identify sentiment as positive, negative, or neutral." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Get ready to run the analysis](#toc3__) \n", - "- [Get your sample dataset ready for analysis](#toc4__) \n", - "- [Perform the prompt validation](#toc5__) \n", - "- [Next steps](#toc6__) \n", - " - [Work with your model documentation](#toc6_1__) \n", - " - [Discover more learning resources](#toc6_2__) \n", - "- [Upgrade ValidMind](#toc7__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. 
These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `LLM-based Text Classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Get ready to run the analysis\n", - "\n", - "Import the ValidMind `FoundationModel` and `Prompt` classes needed for the sentiment analysis later on:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.models import FoundationModel, Prompt" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Check your access to the OpenAI API:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "\n", - "import dotenv\n", - "\n", - "dotenv.load_dotenv()\n", - "\n", - "if os.getenv(\"OPENAI_API_KEY\") is None:\n", - " raise Exception(\"OPENAI_API_KEY not found\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from openai import OpenAI\n", - "\n", - "model = OpenAI()\n", - "\n", - "\n", - "def call_model(prompt):\n", - " return (\n", - " model.chat.completions.create(\n", - " model=\"gpt-3.5-turbo\",\n", - " messages=[\n", - " {\"role\": \"user\", \"content\": prompt},\n", - " ],\n", - " )\n", - " .choices[0]\n", - " .message.content\n", - " )" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Set the prompt guidelines for the sentiment analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "prompt_template = \"\"\"\n", - "You are an AI with expertise in 
sentiment analysis, particularly in the context of financial news.\n", - "Your task is to analyze the sentiment of a specific sentence provided below.\n", - "Before proceeding, take a moment to understand the context and nuances of the financial terminology used in the sentence.\n", - "\n", - "Sentence to Analyze:\n", - "```\n", - "{Sentence}\n", - "```\n", - "\n", - "Please respond with the sentiment of the sentence denoted by one of either 'positive', 'negative', or 'neutral'.\n", - "Please respond only with the sentiment enum value. Do not include any other text in your response.\n", - "\n", - "Note: Ensure that your analysis is based on the content of the sentence and not on external information or assumptions.\n", - "\"\"\".strip()\n", - "\n", - "prompt_variables = [\"Sentence\"]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Get your sample dataset ready for analysis\n", - "\n", - "To perform the sentiment analysis for financial news we're going to load a local copy of this dataset: https://www.kaggle.com/datasets/ankurzing/sentiment-analysis-for-financial-news.\n", - "\n", - "This dataset contains two columns, `Sentiment` and `Sentence`. The sentiment can be `negative`, `neutral` or `positive`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "df = pd.read_csv(\"./datasets/sentiments.csv\")\n", - "\n", - "df_test = df[:10].reset_index(drop=True)\n", - "df_test" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Perform the prompt validation\n", - "\n", - "First, use the ValidMind Library to initialize the dataset and model objects necessary for documentation. 
The ValidMind `predict_fn` function allows the model to be tested and evaluated in a standardized manner:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_ds = vm.init_dataset(\n", - " dataset=df_test,\n", - " input_id=\"test_dataset\",\n", - " text_column=\"Sentence\",\n", - " target_column=\"Sentiment\",\n", - ")\n", - "\n", - "vm_model = vm.init_model(\n", - " model=FoundationModel(\n", - " predict_fn=call_model,\n", - " prompt=Prompt(\n", - " template=prompt_template,\n", - " variables=prompt_variables,\n", - " ),\n", - " ),\n", - " input_id=\"gpt_35_model\",\n", - ")\n", - "\n", - "# Assign model predictions to the test dataset\n", - "vm_test_ds.assign_predictions(vm_model)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, use the ValidMind Library to run validation tests on the model. These tests evaluate various aspects of the prompts, including bias, clarity, conciseness, delimitation, negative instruction, and specificity.\n", - "\n", - "Each test is explained in detail, highlighting its purpose, test mechanism, and the importance of the specific aspect being evaluated. The tests are graded on a scale from 1 to 10, with a predetermined threshold, and the explanations for each test include a score, threshold, and a pass/fail determination." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_suite_results = vm.run_test_suite(\n", - " \"prompt_validation\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " \"model\": vm_model,\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Here, most of the tests pass but the test for _conciseness_ needs further attention, as it fails the threshold. 
This test is designed to evaluate the brevity and succinctness of prompts provided to a large language model (LLM).\n", - "\n", - "The test matters, because a concise prompt strikes a balance between offering clear instructions and eliminating redundant or unnecessary information, ensuring that the LLM receives relevant input without being overwhelmed." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc6_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (**Learn more:** [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. Click and expand the **Model Development** section.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc6_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade 
package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-da0317263ddc4a119cb7b306ac1b39c1", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": ".venv", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Prompt validation for large language models (LLMs)\n", + "\n", + "Run and document prompt validation tests for a large language model (LLM) specialized in sentiment analysis for financial news. \n", + "\n", + "This interactive notebook shows you how to set up the ValidMind Library, initialize the library, and use a specific prompt template for analyzing the sentiment of given sentences. Prompt validation covers the initialization of a test dataset and the creation of a foundational model using the ValidMind Library, followed by the execution of a test suite specifically designed for prompt validation. The notebook also includes example data to test the model's ability to correctly identify sentiment as positive, negative, or neutral." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Get ready to run the analysis](#toc3__) \n", + "- [Get your sample dataset ready for analysis](#toc4__) \n", + "- [Perform the prompt validation](#toc5__) \n", + "- [Next steps](#toc6__) \n", + " - [Work with your documentation](#toc6_1__) \n", + " - [Discover more learning resources](#toc6_2__) \n", + "- [Upgrade ValidMind](#toc7__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `LLM-based Text Classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Get ready to run the analysis\n", + "\n", + "Import the ValidMind `FoundationModel` and `Prompt` classes needed for the sentiment analysis later on:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.models import FoundationModel, Prompt" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Check your access to the OpenAI API:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "\n", + "import dotenv\n", + "\n", + "dotenv.load_dotenv()\n", + "\n", + "if os.getenv(\"OPENAI_API_KEY\") is None:\n", + " raise Exception(\"OPENAI_API_KEY not found\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from openai import OpenAI\n", + "\n", + "model = OpenAI()\n", + "\n", + "\n", + "def call_model(prompt):\n", + " return (\n", + " model.chat.completions.create(\n", + " model=\"gpt-3.5-turbo\",\n", + " messages=[\n", + " {\"role\": \"user\", \"content\": prompt},\n", + " ],\n", + " )\n", + " .choices[0]\n", + " .message.content\n", + " )" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Set the prompt guidelines for the sentiment analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "prompt_template = \"\"\"\n", + "You are an AI with expertise in sentiment analysis, particularly in the 
context of financial news.\n", + "Your task is to analyze the sentiment of a specific sentence provided below.\n", + "Before proceeding, take a moment to understand the context and nuances of the financial terminology used in the sentence.\n", + "\n", + "Sentence to Analyze:\n", + "```\n", + "{Sentence}\n", + "```\n", + "\n", + "Please respond with the sentiment of the sentence denoted by one of either 'positive', 'negative', or 'neutral'.\n", + "Please respond only with the sentiment enum value. Do not include any other text in your response.\n", + "\n", + "Note: Ensure that your analysis is based on the content of the sentence and not on external information or assumptions.\n", + "\"\"\".strip()\n", + "\n", + "prompt_variables = [\"Sentence\"]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Get your sample dataset ready for analysis\n", + "\n", + "To perform the sentiment analysis for financial news we're going to load a local copy of this dataset: https://www.kaggle.com/datasets/ankurzing/sentiment-analysis-for-financial-news.\n", + "\n", + "This dataset contains two columns, `Sentiment` and `Sentence`. The sentiment can be `negative`, `neutral` or `positive`." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "df = pd.read_csv(\"./datasets/sentiments.csv\")\n", + "\n", + "df_test = df[:10].reset_index(drop=True)\n", + "df_test" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Perform the prompt validation\n", + "\n", + "First, use the ValidMind Library to initialize the dataset and model objects necessary for documentation. 
The ValidMind `predict_fn` function allows the model to be tested and evaluated in a standardized manner:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_ds = vm.init_dataset(\n", + " dataset=df_test,\n", + " input_id=\"test_dataset\",\n", + " text_column=\"Sentence\",\n", + " target_column=\"Sentiment\",\n", + ")\n", + "\n", + "vm_model = vm.init_model(\n", + " model=FoundationModel(\n", + " predict_fn=call_model,\n", + " prompt=Prompt(\n", + " template=prompt_template,\n", + " variables=prompt_variables,\n", + " ),\n", + " ),\n", + " input_id=\"gpt_35_model\",\n", + ")\n", + "\n", + "# Assign model predictions to the test dataset\n", + "vm_test_ds.assign_predictions(vm_model)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, use the ValidMind Library to run validation tests on the model. These tests evaluate various aspects of the prompts, including bias, clarity, conciseness, delimitation, negative instruction, and specificity.\n", + "\n", + "Each test is explained in detail, highlighting its purpose, test mechanism, and the importance of the specific aspect being evaluated. The tests are graded on a scale from 1 to 10, with a predetermined threshold, and the explanations for each test include a score, threshold, and a pass/fail determination." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_suite_results = vm.run_test_suite(\n", + " \"prompt_validation\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " \"model\": vm_model,\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here, most of the tests pass but the test for _conciseness_ needs further attention, as it fails the threshold. 
This test is designed to evaluate the brevity and succinctness of prompts provided to a large language model (LLM).\n", + "\n", + "The test matters because a concise prompt strikes a balance between offering clear instructions and eliminating redundant or unnecessary information, ensuring that the LLM receives relevant input without being overwhelmed." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way: use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc6_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (**Learn more:** [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. Click and expand the **Model Development** section.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc6_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade 
package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-da0317263ddc4a119cb7b306ac1b39c1" + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb index 29e07e6e9..5aa9cee23 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb @@ -1,761 +1,765 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document a time series forecasting model\n", - "\n", - "Use the [FRED](https://fred.stlouisfed.org/) sample dataset to train a simple time series model and document that model with the ValidMind Library.\n", - "\n", - "As part of the notebook, you will learn how to train a simple model while exploring how the documentation process works:\n", - "\n", - "- Initializing the ValidMind Library\n", - "- Loading a sample dataset provided by the library to train a simple time series model\n", - "- Running a ValidMind test suite to quickly generate documentation about the data and model" - ] - }, - { - "cell_type": "markdown", 
- "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Load the sample dataset](#toc3__) \n", - "- [Document the model](#toc4__) \n", - " - [Prepocess the raw dataset](#toc4_1__) \n", - " - [Train random forests and gradient boosting regressor models](#toc4_2__) \n", - " - [Initialize the ValidMind datasets](#toc4_3__) \n", - " - [Initialize the ValidMind models](#toc4_4__) \n", - " - [Assign predictions to the datasets](#toc4_5__) \n", - " - [Run the full suite of tests](#toc4_6__) \n", - "- [Next steps](#toc5__) \n", - " - [Work with your documentation](#toc5_1__) \n", - " - [Discover more learning resources](#toc5_2__) \n", - "- [Upgrade ValidMind](#toc6__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.ensemble import RandomForestRegressor\n", - "from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor\n", - "from sklearn.metrics import mean_squared_error, r2_score\n", - "from sklearn.model_selection import train_test_split\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.regression import fred_timeseries \n", - "\n", - "target_column = fred_timeseries.target_column\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{target_column}'\"\n", - ")\n", - "\n", - "raw_df = fred_timeseries.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Prepocess the raw dataset\n", - "\n", - "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", - "- **Split the dataset**: Divide the original dataset into training and test sets for the primary model with an 80/20 split, without shuffling.\n", - "- **Difference the data**: Calculate the first difference of the train and test datasets to remove trends and seasonality, then drop any resulting NaN values.\n", - "- **Extract features and target variables**: Separate the feature columns (predictors) and the target variable from the differenced train and test datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the raw dataset into training and test sets \n", - "train_df, test_df = train_test_split(raw_df, test_size=0.2, shuffle=False)\n", - "\n", - "# Take the first difference of the training and test sets\n", - "train_diff_df = train_df.diff().dropna()\n", - "test_diff_df = test_df.diff().dropna()\n", - "\n", - "# Extract the features and target variable from the training set\n", - "X_diff_train = train_diff_df.drop(target_column, axis=1)\n", - "y_diff_train = train_diff_df[target_column]\n", - "\n", - "# Extract the features and target variable from the test set\n", - "X_diff_test = test_diff_df.drop(target_column, axis=1)\n", - "y_diff_test = test_diff_df[target_column]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Train random forests and gradient boosting regressor models\n", - "\n", - "This section trains random forest and gradient boosting models on differenced data, transforms predictions back to the original scale, and evaluates model performance using Mean Squared Error (MSE) and R-squared (R²) scores. 
\n", - "\n", - "The following helper functions are used to post-process predictions and evaluate model performance:\n", - "\n", - "- `transform_to_levels`: Reconstructs the original values from differenced predictions by cumulatively summing them, starting from a given initial value.\n", - "- `evaluate_model`: Calculates the Mean Squared Error (MSE) and R-squared (R²) score to evaluate the accuracy of the predictions against the true values." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def transform_to_levels(y_diff_pred, first_value=0): \n", - " y_pred = [first_value]\n", - " for pred in y_diff_pred:\n", - " y_pred.append(y_pred[-1] + pred)\n", - " return y_pred\n", - "\n", - "def evaluate_model(y_true, y_pred):\n", - " mse = mean_squared_error(y_true, y_pred)\n", - " r2 = r2_score(y_true, y_pred)\n", - " return mse, r2" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Fit the random forest model\n", - "model_rf = RandomForestRegressor(n_estimators=1500, random_state=0)\n", - "model_rf.fit(X_diff_train, y_diff_train)\n", - "\n", - "# Make predictions on the training and test sets\n", - "y_diff_train_pred = model_rf.predict(X_diff_train)\n", - "y_diff_test_pred = model_rf.predict(X_diff_test)\n", - "\n", - "# Transform the predictions back to the original scale\n", - "y_train_rf_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", - "y_test_rf_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", - "\n", - "# Evaluate the model's performance on the training and test sets\n", - "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_rf_pred)\n", - "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_rf_pred)\n", - "\n", - "print(f\"Train Mean Squared Error: {mse_train}\")\n", - "print(f\"Train R-Squared: {r2_train}\")\n", 
- "print(f\"Test Mean Squared Error: {mse_test}\")\n", - "print(f\"Test R-Squared: {r2_test}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Fit the gradient boost model\n", - "model_gb = GradientBoostingRegressor(n_estimators=1500, random_state=0)\n", - "model_gb.fit(X_diff_train, y_diff_train)\n", - "\n", - "# Make predictions on the training and test sets\n", - "y_diff_train_pred = model_gb.predict(X_diff_train)\n", - "y_diff_test_pred = model_gb.predict(X_diff_test)\n", - "\n", - "# Transform the predictions back to the original scale\n", - "y_train_gb_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", - "y_test_gb_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", - "\n", - "# Evaluate the model's performance on the training and test sets\n", - "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_gb_pred)\n", - "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_gb_pred)\n", - "\n", - "print(f\"Train Mean Squared Error: {mse_train}\")\n", - "print(f\"Train R-Squared: {r2_train}\")\n", - "print(f\"Test Mean Squared Error: {mse_test}\")\n", - "print(f\"Test R-Squared: {r2_test}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests 
require access to true values. This is the name of the target column in the dataset\n", - "\n", - "With all dataframes ready, you can now initialize the ValidMind datasets objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):\n", - "\n", - "- `vm_raw_ds`: contains the raw, unprocessed data with the specified target column.\n", - "- `vm_train_diff_ds`: contains the training data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", - "- `vm_test_diff_ds`: contains the test data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", - "- `vm_train_ds`: contains the training data, excluding the first row to align with the differenced data.\n", - "- `vm_test_ds`: includes the test data split from the raw dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_ds = vm.init_dataset(\n", - " input_id=\"raw_ds\",\n", - " dataset=raw_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_train_diff_ds = vm.init_dataset(\n", - " input_id=\"train_diff_ds\",\n", - " dataset=train_diff_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_test_diff_ds = vm.init_dataset(\n", - " input_id=\"test_diff_ds\",\n", - " dataset=test_diff_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_ds\",\n", - " dataset=train_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_ds\",\n", - " dataset=test_df,\n", - " target_column=target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### Initialize the ValidMind models\n", - "\n", - "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other 
functions for analysis and tests on the data for our models.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model_rf = vm.init_model(\n", - " model_rf,\n", - " input_id=\"random_forests_model\",\n", - ")\n", - "\n", - "vm_model_gb = vm.init_model(\n", - " model_gb,\n", - " input_id=\"gradient_boosting_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_5__'></a>\n", - "\n", - "### Assign predictions to the datasets\n", - "\n", - "We can now use the assign_predictions() method from the Dataset object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model_rf,\n", - " prediction_values=y_train_rf_pred,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model_rf,\n", - " prediction_values=y_test_rf_pred,\n", - ")\n", - "\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_model_gb,\n", - " prediction_values=y_train_gb_pred,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model_gb,\n", - " prediction_values=y_test_gb_pred,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_6__'></a>\n", - "\n", - "### Run the full suite of tests\n", - "\n", - "This is where it all comes together: you are now ready to run the documentation tests for the model as defined by the documentation template you looked at earlier.\n", - "\n", - "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the documentation and test artifacts that get generated to the ValidMind Platform.\n", - "\n", - "The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. 
The `config` parameter is a dictionary with the following structure:\n", - "\n", - "```python\n", - "config = {\n", - " \"<test-id>\": {\n", - " \"params\": {\n", - " \"param1\": \"value1\",\n", - " \"param2\": \"value2\",\n", - " ...\n", - " },\n", - " \"inputs\": {\n", - " \"input1\": \"value1\",\n", - " \"input2\": \"value2\",\n", - " ...\n", - " }\n", - " },\n", - " ...\n", - "}\n", - "```\n", - "\n", - "Each `<test-id>` above corresponds to the test driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The method `get_demo_test_config()` below constructs the default input configuration for our demo." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = fred_timeseries.get_demo_test_config()\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests. The variable `full_suite` then holds the result of these tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc5_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. 
(**Learn more:** [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. Click and expand the **Model Development** section.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc5_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-ab56373aa7ee4e15909017ab135ceaae", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document a time series forecasting model\n", + "\n", + "Use the [FRED](https://fred.stlouisfed.org/) sample dataset to train a simple time series model and document that model with the ValidMind Library.\n", + "\n", + "As part of the notebook, you will learn how to train a simple model while exploring how the documentation process works:\n", + "\n", + "- Initializing the ValidMind Library\n", + "- Loading a sample dataset provided by the library to train a simple time series model\n", + "- Running a ValidMind test suite to quickly generate documentation about the data and model" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the 
documentation template](#toc2_4__) \n", + "- [Load the sample dataset](#toc3__) \n", + "- [Document the model](#toc4__) \n", + " - [Preprocess the raw dataset](#toc4_1__) \n", + " - [Train random forests and gradient boosting regressor models](#toc4_2__) \n", + " - [Initialize the ValidMind datasets](#toc4_3__) \n", + " - [Initialize the ValidMind models](#toc4_4__) \n", + " - [Assign predictions to the datasets](#toc4_5__) \n", + " - [Run the full suite of tests](#toc4_6__) \n", + "- [Next steps](#toc5__) \n", + " - [Work with your documentation](#toc5_1__) \n", + " - [Discover more learning resources](#toc5_2__) \n", + "- [Upgrade ValidMind](#toc6__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.
\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics.
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.ensemble import RandomForestRegressor\n", + "from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor\n", + "from sklearn.metrics import mean_squared_error, r2_score\n", + "from sklearn.model_selection import train_test_split\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind Library. To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.regression import fred_timeseries \n", + "\n", + "target_column = fred_timeseries.target_column\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{target_column}'\"\n", + ")\n", + "\n", + "raw_df = fred_timeseries.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preprocess the raw dataset\n", + "\n", + "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", + "- **Split the dataset**: Divide the original dataset into training and test sets for the primary model with an 80/20 split, without shuffling.\n", + "- **Difference the data**: Calculate the first difference of the train and test datasets to remove trends and seasonality, then drop any resulting NaN values.\n", + "- **Extract features and target variables**: Separate the feature columns (predictors) and the target variable from the differenced train and test datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the raw dataset into training and test sets \n", + "train_df, test_df = train_test_split(raw_df, test_size=0.2, shuffle=False)\n", + "\n", + "# Take the first difference of the training and test sets\n", + "train_diff_df = train_df.diff().dropna()\n", + "test_diff_df = test_df.diff().dropna()\n", + "\n", + "# Extract the features and target variable from the training set\n", + "X_diff_train = train_diff_df.drop(target_column, axis=1)\n", + "y_diff_train = train_diff_df[target_column]\n", + "\n", + "# Extract the features and target variable from the test set\n", + "X_diff_test = test_diff_df.drop(target_column, axis=1)\n", + "y_diff_test = test_diff_df[target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Train random forests and gradient boosting regressor models\n", + "\n", + "This section trains random forest and gradient boosting models on differenced data, transforms predictions back to the original scale, and evaluates model performance using Mean Squared Error (MSE) and R-squared (R²) scores.
\n", + "\n", + "The following helper functions are used to post-process predictions and evaluate model performance:\n", + "\n", + "- `transform_to_levels`: Reconstructs the original values from differenced predictions by cumulatively summing them, starting from a given initial value.\n", + "- `evaluate_model`: Calculates the Mean Squared Error (MSE) and R-squared (R²) score to evaluate the accuracy of the predictions against the true values." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def transform_to_levels(y_diff_pred, first_value=0): \n", + " y_pred = [first_value]\n", + " for pred in y_diff_pred:\n", + " y_pred.append(y_pred[-1] + pred)\n", + " return y_pred\n", + "\n", + "def evaluate_model(y_true, y_pred):\n", + " mse = mean_squared_error(y_true, y_pred)\n", + " r2 = r2_score(y_true, y_pred)\n", + " return mse, r2" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Fit the random forest model\n", + "model_rf = RandomForestRegressor(n_estimators=1500, random_state=0)\n", + "model_rf.fit(X_diff_train, y_diff_train)\n", + "\n", + "# Make predictions on the training and test sets\n", + "y_diff_train_pred = model_rf.predict(X_diff_train)\n", + "y_diff_test_pred = model_rf.predict(X_diff_test)\n", + "\n", + "# Transform the predictions back to the original scale\n", + "y_train_rf_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", + "y_test_rf_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", + "\n", + "# Evaluate the model's performance on the training and test sets\n", + "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_rf_pred)\n", + "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_rf_pred)\n", + "\n", + "print(f\"Train Mean Squared Error: {mse_train}\")\n", + "print(f\"Train R-Squared: {r2_train}\")\n", + "print(f\"Test Mean Squared Error: 
{mse_test}\")\n", + "print(f\"Test R-Squared: {r2_test}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Fit the gradient boost model\n", + "model_gb = GradientBoostingRegressor(n_estimators=1500, random_state=0)\n", + "model_gb.fit(X_diff_train, y_diff_train)\n", + "\n", + "# Make predictions on the training and test sets\n", + "y_diff_train_pred = model_gb.predict(X_diff_train)\n", + "y_diff_test_pred = model_gb.predict(X_diff_test)\n", + "\n", + "# Transform the predictions back to the original scale\n", + "y_train_gb_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", + "y_test_gb_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", + "\n", + "# Evaluate the model's performance on the training and test sets\n", + "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_gb_pred)\n", + "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_gb_pred)\n", + "\n", + "print(f\"Train Mean Squared Error: {mse_train}\")\n", + "print(f\"Train R-Squared: {r2_train}\")\n", + "print(f\"Test Mean Squared Error: {mse_test}\")\n", + "print(f\"Test R-Squared: {r2_test}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests 
require access to true values. This is the name of the target column in the dataset\n", + "\n", + "With all dataframes ready, you can now initialize the ValidMind datasets objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):\n", + "\n", + "- `vm_raw_ds`: contains the raw, unprocessed data with the specified target column.\n", + "- `vm_train_diff_ds`: contains the training data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", + "- `vm_test_diff_ds`: contains the test data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", + "- `vm_train_ds`: contains the training data, excluding the first row to align with the differenced data.\n", + "- `vm_test_ds`: includes the test data split from the raw dataset." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_ds = vm.init_dataset(\n", + " input_id=\"raw_ds\",\n", + " dataset=raw_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_train_diff_ds = vm.init_dataset(\n", + " input_id=\"train_diff_ds\",\n", + " dataset=train_diff_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_test_diff_ds = vm.init_dataset(\n", + " input_id=\"test_diff_ds\",\n", + " dataset=test_diff_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_ds\",\n", + " dataset=train_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_ds\",\n", + " dataset=test_df,\n", + " target_column=target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### Initialize the ValidMind models\n", + "\n", + "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other 
functions for analysis and tests on the data for our models.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model_rf = vm.init_model(\n", + " model_rf,\n", + " input_id=\"random_forests_model\",\n", + ")\n", + "\n", + "vm_model_gb = vm.init_model(\n", + " model_gb,\n", + " input_id=\"gradient_boosting_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_5__'></a>\n", + "\n", + "### Assign predictions to the datasets\n", + "\n", + "We can now use the `assign_predictions()` method from the Dataset object to link existing predictions to any model.
If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model_rf,\n", + " prediction_values=y_train_rf_pred,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model_rf,\n", + " prediction_values=y_test_rf_pred,\n", + ")\n", + "\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_model_gb,\n", + " prediction_values=y_train_gb_pred,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model_gb,\n", + " prediction_values=y_test_gb_pred,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6__'></a>\n", + "\n", + "### Run the full suite of tests\n", + "\n", + "This is where it all comes together: you are now ready to run the documentation tests for the model as defined by the documentation template you looked at earlier.\n", + "\n", + "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the documentation and test artifacts that get generated to the ValidMind Platform.\n", + "\n", + "The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. 
The `config` parameter is a dictionary with the following structure:\n", + "\n", + "```python\n", + "config = {\n", + " \"<test-id>\": {\n", + " \"params\": {\n", + " \"param1\": \"value1\",\n", + " \"param2\": \"value2\",\n", + " ...\n", + " },\n", + " \"inputs\": {\n", + " \"input1\": \"value1\",\n", + " \"input2\": \"value2\",\n", + " ...\n", + " }\n", + " },\n", + " ...\n", + "}\n", + "```\n", + "\n", + "Each `<test-id>` above corresponds to one of the test-driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The method `get_demo_test_config()` below constructs the default input configuration for our demo." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = fred_timeseries.get_demo_test_config()\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests. The variable `full_suite` then holds the result of these tests." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.run_documentation_tests(config=test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way: use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc5_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier.
(**Learn more:** [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. Click and expand the **Model Development** section.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc5_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-ab56373aa7ee4e15909017ab135ceaae" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb index fec02e587..e261808c1 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb @@ -1,1019 +1,1023 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document a time series forecasting model\n", - "\n", - "Use the [FRED](https://fred.stlouisfed.org/) sample dataset to train a simple time series model and document that model with the ValidMind Library.\n", - "\n", - "As part of the notebook, you will learn how to train a simple model while exploring how the documentation process works:\n", - "\n", - "- Initializing the ValidMind Library\n", - "- Loading a sample dataset provided by the library to train a simple time series model\n", - "- Running a ValidMind test suite to quickly generate documentation about the data and model" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to 
ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Load the sample dataset](#toc3__) \n", - "- [Document the model](#toc4__) \n", - " - [Prepocess the raw dataset](#toc4_1__) \n", - " - [Train random forests and gradient boosting regressor models](#toc4_2__) \n", - " - [Initialize the ValidMind datasets](#toc4_3__) \n", - " - [Initialize the ValidMind models](#toc4_4__) \n", - " - [Assign predictions to the datasets](#toc4_5__) \n", - " - [Run data validation tests](#toc4_6__) \n", - " - [Run model validation tests](#toc4_7__) \n", - "- [Next steps](#toc5__) \n", - " - [Work with your documentation](#toc5_1__) \n", - " - [Discover more learning resources](#toc5_2__) \n", - "- [Upgrade ValidMind](#toc6__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Time Series Forecasting with ML`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor\n", - "from sklearn.metrics import mean_squared_error, r2_score\n", - "from sklearn.model_selection import train_test_split\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.regression import fred_timeseries \n", - "\n", - "target_column = fred_timeseries.target_column\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{target_column}'\"\n", - ")\n", - "\n", - "raw_df = fred_timeseries.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Prepocess the raw dataset\n", - "\n", - "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", - "- **Split the dataset**: Divide the original dataset into training and test sets for the primary model with an 80/20 split, without shuffling.\n", - "- **Difference the data**: Calculate the first difference of the train and test datasets to remove trends and seasonality, then drop any resulting NaN values.\n", - "- **Extract features and target variables**: Separate the feature columns (predictors) and the target variable from the differenced train and test datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the raw dataset into training and test sets \n", - "train_df, test_df = train_test_split(raw_df, test_size=0.2, shuffle=False)\n", - "\n", - "# Take the first difference of the training and test sets\n", - "train_diff_df = train_df.diff().dropna()\n", - "test_diff_df = test_df.diff().dropna()\n", - "\n", - "# Extract the features and target variable from the training set\n", - "X_diff_train = train_diff_df.drop(target_column, axis=1)\n", - "y_diff_train = train_diff_df[target_column]\n", - "\n", - "# Extract the features and target variable from the test set\n", - "X_diff_test = test_diff_df.drop(target_column, axis=1)\n", - "y_diff_test = test_diff_df[target_column]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Train random forests and gradient boosting regressor models\n", - "\n", - "This section trains random forest and gradient boosting models on differenced data, transforms predictions back to the original scale, and evaluates model performance using Mean Squared Error (MSE) and R-squared (R²) scores. 
\n", - "\n", - "The following helper functions are used to post-process predictions and evaluate model performance:\n", - "\n", - "- `transform_to_levels`: Reconstructs the original values from differenced predictions by cumulatively summing them, starting from a given initial value.\n", - "- `evaluate_model`: Calculates the Mean Squared Error (MSE) and R-squared (R²) score to evaluate the accuracy of the predictions against the true values." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def transform_to_levels(y_diff_pred, first_value=0): \n", - " y_pred = [first_value]\n", - " for pred in y_diff_pred:\n", - " y_pred.append(y_pred[-1] + pred)\n", - " return y_pred\n", - "\n", - "def evaluate_model(y_true, y_pred):\n", - " mse = mean_squared_error(y_true, y_pred)\n", - " r2 = r2_score(y_true, y_pred)\n", - " return mse, r2" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Fit the random forest model\n", - "model_rf = RandomForestRegressor(n_estimators=1500, random_state=0)\n", - "model_rf.fit(X_diff_train, y_diff_train)\n", - "\n", - "# Make predictions on the training and test sets\n", - "y_diff_train_pred = model_rf.predict(X_diff_train)\n", - "y_diff_test_pred = model_rf.predict(X_diff_test)\n", - "\n", - "# Transform the predictions back to the original scale\n", - "y_train_rf_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", - "y_test_rf_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", - "\n", - "# Evaluate the model's performance on the training and test sets\n", - "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_rf_pred)\n", - "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_rf_pred)\n", - "\n", - "print(f\"Train Mean Squared Error: {mse_train}\")\n", - "print(f\"Train R-Squared: {r2_train}\")\n", 
- "print(f\"Test Mean Squared Error: {mse_test}\")\n", - "print(f\"Test R-Squared: {r2_test}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Fit the gradient boost model\n", - "model_gb = GradientBoostingRegressor(n_estimators=1500, random_state=0)\n", - "model_gb.fit(X_diff_train, y_diff_train)\n", - "\n", - "# Make predictions on the training and test sets\n", - "y_diff_train_pred = model_gb.predict(X_diff_train)\n", - "y_diff_test_pred = model_gb.predict(X_diff_test)\n", - "\n", - "# Transform the predictions back to the original scale\n", - "y_train_gb_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", - "y_test_gb_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", - "\n", - "# Evaluate the model's performance on the training and test sets\n", - "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_gb_pred)\n", - "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_gb_pred)\n", - "\n", - "print(f\"Train Mean Squared Error: {mse_train}\")\n", - "print(f\"Train R-Squared: {r2_train}\")\n", - "print(f\"Test Mean Squared Error: {mse_test}\")\n", - "print(f\"Test R-Squared: {r2_test}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests 
require access to true values. This is the name of the target column in the dataset\n", - "\n", - "With all dataframes ready, you can now initialize the ValidMind datasets objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):\n", - "\n", - "- `vm_raw_ds`: contains the raw, unprocessed data with the specified target column.\n", - "- `vm_train_diff_ds`: contains the training data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", - "- `vm_test_diff_ds`: contains the test data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", - "- `vm_train_ds`: contains the training data, excluding the first row to align with the differenced data.\n", - "- `vm_test_ds`: includes the test data split from the raw dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_ds = vm.init_dataset(\n", - " input_id=\"raw_ds\",\n", - " dataset=raw_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_train_diff_ds = vm.init_dataset(\n", - " input_id=\"train_diff_ds\",\n", - " dataset=train_diff_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_test_diff_ds = vm.init_dataset(\n", - " input_id=\"test_diff_ds\",\n", - " dataset=test_diff_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_ds\",\n", - " dataset=train_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_ds\",\n", - " dataset=test_df,\n", - " target_column=target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### Initialize the ValidMind models\n", - "\n", - "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other 
functions for analysis and tests on the data for our models.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model_rf = vm.init_model(\n", - " model_rf,\n", - " input_id=\"random_forests_model\",\n", - ")\n", - "\n", - "vm_model_gb = vm.init_model(\n", - " model_gb,\n", - " input_id=\"gradient_boosting_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_5__'></a>\n", - "\n", - "### Assign predictions to the datasets\n", - "\n", - "We can now use the assign_predictions() method from the Dataset object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model_rf,\n", - " prediction_values=y_train_rf_pred,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model_rf,\n", - " prediction_values=y_test_rf_pred,\n", - ")\n", - "\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_model_gb,\n", - " prediction_values=y_train_gb_pred,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model_gb,\n", - " prediction_values=y_test_gb_pred,\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = fred_timeseries.get_demo_test_config()\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_6__'></a>\n", - "\n", - "### Run data validation tests" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesDescription\",\n", - " input_grid={\n", - " \"dataset\": [\"raw_ds\", \"train_diff_ds\", \"test_diff_ds\", \"train_ds\", \"test_ds\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesLinePlot\",\n", - " input_grid={\n", - " \"dataset\": [\"raw_ds\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesMissingValues\",\n", - " input_grid={\n", - " 
\"dataset\": [\"raw_ds\", \"train_diff_ds\", \"test_diff_ds\", \"train_ds\", \"test_ds\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.SeasonalDecompose\",\n", - " input_grid={\n", - " \"dataset\": [\"raw_ds\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesDescriptiveStatistics\",\n", - " input_grid={\n", - " \"dataset\": [\"train_diff_ds\", \"test_diff_ds\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesOutliers\",\n", - " input_grid={\n", - " \"dataset\": [\"train_diff_ds\", \"test_diff_ds\"],\n", - " },\n", - " params={\n", - " \"zscore_threshold\": 4\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesHistogram\",\n", - " input_grid={\n", - " \"dataset\": [ \"train_diff_ds\", \"test_diff_ds\"],\n", - " },\n", - " params={\n", - " \"nbins\": 100\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.DatasetSplit\",\n", - " inputs={\n", - " \"datasets\": [\"train_diff_ds\", \"test_diff_ds\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_7__'></a>\n", - "\n", - "### Run model validation tests" - ] - }, - { - "cell_type": "code", - 
"execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.ModelMetadata\",\n", - " input_grid={\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.RegressionErrors\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.RegressionR2Square\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.TimeSeriesR2SquareBySegments:train_data\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.TimeSeriesR2SquareBySegments:test_data\",\n", - " input_grid={\n", - " \"dataset\": [\"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " },\n", - " params={\n", - " \"segments\":{\n", - " \"start_date\": [\"2012-11-01\",\"2018-02-01\"],\n", - " \"end_date\": [\"2018-01-01\",\"2023-03-01\"]\n", - " }\n", - 
" }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.TimeSeriesPredictionsPlot\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.TimeSeriesPredictionWithCI\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.ModelPredictionResiduals\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.FeatureImportance\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - 
"test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc5_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc5_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-d549f9055f374ee392fb42facfd75cb9", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document a time series forecasting model\n", + "\n", + "Use the [FRED](https://fred.stlouisfed.org/) sample dataset to train a simple time series model and document that model with the ValidMind Library.\n", + "\n", + "As part of the notebook, you will learn how to train a simple model while exploring how the documentation process works:\n", + "\n", + "- Initializing the ValidMind Library\n", + "- Loading a sample dataset provided by the library to train a simple time series model\n", + "- Running a ValidMind test suite to quickly generate documentation about the data and model" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the documentation 
template](#toc2_4__) \n", + "- [Load the sample dataset](#toc3__) \n", + "- [Document the model](#toc4__) \n", + " - [Preprocess the raw dataset](#toc4_1__) \n", + " - [Train random forests and gradient boosting regressor models](#toc4_2__) \n", + " - [Initialize the ValidMind datasets](#toc4_3__) \n", + " - [Initialize the ValidMind models](#toc4_4__) \n", + " - [Assign predictions to the datasets](#toc4_5__) \n", + " - [Run data validation tests](#toc4_6__) \n", + " - [Run model validation tests](#toc4_7__) \n", + "- [Next steps](#toc5__) \n", + " - [Work with your documentation](#toc5_1__) \n", + " - [Discover more learning resources](#toc5_2__) \n", + "- [Upgrade ValidMind](#toc6__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. 
\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Time Series Forecasting with ML`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor\n", + "from sklearn.metrics import mean_squared_error, r2_score\n", + "from sklearn.model_selection import train_test_split\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind library. To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.regression import fred_timeseries \n", + "\n", + "target_column = fred_timeseries.target_column\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{target_column}'\"\n", + ")\n", + "\n", + "raw_df = fred_timeseries.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preprocess the raw dataset\n", + "\n", + "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", + "- **Split the dataset**: Divide the original dataset into training and test sets for the primary model with an 80/20 split, without shuffling.\n", + "- **Difference the data**: Calculate the first difference of the train and test datasets to remove trends and reduce non-stationarity, then drop any resulting NaN values.\n", + "- **Extract features and target variables**: Separate the feature columns (predictors) and the target variable from the differenced train and test datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the raw dataset into training and test sets \n", + "train_df, test_df = train_test_split(raw_df, test_size=0.2, shuffle=False)\n", + "\n", + "# Take the first difference of the training and test sets\n", + "train_diff_df = train_df.diff().dropna()\n", + "test_diff_df = test_df.diff().dropna()\n", + "\n", + "# Extract the features and target variable from the training set\n", + "X_diff_train = train_diff_df.drop(target_column, axis=1)\n", + "y_diff_train = train_diff_df[target_column]\n", + "\n", + "# Extract the features and target variable from the test set\n", + "X_diff_test = test_diff_df.drop(target_column, axis=1)\n", + "y_diff_test = test_diff_df[target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Train random forests and gradient boosting regressor models\n", + "\n", + "This section trains random forest and gradient boosting models on differenced data, transforms predictions back to the original scale, and evaluates model performance using Mean Squared Error (MSE) and R-squared (R²) scores. 
\n", + "\n", + "The following helper functions are used to post-process predictions and evaluate model performance:\n", + "\n", + "- `transform_to_levels`: Reconstructs the original values from differenced predictions by cumulatively summing them, starting from a given initial value.\n", + "- `evaluate_model`: Calculates the Mean Squared Error (MSE) and R-squared (R²) score to evaluate the accuracy of the predictions against the true values." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def transform_to_levels(y_diff_pred, first_value=0): \n", + " y_pred = [first_value]\n", + " for pred in y_diff_pred:\n", + " y_pred.append(y_pred[-1] + pred)\n", + " return y_pred\n", + "\n", + "def evaluate_model(y_true, y_pred):\n", + " mse = mean_squared_error(y_true, y_pred)\n", + " r2 = r2_score(y_true, y_pred)\n", + " return mse, r2" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Fit the random forest model\n", + "model_rf = RandomForestRegressor(n_estimators=1500, random_state=0)\n", + "model_rf.fit(X_diff_train, y_diff_train)\n", + "\n", + "# Make predictions on the training and test sets\n", + "y_diff_train_pred = model_rf.predict(X_diff_train)\n", + "y_diff_test_pred = model_rf.predict(X_diff_test)\n", + "\n", + "# Transform the predictions back to the original scale\n", + "y_train_rf_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", + "y_test_rf_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", + "\n", + "# Evaluate the model's performance on the training and test sets\n", + "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_rf_pred)\n", + "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_rf_pred)\n", + "\n", + "print(f\"Train Mean Squared Error: {mse_train}\")\n", + "print(f\"Train R-Squared: {r2_train}\")\n", + "print(f\"Test Mean Squared Error: 
{mse_test}\")\n", + "print(f\"Test R-Squared: {r2_test}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Fit the gradient boost model\n", + "model_gb = GradientBoostingRegressor(n_estimators=1500, random_state=0)\n", + "model_gb.fit(X_diff_train, y_diff_train)\n", + "\n", + "# Make predictions on the training and test sets\n", + "y_diff_train_pred = model_gb.predict(X_diff_train)\n", + "y_diff_test_pred = model_gb.predict(X_diff_test)\n", + "\n", + "# Transform the predictions back to the original scale\n", + "y_train_gb_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", + "y_test_gb_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", + "\n", + "# Evaluate the model's performance on the training and test sets\n", + "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_gb_pred)\n", + "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_gb_pred)\n", + "\n", + "print(f\"Train Mean Squared Error: {mse_train}\")\n", + "print(f\"Train R-Squared: {r2_train}\")\n", + "print(f\"Test Mean Squared Error: {mse_test}\")\n", + "print(f\"Test R-Squared: {r2_test}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests 
require access to true values. This is the name of the target column in the dataset\n", + "\n", + "With all dataframes ready, you can now initialize the ValidMind datasets objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):\n", + "\n", + "- `vm_raw_ds`: contains the raw, unprocessed data with the specified target column.\n", + "- `vm_train_diff_ds`: contains the training data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", + "- `vm_test_diff_ds`: contains the test data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", + "- `vm_train_ds`: contains the training data, excluding the first row to align with the differenced data.\n", + "- `vm_test_ds`: includes the test data split from the raw dataset." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_ds = vm.init_dataset(\n", + " input_id=\"raw_ds\",\n", + " dataset=raw_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_train_diff_ds = vm.init_dataset(\n", + " input_id=\"train_diff_ds\",\n", + " dataset=train_diff_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_test_diff_ds = vm.init_dataset(\n", + " input_id=\"test_diff_ds\",\n", + " dataset=test_diff_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_ds\",\n", + " dataset=train_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_ds\",\n", + " dataset=test_df,\n", + " target_column=target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### Initialize the ValidMind models\n", + "\n", + "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other 
functions for analysis and tests on the data for our models.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model_rf = vm.init_model(\n", + " model_rf,\n", + " input_id=\"random_forests_model\",\n", + ")\n", + "\n", + "vm_model_gb = vm.init_model(\n", + " model_gb,\n", + " input_id=\"gradient_boosting_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_5__'></a>\n", + "\n", + "### Assign predictions to the datasets\n", + "\n", + "We can now use the assign_predictions() method from the Dataset object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model_rf,\n", + " prediction_values=y_train_rf_pred,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model_rf,\n", + " prediction_values=y_test_rf_pred,\n", + ")\n", + "\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_model_gb,\n", + " prediction_values=y_train_gb_pred,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model_gb,\n", + " prediction_values=y_test_gb_pred,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = fred_timeseries.get_demo_test_config()\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6__'></a>\n", + "\n", + "### Run data validation tests" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesDescription\",\n", + " input_grid={\n", + " \"dataset\": [\"raw_ds\", \"train_diff_ds\", \"test_diff_ds\", \"train_ds\", \"test_ds\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesLinePlot\",\n", + " input_grid={\n", + " \"dataset\": [\"raw_ds\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesMissingValues\",\n", + " input_grid={\n", + " \"dataset\": [\"raw_ds\", \"train_diff_ds\", 
\"test_diff_ds\", \"train_ds\", \"test_ds\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.SeasonalDecompose\",\n", + " input_grid={\n", + " \"dataset\": [\"raw_ds\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesDescriptiveStatistics\",\n", + " input_grid={\n", + " \"dataset\": [\"train_diff_ds\", \"test_diff_ds\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesOutliers\",\n", + " input_grid={\n", + " \"dataset\": [\"train_diff_ds\", \"test_diff_ds\"],\n", + " },\n", + " params={\n", + " \"zscore_threshold\": 4\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesHistogram\",\n", + " input_grid={\n", + " \"dataset\": [ \"train_diff_ds\", \"test_diff_ds\"],\n", + " },\n", + " params={\n", + " \"nbins\": 100\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.DatasetSplit\",\n", + " inputs={\n", + " \"datasets\": [\"train_diff_ds\", \"test_diff_ds\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_7__'></a>\n", + "\n", + "### Run model validation tests" + ] + }, + { + "cell_type": "code", + 
"metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.ModelMetadata\",\n", + " input_grid={\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.RegressionErrors\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.RegressionR2Square\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.TimeSeriesR2SquareBySegments:train_data\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.TimeSeriesR2SquareBySegments:test_data\",\n", + " input_grid={\n", + " \"dataset\": [\"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " },\n", + " params={\n", + " \"segments\":{\n", + " \"start_date\": [\"2012-11-01\",\"2018-02-01\"],\n", + " \"end_date\": [\"2018-01-01\",\"2023-03-01\"]\n", + " }\n", + " }\n", + ")\n", + "test.log()" + ], + 
"execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.TimeSeriesPredictionsPlot\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.TimeSeriesPredictionWithCI\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.ModelPredictionResiduals\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.FeatureImportance\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + 
"outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc5_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc5_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-d549f9055f374ee392fb42facfd75cb9" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } From ed3a31a6e8dcf7f1e9bb013d79c2738699da9192 Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Thu, 21 May 2026 11:15:38 -0700 Subject: [PATCH 02/13] Replace validator-framing Key concepts across 4 notebooks Replay of 4ae2f0aa on fresh main, accounting for PR #509 file renames. Updates the validator-framing Key concepts block to the new record/model/validation report terminology across 4 validator notebooks (_about-validmind-validators, quickstart_validation, 1-set_up_validmind_for_validation, validate_application_scorecard). Adds the new `artifacts (findings)` term. TOC anchor preserved on validate_application_scorecard.ipynb. 
Co-authored-by: Cursor <cursoragent@cursor.com> --- .../quickstart/quickstart_validation.ipynb | 2482 +++++------ .../_about-validmind-validators.ipynb | 160 +- .../1-set_up_validmind_for_validation.ipynb | 1050 ++--- .../validate_application_scorecard.ipynb | 3772 +++++++++-------- 4 files changed, 3752 insertions(+), 3712 deletions(-) diff --git a/notebooks/quickstart/quickstart_validation.ipynb b/notebooks/quickstart/quickstart_validation.ipynb index 129080917..eaf3ce4d9 100644 --- a/notebooks/quickstart/quickstart_validation.ipynb +++ b/notebooks/quickstart/quickstart_validation.ipynb @@ -1,1238 +1,1248 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "1a88a895", - "metadata": {}, - "source": [ - "# Quickstart for validation\n", - "\n", - "Learn the basics of using ValidMind to validate records as part of a validation workflow. Set up the ValidMind Library in your environment, and generate a draft of a validation report using ValidMind tests for a binary classification model.\n", - "\n", - "To validate our model with the ValidMind Library, we'll:\n", - "\n", - "1. Import a sample dataset and preprocess it, then split the datasets and initialize them for use with ValidMind\n", - "2. Independently verify data quality tests performed on datasets by model development\n", - "3. Import a champion model for evaluation\n", - "4. 
Run model evaluation tests with the ValidMind Library, which will send the results of those tests to the ValidMind Platform" - ] - }, - { - "cell_type": "markdown", - "id": "0493b0cb", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [Introduction](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Register a sample model](#toc3_1__) \n", - " - [Assign validator credentials](#toc3_1_1__) \n", - " - [Apply validation report template](#toc3_1_2__) \n", - " - [Install the ValidMind Library](#toc3_2__) \n", - " - [Initialize the ValidMind Library](#toc3_3__) \n", - " - [Get your code snippet](#toc3_3_1__) \n", - " - [Initialize the Python environment](#toc3_4__) \n", - "- [Getting to know ValidMind](#toc4__) \n", - " - [Preview the validation report template](#toc4_1__) \n", - " - [View validation report in the ValidMind Platform](#toc4_2__) \n", - "- [Working with ValidMind datasets](#toc5__) \n", - " - [Prepare the sample dataset](#toc5_1__) \n", - " - [Load the sample dataset](#toc5_1_1__) \n", - " - [Preprocess the raw dataset](#toc5_1_2__) \n", - " - [Split the dataset](#toc5_1_3__) \n", - " - [Separate features and targets](#toc5_1_4__) \n", - " - [Initialize the ValidMind datasets](#toc5_2__) \n", - "- [Running data quality tests](#toc6__) \n", - " - [Identify qualitative tests](#toc6_1__) \n", - " - [Run an individual data quality test](#toc6_2__) \n", - " - [Run data comparison tests](#toc6_3__) \n", - "- [Working with ValidMind models](#toc7__) \n", - " - [Import the champion model](#toc7_1__) \n", - " - [Initialize the ValidMind model](#toc7_2__) \n", - " - [Assign predictions](#toc7_3__) \n", - "- [Running model evaluation tests](#toc8__) \n", - " - [Run model performance tests](#toc8_1__) \n", - " - [Run diagnostic tests](#toc8_2__) \n", - " - 
[Run feature importance tests](#toc8_3__) \n", - "- [In summary](#toc9__) \n", - "- [Next steps](#toc10__) \n", - " - [Work with your validation report](#toc10_1__) \n", - " - [Discover more learning resources](#toc10_2__) \n", - "- [Upgrade ValidMind](#toc11__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "717d2a16", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## Introduction\n", - "\n", - "Validation aims to independently assess the compliance of *champions* created by developers with regulatory guidance by conducting thorough testing and analysis, potentially including the use of challengers to benchmark performance. Assessments, presented in the form of a validation report, typically include *artifacts (findings)* and recommendations to address those issues.\n", - "\n", - "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", - "\n", - "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", - "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." 
- ] - }, - { - "cell_type": "markdown", - "id": "369d00db", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." - ] - }, - { - "cell_type": "markdown", - "id": "72800fc2", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "e2beb1bb", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about validating records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "78c8388c", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Validation report**: A comprehensive and structured assessment of a model’s development and performance, focusing on verifying its integrity, appropriateness, and alignment with its intended use. It includes analyses of model assumptions, data quality, performance metrics, outcomes of testing procedures, and risk considerations. The validation report supports transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", - "\n", - "**Validation report template**: Serves as a standardized framework for conducting and documenting model validation activities. It outlines the required sections, recommended analyses, and expected validation tests, ensuring consistency and completeness across validation reports. The template helps guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. 
Tests are the building blocks of ValidMind, used to evaluate and document models and datasets.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
- ] - }, - { - "cell_type": "markdown", - "id": "ec7b4755", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "97d44f44", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Register a sample model\n", - "\n", - "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", - "\n", - "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "fc3e48e1", - "metadata": {}, - "source": [ - "<a id='toc3_1_1__'></a>\n", - "\n", - "#### Assign validator credentials\n", - "\n", - "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", - "\n", - "1. 
Remove yourself as an owner:\n", - "\n", - " - Click on the **OWNERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "2. Remove yourself as a developer:\n", - "\n", - " - Click on the **DEVELOPERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "3. Add yourself as a validator:\n", - "\n", - " - Click on the **VALIDATORS** tile.\n", - " - Select your name from the drop-down menu.\n", - " - Click **Save** to apply your changes to that role." - ] - }, - { - "cell_type": "markdown", - "id": "428260e0", - "metadata": {}, - "source": [ - "<a id='toc3_1_2__'></a>\n", - "\n", - "#### Apply validation report template\n", - "\n", - "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", - "\n", - " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "7b16c381", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "64eb485c", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "bf77550e", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "ae918c6c", - "metadata": {}, - "source": [ - "<a id='toc3_3_1__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9c6ce354", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"validation-report\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "f9bc73e9", - "metadata": {}, - "source": [ - "<a id='toc3_4__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Then, let's import the necessary libraries and set up your Python environment for data analysis by enabling **`matplotlib`**, a plotting library used for visualizing data.\n", - "\n", - "This ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1e53065d", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "id": "e0e942dd", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "0361d8bf", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Preview the validation report template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for validation. 
A template predefines sections for your validation report and provides a general outline to follow, making the validation process much easier.\n", - "\n", - "You will attach evidence to this template in the form of risk assessment notes, artifacts, and test results later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "be445598", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "4124c3d7", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### View validation report in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", - "\n", - "3. 
Click **Validation** under Documents for your model and note:\n", - "\n", - " - [x] The risk assessment compliance summary at the top of the report (screenshot below)\n", - " - [x] How the structure of the validation report reflects the previewed template\n", - "\n", - " <img src= \"../tutorials/validation/compliance-summary.png\" alt=\"Screenshot showing the risk assessment compliance summary\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", - " <br><br>" - ] - }, - { - "cell_type": "markdown", - "id": "767ea445", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Working with ValidMind datasets" - ] - }, - { - "cell_type": "markdown", - "id": "ae3f832d", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Prepare the sample dataset" - ] - }, - { - "cell_type": "markdown", - "id": "f91775e8", - "metadata": {}, - "source": [ - "<a id='toc5_1_1__'></a>\n", - "\n", - "#### Load the sample dataset\n", - "\n", - "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle, which was used to develop the dummy champion.\n", - "\n", - "We'll use this dataset to review steps that should have been conducted during the initial development and documentation of the champion to ensure that the model was built correctly. By independently performing steps taken by the development team, we can confirm whether the model was built using appropriate and properly processed data.\n", - "\n", - "In our below example, note that:\n", - "\n", - "- The target column, `Exited` has a value of `1` when a customer has churned and `0` otherwise.\n", - "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. 
A Pandas Dataframe is a two-dimensional tabular data structure that makes use of rows and columns." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "73076ee3", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.classification import customer_churn\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", - ")\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "6ab7fd19", - "metadata": {}, - "source": [ - "<a id='toc5_1_2__'></a>\n", - "\n", - "#### Preprocess the raw dataset\n", - "\n", - "Let's say that thanks to the documentation submitted by the development team (**Learn more:** [Quickstart for documentation](quickstart_documentation.ipynb)), we know that the sample dataset was first preprocessed before being used to train the champion.\n", - "\n", - "During validation, we use the same data processing logic and training procedure to confirm that the champion's results can be reproduced independently, so let's also start by preprocessing our imported dataset to verify that preprocessing was done correctly. This involves splitting the data and separating the features (inputs) from the targets (outputs)." - ] - }, - { - "cell_type": "markdown", - "id": "af660bf4", - "metadata": {}, - "source": [ - "<a id='toc5_1_3__'></a>\n", - "\n", - "#### Split the dataset\n", - "\n", - "Splitting our dataset helps assess how well the model generalizes to unseen data.\n", - "\n", - "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", - "\n", - "1. **train_df** — Used to train the model.\n", - "2. **validation_df** — Used to evaluate the model's performance during training.\n", - "3. 
**test_df** — Used later on to asses the model's performance on new, unseen data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ee8cfaaf", - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "id": "125a39e6", - "metadata": {}, - "source": [ - "<a id='toc5_1_4__'></a>\n", - "\n", - "#### Separate features and targets\n", - "\n", - "To train the model, we need to provide it with:\n", - "\n", - "1. **Inputs** — Features such as customer age, usage, etc.\n", - "2. **Outputs (Expected answers/labels)** — in our case, we would like to know whether the customer churned or not.\n", - "\n", - "Here, we'll use `x_train` to hold the input features, and `y_train` to hold the target variable — the values we want the model to predict:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6fe65be5", - "metadata": {}, - "outputs": [], - "source": [ - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]" - ] - }, - { - "cell_type": "markdown", - "id": "b6674505", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests with your preprocessed datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module. 
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "- **`class_labels`** — An optional value to map predicted classes to class labels." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ba677dd7", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the raw dataset\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " class_labels=customer_churn.class_labels,\n", - ")\n", - "\n", - "# Initialize the training dataset\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "# Initialize the validation dataset\n", - "vm_validation_ds = vm.init_dataset(\n", - " dataset=validation_df,\n", - " input_id=\"validation_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "# Initialize the testing dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=customer_churn.target_column\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "c53c6d35", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Running data quality tests\n", - "\n", - "With everything ready to go, let's explore some of ValidMind's available tests to help us assess the 
quality of our datasets. Using ValidMind’s repository of tests streamlines your validation testing, and helps you ensure that your records are being validated appropriately." - ] - }, - { - "cell_type": "markdown", - "id": "b6acd486", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Identify qualitative tests\n", - "\n", - "We want to narrow down the tests we want to run from the selection provided by ValidMind, so we'll use the [`vm.tests.list_tasks_and_tags()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) to list which `tags` are associated with each `task` type:\n", - "\n", - "- **`tasks`** represent the kind of modeling task associated with a test. Here we'll focus on `classification` tasks.\n", - "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `data_quality` tag." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "85bc2f85", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tasks_and_tags()" - ] - }, - { - "cell_type": "markdown", - "id": "9881e58a", - "metadata": {}, - "source": [ - "Then we'll call [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to list all the data quality tests for classification:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "31b31a51", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(\n", - " tags=[\"data_quality\"], task=\"classification\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "d3e27375", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Run an individual data quality test\n", - "\n", - "Next, we'll use our previously initialized raw dataset (`vm_raw_dataset`) as input to run an individual test, then log the result to the ValidMind Platform.\n", - "\n", - "- You run validation 
tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module.\n", - "- Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", - "\n", - "Here, we'll use the [`ClassImbalance` test](https://docs.validmind.ai/tests/data_validation/ClassImbalance.html) as an example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dcb9b017", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.ClassImbalance\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "f6b7567b", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the output returned indicating that a test-driven block doesn't currently exist in your documentation for some test IDs. </b></span>\n", - "<br></br>\n", - "That's expected, as when we run validation tests, the results logged need to be manually added to your report as part of your compliance assessment process within the ValidMind Platform. You'll continue to see this message throughout this notebook as we run and log more tests.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "97286c0e", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Run data comparison tests\n", - "<span id=\"data-comparison\">\n", - "\n", - "We can also use ValidMind to perform comparison tests between our datasets, again logging the results to the ValidMind Platform. 
Below, we'll perform two sets of comparison tests with a mix of our datasets and the same class imbalance test:\n", - "\n", - "- When running individual tests, you can use a custom **`result_id`** to tag the individual result with a unique identifier, appended to the `test_id` with a `:` separator.\n", - "- We can specify all the tests we'd like to run in a dictionary called `test_config`, and we'll pass in an **`input_grid`** of individual test inputs to compare. In this case, we'll input our two datasets for comparison. Note here that the `input_grid` expects the `input_id` of the dataset as the value rather than the variable name we specified." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d53edde7", - "metadata": {}, - "outputs": [], - "source": [ - "# Individual test config with inputs specified\n", - "test_config = {\n", - " # Comparison between training and validation datasets to check if class balance is the same in both sets\n", - " \"validmind.data_validation.ClassImbalance:train_vs_validation\": {\n", - " \"input_grid\": {\"dataset\": [\"train_dataset\", \"validation_dataset\"]}\n", - " },\n", - " # Comparison between training and testing datasets to confirm that both sets have similar class distributions\n", - " \"validmind.data_validation.ClassImbalance:train_vs_test\": {\n", - " \"input_grid\": {\"dataset\": [\"train_dataset\", \"test_dataset\"]},\n", - " },\n", - "}" - ] - }, - { - "cell_type": "markdown", - "id": "1f1b796b", - "metadata": {}, - "source": [ - "Then batch run and log our tests in `test_config`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1b97e404", - "metadata": {}, - "outputs": [], - "source": [ - "for t in test_config:\n", - " print(t)\n", - " try:\n", - " # Check if test has input_grid\n", - " if 'input_grid' in test_config[t]:\n", - " # For tests with input_grid, pass the input_grid configuration\n", - " if 'params' in test_config[t]:\n", - " vm.tests.run_test(t, 
input_grid=test_config[t]['input_grid'], params=test_config[t]['params']).log()\n", - " else:\n", - " vm.tests.run_test(t, input_grid=test_config[t]['input_grid']).log()\n", - " else:\n", - " # Original logic for regular inputs\n", - " if 'params' in test_config[t]:\n", - " vm.tests.run_test(t, inputs=test_config[t]['inputs'], params=test_config[t]['params']).log()\n", - " else:\n", - " vm.tests.run_test(t, inputs=test_config[t]['inputs']).log()\n", - " except Exception as e:\n", - " print(f\"Error running test {t}: {str(e)}\")" - ] - }, - { - "cell_type": "markdown", - "id": "1ca8c343", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Working with ValidMind models" - ] - }, - { - "cell_type": "markdown", - "id": "1fd05953", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Import the champion model\n", - "\n", - "With our raw dataset preprocessed, let's go ahead and import the champion submitted by the development team in the format of a `.pkl` file: **[xgboost_model_champion.pkl](xgboost_model_champion.pkl)**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7f18188e", - "metadata": {}, - "outputs": [], - "source": [ - "# Import the champion model\n", - "import joblib\n", - "\n", - "xgboost = joblib.load(\"xgboost_model_champion.pkl\")" - ] - }, - { - "cell_type": "markdown", - "id": "ee26b0b6", - "metadata": {}, - "source": [ - "<a id='toc7_2__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "In addition to the initialized datasets, you'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our champion.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and 
more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0a799cf2", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the champion XGBoost model\n", - "vm_xgboost = vm.init_model(\n", - " xgboost,\n", - " input_id=\"xgboost_champion\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "823e49c5", - "metadata": {}, - "source": [ - "<a id='toc7_3__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", - "\n", - "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", - "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", - "\n", - "If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "71dd8e7b", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_xgboost,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_xgboost,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "2e29df90", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Running model evaluation tests\n", - "\n", - "With our setup complete, let's run the rest of our validation tests. 
Since we have already verified the data quality of the dataset used to train our champion, we will now focus on evaluating the model's performance." - ] - }, - { - "cell_type": "markdown", - "id": "fc6af0e0", - "metadata": {}, - "source": [ - "<a id='toc8_1__'></a>\n", - "\n", - "### Run model performance tests\n", - "\n", - "First, let's run some performance tests. Use [`vm.tests.list_tests()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to identify all the model performance tests for classification:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "202792e8", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(tags=[\"model_performance\"], task=\"classification\")" - ] - }, - { - "cell_type": "markdown", - "id": "011b7c09", - "metadata": {}, - "source": [ - "We'll isolate the specific tests we want to run in `mpt`, and append an identifier for our champion model here to the `result_id` with a `:` separator like we did above in another test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9fc18843", - "metadata": {}, - "outputs": [], - "source": [ - "mpt = [\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion\",\n", - " \"validmind.model_validation.sklearn.ROCCurve:xgboost_champion\"\n", - "]" - ] - }, - { - "cell_type": "markdown", - "id": "52096118", - "metadata": {}, - "source": [ - "Now, let's run and log our batch of model performance tests using our testing dataset (`vm_test_ds`) for our champion model:\n", - "\n", - "- The test set serves as a proxy for real-world data, providing an unbiased estimate of model performance since it was not used during training or tuning.\n", - "- The test set also acts as protection against selection bias and model tweaking, giving a final, more unbiased checkpoint." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6866b21c", - "metadata": {}, - "outputs": [], - "source": [ - "for test in mpt:\n", - " vm.tests.run_test(\n", - " test,\n", - " inputs={\n", - " \"dataset\": vm_test_ds, \"model\" : vm_xgboost,\n", - " },\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "id": "842707f9", - "metadata": {}, - "source": [ - "<a id='toc8_2__'></a>\n", - "\n", - "### Run diagnostic tests\n", - "\n", - "Next, we want to inspect the robustness and stability of our champion. Use `list_tests()` to list all available diagnosis tests applicable to classification tasks:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c9b3caa4", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(tags=[\"model_diagnosis\"], task=\"classification\")" - ] - }, - { - "cell_type": "markdown", - "id": "5295d37b", - "metadata": {}, - "source": [ - "Let’s now assess the model for potential signs of *overfitting* and identify any sub-segments where performance may be inconsistent.\n", - "\n", - "Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but also noise and random fluctuations, resulting in excellent performance on the training dataset but poor generalization to new, unseen data:\n", - "\n", - "- Since the training dataset (`vm_train_ds`) was used to fit the model, we use this set to establish a baseline performance for how well the model performs on data it has already seen.\n", - "- The testing dataset (`vm_test_ds`) was never seen during training, and here simulates real-world generalization, or how well the model performs on new, unseen data. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "82f824f2", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.model_validation.sklearn.OverfitDiagnosis:xgboost_champion\",\n", - " input_grid={\n", - " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", - " \"model\" : [vm_xgboost]\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "88db22ed", - "metadata": {}, - "source": [ - "Let's also conduct *robustness* and *stability* tests.\n", - "\n", - "- Robustness evaluates the model’s ability to maintain consistent performance under varying input conditions.\n", - "- Stability assesses whether the model produces consistent outputs across different data subsets or over time.\n", - "\n", - "Again, we'll use both the training and testing datasets to establish baseline performance and to simulate real-world generalization:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b2676197", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.model_validation.sklearn.RobustnessDiagnosis:xgboost_champion\",\n", - " input_grid={\n", - " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", - " \"model\" : [vm_xgboost]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "9226c6ea", - "metadata": {}, - "source": [ - "<a id='toc8_3__'></a>\n", - "\n", - "### Run feature importance tests\n", - "\n", - "We also want to verify the relative influence of different input features on our model's predictions. 
Use `list_tests()` to identify all the feature importance tests for classification and store them in `FI`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9c8c26e6", - "metadata": {}, - "outputs": [], - "source": [ - "# Store the feature importance tests\n", - "FI = vm.tests.list_tests(tags=[\"feature_importance\"], task=\"classification\",pretty=False)\n", - "FI" - ] - }, - { - "cell_type": "markdown", - "id": "d36a3544", - "metadata": {}, - "source": [ - "We'll only use our testing dataset (`vm_test_ds`) here, to provide a realistic, unseen sample that mimics future or production data, as the training dataset has already influenced our model during learning:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5a49f550", - "metadata": {}, - "outputs": [], - "source": [ - "# Run and log our feature importance tests with the testing dataset\n", - "for test in FI:\n", - " vm.tests.run_test(\n", - " \"\".join((test,':xgboost_champion')),\n", - " inputs={\n", - " \"dataset\": vm_test_ds, \"model\": vm_xgboost\n", - " },\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "id": "293bf4ca", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## In summary\n", - "\n", - "In this notebook, you learned how to:\n", - "\n", - "- [x] Register a record (model) within the ValidMind Platform\n", - "- [x] Install and initialize the ValidMind Library\n", - "- [x] Preview the validation report template for your model\n", - "- [x] Import a sample dataset and champion model\n", - "- [x] Initialize ValidMind datasets and model objects\n", - "- [x] Assign model predictions to your ValidMind model objects\n", - "- [x] Identify and run various validation tests\n", - "\n", - "In a usual validation workflow, you would wrap up your validation testing by verifying that all the tests provided by the development team were run and reported accurately, and perhaps even propose a challenger, comparing the performance of the 
challenger with the running champion.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>With ValidMind, you can easily:</b></span>\n", - "<ul>\n", - " <li>Specify all the tests you'd like to independently rerun, just like you did in the step <a href=\"#run-data-comparison-tests\" style=\"color: #DE257E;\">Run data comparison tests</a></li>\n", - " <li>Evaluate the performance of a challenger against the champion, just like you did in the steps under <a href=\"#running-model-evaluation-tests\" style=\"color: #DE257E;\">Running model evaluation tests</a></li>\n", - "</ul>\n", - "</div>" - ] - }, - { - "cell_type": "markdown", - "id": "b7fe1ed3", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your validation report." - ] - }, - { - "cell_type": "markdown", - "id": "1e30826e", - "metadata": {}, - "source": [ - "<a id='toc10_1__'></a>\n", - "\n", - "### Work with your validation report\n", - "\n", - "Now that you've logged all your test results and verified the work done by the development team, head to the ValidMind Platform to wrap up your validation report:\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", - "\n", - "2. In the left sidebar that appears for your model, click **Validation** under Documents.\n", - "\n", - "Include your logged test results as evidence, create risk assessment notes, add artifacts, and assess compliance, then submit your report for review when it's ready. 
(**Learn more:** [Preparing validation reports](https://docs.validmind.ai/guide/validation/preparing-validation-reports.html))" - ] - }, - { - "cell_type": "markdown", - "id": "8511e2f8", - "metadata": {}, - "source": [ - "<a id='toc10_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "For a more in-depth introduction to using the ValidMind Library for validation, check out our introductory validation series and the accompanying interactive training:\n", - "\n", - "- **[ValidMind for validation](https://docs.validmind.ai/developer/validmind-library.html#validation)**\n", - "- **[Validator Fundamentals](https://docs.validmind.ai/training/validator-fundamentals/validator-fundamentals-register.html)**\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "id": "58d2d5da", - "metadata": {}, - "source": [ - "<a id='toc11__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "upgrade-show-c0a446ff-f26f-4ad0-839a-e92927711798", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "7e76ca12", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "6d3e2933", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-2427447e4fe348908b3423e86473bfeb", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart for validation\n", + "\n", + "Learn the basics of using ValidMind to validate records as part of a validation workflow. Set up the ValidMind Library in your environment, and generate a draft of a validation report using ValidMind tests for a binary classification model.\n", + "\n", + "To validate our model with the ValidMind Library, we'll:\n", + "\n", + "1. Import a sample dataset and preprocess it, then split the datasets and initialize them for use with ValidMind\n", + "2. Independently verify data quality tests performed on datasets by model development\n", + "3. Import a champion model for evaluation\n", + "4. 
Run model evaluation tests with the ValidMind Library, which will send the results of those tests to the ValidMind Platform" + ], + "id": "1a88a895" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [Introduction](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Register a sample model](#toc3_1__) \n", + " - [Assign validator credentials](#toc3_1_1__) \n", + " - [Apply validation report template](#toc3_1_2__) \n", + " - [Install the ValidMind Library](#toc3_2__) \n", + " - [Initialize the ValidMind Library](#toc3_3__) \n", + " - [Get your code snippet](#toc3_3_1__) \n", + " - [Initialize the Python environment](#toc3_4__) \n", + "- [Getting to know ValidMind](#toc4__) \n", + " - [Preview the validation report template](#toc4_1__) \n", + " - [View validation report in the ValidMind Platform](#toc4_2__) \n", + "- [Working with ValidMind datasets](#toc5__) \n", + " - [Prepare the sample dataset](#toc5_1__) \n", + " - [Load the sample dataset](#toc5_1_1__) \n", + " - [Preprocess the raw dataset](#toc5_1_2__) \n", + " - [Split the dataset](#toc5_1_3__) \n", + " - [Separate features and targets](#toc5_1_4__) \n", + " - [Initialize the ValidMind datasets](#toc5_2__) \n", + "- [Running data quality tests](#toc6__) \n", + " - [Identify qualitative tests](#toc6_1__) \n", + " - [Run an individual data quality test](#toc6_2__) \n", + " - [Run data comparison tests](#toc6_3__) \n", + "- [Working with ValidMind models](#toc7__) \n", + " - [Import the champion model](#toc7_1__) \n", + " - [Initialize the ValidMind model](#toc7_2__) \n", + " - [Assign predictions](#toc7_3__) \n", + "- [Running model evaluation tests](#toc8__) \n", + " - [Run model performance tests](#toc8_1__) \n", + " - [Run diagnostic tests](#toc8_2__) \n", + " - 
[Run feature importance tests](#toc8_3__) \n", + "- [In summary](#toc9__) \n", + "- [Next steps](#toc10__) \n", + " - [Work with your validation report](#toc10_1__) \n", + " - [Discover more learning resources](#toc10_2__) \n", + "- [Upgrade ValidMind](#toc11__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "0493b0cb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## Introduction\n", + "\n", + "Validation aims to independently assess the compliance of *champions* created by developers with regulatory guidance by conducting thorough testing and analysis, potentially including the use of challengers to benchmark performance. Assessments, presented in the form of a validation report, typically include *artifacts (findings)* and recommendations to address those issues.\n", + "\n", + "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", + "\n", + "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", + "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." 
+ ], + "id": "717d2a16" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." + ], + "id": "369d00db" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "72800fc2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about validating records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "e2beb1bb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. 
A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards. It covers risk assessment, identifies areas of potential error or risk within the record's components, and supports transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", + "\n", + "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). 
Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ], + "id": "78c8388c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ], + "id": "ec7b4755" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Register a sample model\n", + "\n", + "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", + "\n", + "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "97d44f44" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_1__'></a>\n", + "\n", + "#### Assign validator credentials\n", + "\n", + "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", + "\n", + "1. Remove yourself as an owner:\n", + "\n", + " - Click on the **OWNERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "2. Remove yourself as a developer:\n", + "\n", + " - Click on the **DEVELOPERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "3. Add yourself as a validator:\n", + "\n", + " - Click on the **VALIDATORS** tile.\n", + " - Select your name from the drop-down menu.\n", + " - Click **Save** to apply your changes to that role." + ], + "id": "fc3e48e1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_2__'></a>\n", + "\n", + "#### Apply validation report template\n", + "\n", + "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", + "\n", + " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "428260e0" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ], + "id": "7b16c381" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "64eb485c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "bf77550e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_1__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the library.\n", + "\n", + "1. 
On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "ae918c6c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"validation-report\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "9c6ce354" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_4__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Then, let's import the necessary libraries and set up your Python environment for data analysis by enabling **`matplotlib`**, a plotting library used for visualizing data.\n", + "\n", + "This ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window:" + ], + "id": "f9bc73e9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [], + "id": "1e53065d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Getting to know ValidMind" + ], + "id": "e0e942dd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preview the validation report template\n", + "\n", + "Let's verify that you have connected the ValidMind 
Library to the ValidMind Platform and that the appropriate *template* is selected for validation. A template predefines sections for your validation report and provides a general outline to follow, making the validation process much easier.\n", + "\n", + "You will attach evidence to this template in the form of risk assessment notes, artifacts, and test results later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library:" + ], + "id": "0361d8bf" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "be445598" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### View validation report in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", + "\n", + "3. 
Click **Validation** under Documents for your model and note:\n", + "\n", + " - [x] The risk assessment compliance summary at the top of the report (screenshot below)\n", + " - [x] How the structure of the validation report reflects the previewed template\n", + "\n", + " <img src= \"../tutorials/validation/compliance-summary.png\" alt=\"Screenshot showing the risk assessment compliance summary\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", + " <br><br>" + ], + "id": "4124c3d7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Working with ValidMind datasets" + ], + "id": "767ea445" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Prepare the sample dataset" + ], + "id": "ae3f832d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_1__'></a>\n", + "\n", + "#### Load the sample dataset\n", + "\n", + "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle, which was used to develop the dummy champion.\n", + "\n", + "We'll use this dataset to review steps that should have been conducted during the initial development and documentation of the champion to ensure that the model was built correctly. By independently performing steps taken by the development team, we can confirm whether the model was built using appropriate and properly processed data.\n", + "\n", + "In the example below, note that:\n", + "\n", + "- The target column, `Exited`, has a value of `1` when a customer has churned and `0` otherwise.\n", + "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. 
A Pandas DataFrame is a two-dimensional tabular data structure that makes use of rows and columns." + ], + "id": "f91775e8" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.classification import customer_churn\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", + ")\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [], + "id": "73076ee3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_2__'></a>\n", + "\n", + "#### Preprocess the raw dataset\n", + "\n", + "Let's say that thanks to the documentation submitted by the development team (**Learn more:** [Quickstart for documentation](quickstart_documentation.ipynb)), we know that the sample dataset was preprocessed before being used to train the champion.\n", + "\n", + "During validation, we use the same data processing logic and training procedure to confirm that the champion's results can be reproduced independently, so let's also start by preprocessing our imported dataset to verify that preprocessing was done correctly. This involves splitting the data and separating the features (inputs) from the targets (outputs)." + ], + "id": "6ab7fd19" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_3__'></a>\n", + "\n", + "#### Split the dataset\n", + "\n", + "Splitting our dataset helps assess how well the model generalizes to unseen data.\n", + "\n", + "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", + "\n", + "1. **train_df** — Used to train the model.\n", + "2. **validation_df** — Used to evaluate the model's performance during training.\n", + "3. 
**test_df** — Used later on to assess the model's performance on new, unseen data." + ], + "id": "af660bf4" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" + ], + "execution_count": null, + "outputs": [], + "id": "ee8cfaaf" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_4__'></a>\n", + "\n", + "#### Separate features and targets\n", + "\n", + "To train the model, we need to provide it with:\n", + "\n", + "1. **Inputs** — Features such as customer age, usage, etc.\n", + "2. **Outputs (Expected answers/labels)** — In our case, whether the customer churned or not.\n", + "\n", + "Here, we'll use `x_train` to hold the input features, and `y_train` to hold the target variable — the values we want the model to predict:" + ], + "id": "125a39e6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]" + ], + "execution_count": null, + "outputs": [], + "id": "6fe65be5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests with your preprocessed datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module. 
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "- **`class_labels`** — An optional value to map predicted classes to class labels." + ], + "id": "b6674505" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the raw dataset\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " class_labels=customer_churn.class_labels,\n", + ")\n", + "\n", + "# Initialize the training dataset\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "# Initialize the validation dataset\n", + "vm_validation_ds = vm.init_dataset(\n", + " dataset=validation_df,\n", + " input_id=\"validation_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "# Initialize the testing dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=customer_churn.target_column\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "ba677dd7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Running data quality tests\n", + "\n", + "With everything ready to go, let's explore some of ValidMind's available tests to help us assess the 
quality of our datasets. Using ValidMind’s repository of tests streamlines your validation testing, and helps you ensure that your records are being validated appropriately." + ], + "id": "c53c6d35" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Identify qualitative tests\n", + "\n", + "We want to narrow down the tests we want to run from the selection provided by ValidMind, so we'll use the [`vm.tests.list_tasks_and_tags()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) to list which `tags` are associated with each `task` type:\n", + "\n", + "- **`tasks`** represent the kind of modeling task associated with a test. Here we'll focus on `classification` tasks.\n", + "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `data_quality` tag." + ], + "id": "b6acd486" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tasks_and_tags()" + ], + "execution_count": null, + "outputs": [], + "id": "85bc2f85" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then we'll call [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to list all the data quality tests for classification:" + ], + "id": "9881e58a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(\n", + " tags=[\"data_quality\"], task=\"classification\"\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "31b31a51" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Run an individual data quality test\n", + "\n", + "Next, we'll use our previously initialized raw dataset (`vm_raw_dataset`) as input to run an individual test, then log the result to the ValidMind Platform.\n", + "\n", + "- You run validation 
tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module.\n", + "- Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", + "\n", + "Here, we'll use the [`ClassImbalance` test](https://docs.validmind.ai/tests/data_validation/ClassImbalance.html) as an example:" + ], + "id": "d3e27375" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.ClassImbalance\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "dcb9b017" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the output returned indicating that a test-driven block doesn't currently exist in your documentation for some test IDs. </b></span>\n", + "<br></br>\n", + "That's expected: when we run validation tests, the logged results need to be manually added to your report as part of your compliance assessment process within the ValidMind Platform. You'll continue to see this message throughout this notebook as we run and log more tests.</div>" + ], + "id": "f6b7567b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Run data comparison tests\n", + "<span id=\"data-comparison\"></span>\n", + "\n", + "We can also use ValidMind to perform comparison tests between our datasets, again logging the results to the ValidMind Platform. 
Below, we'll perform two sets of comparison tests with a mix of our datasets and the same class imbalance test:\n", + "\n", + "- When running individual tests, you can use a custom **`result_id`** to tag the individual result with a unique identifier, appended to the `test_id` with a `:` separator.\n", + "- We can specify all the tests we'd like to run in a dictionary called `test_config`, and we'll pass in an **`input_grid`** of individual test inputs to compare. In this case, we'll input our two datasets for comparison. Note here that the `input_grid` expects the `input_id` of the dataset as the value rather than the variable name we specified." + ], + "id": "97286c0e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Individual test config with inputs specified\n", + "test_config = {\n", + " # Comparison between training and validation datasets to check if class balance is the same in both sets\n", + " \"validmind.data_validation.ClassImbalance:train_vs_validation\": {\n", + " \"input_grid\": {\"dataset\": [\"train_dataset\", \"validation_dataset\"]}\n", + " },\n", + " # Comparison between training and testing datasets to confirm that both sets have similar class distributions\n", + " \"validmind.data_validation.ClassImbalance:train_vs_test\": {\n", + " \"input_grid\": {\"dataset\": [\"train_dataset\", \"test_dataset\"]},\n", + " },\n", + "}" + ], + "execution_count": null, + "outputs": [], + "id": "d53edde7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then batch run and log our tests in `test_config`:" + ], + "id": "1f1b796b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for t in test_config:\n", + " print(t)\n", + " try:\n", + " # Check if test has input_grid\n", + " if 'input_grid' in test_config[t]:\n", + " # For tests with input_grid, pass the input_grid configuration\n", + " if 'params' in test_config[t]:\n", + " vm.tests.run_test(t, input_grid=test_config[t]['input_grid'], 
params=test_config[t]['params']).log()\n", + " else:\n", + " vm.tests.run_test(t, input_grid=test_config[t]['input_grid']).log()\n", + " else:\n", + " # Original logic for regular inputs\n", + " if 'params' in test_config[t]:\n", + " vm.tests.run_test(t, inputs=test_config[t]['inputs'], params=test_config[t]['params']).log()\n", + " else:\n", + " vm.tests.run_test(t, inputs=test_config[t]['inputs']).log()\n", + " except Exception as e:\n", + " print(f\"Error running test {t}: {str(e)}\")" + ], + "execution_count": null, + "outputs": [], + "id": "1b97e404" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Working with ValidMind models" + ], + "id": "1ca8c343" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Import the champion model\n", + "\n", + "With our raw dataset preprocessed, let's go ahead and import the champion submitted by the development team in the format of a `.pkl` file: **[xgboost_model_champion.pkl](xgboost_model_champion.pkl)**" + ], + "id": "1fd05953" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the champion model\n", + "import joblib\n", + "\n", + "xgboost = joblib.load(\"xgboost_model_champion.pkl\")" + ], + "execution_count": null, + "outputs": [], + "id": "7f18188e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_2__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "In addition to the initialized datasets, you'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our champion.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and 
more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ], + "id": "ee26b0b6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the champion XGBoost model\n", + "vm_xgboost = vm.init_model(\n", + " xgboost,\n", + " input_id=\"xgboost_champion\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "0a799cf2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_3__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", + "\n", + "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", + "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", + "\n", + "If no prediction values are passed, the method will compute predictions automatically:" + ], + "id": "823e49c5" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_xgboost,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_xgboost,\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "71dd8e7b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Running model evaluation tests\n", + "\n", + "With our setup complete, let's run the rest of our validation tests. 
Since we have already verified the data quality of the dataset used to train our champion, we will now focus on evaluating the model's performance." + ], + "id": "2e29df90" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_1__'></a>\n", + "\n", + "### Run model performance tests\n", + "\n", + "First, let's run some performance tests. Use [`vm.tests.list_tests()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to identify all the model performance tests for classification:" + ], + "id": "fc6af0e0" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(tags=[\"model_performance\"], task=\"classification\")" + ], + "execution_count": null, + "outputs": [], + "id": "202792e8" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll isolate the specific tests we want to run in `mpt`, and append an identifier for our champion model here to the `result_id` with a `:` separator like we did above in another test:" + ], + "id": "011b7c09" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "mpt = [\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.ROCCurve:xgboost_champion\"\n", + "]" + ], + "execution_count": null, + "outputs": [], + "id": "9fc18843" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, let's run and log our batch of model performance tests using our testing dataset (`vm_test_ds`) for our champion model:\n", + "\n", + "- The test set serves as a proxy for real-world data, providing an unbiased estimate of model performance since it was not used during training or tuning.\n", + "- The test set also acts as protection against selection bias and model tweaking, giving a final, more unbiased checkpoint." 
+ ], + "id": "52096118" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for test in mpt:\n", + " vm.tests.run_test(\n", + " test,\n", + " inputs={\n", + " \"dataset\": vm_test_ds, \"model\" : vm_xgboost,\n", + " },\n", + " ).log()" + ], + "execution_count": null, + "outputs": [], + "id": "6866b21c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_2__'></a>\n", + "\n", + "### Run diagnostic tests\n", + "\n", + "Next, we want to inspect the robustness and stability of our champion. Use `list_tests()` to list all available diagnosis tests applicable to classification tasks:" + ], + "id": "842707f9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(tags=[\"model_diagnosis\"], task=\"classification\")" + ], + "execution_count": null, + "outputs": [], + "id": "c9b3caa4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let’s now assess the model for potential signs of *overfitting* and identify any sub-segments where performance may be inconsistent.\n", + "\n", + "Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but also noise and random fluctuations, resulting in excellent performance on the training dataset but poor generalization to new, unseen data:\n", + "\n", + "- Since the training dataset (`vm_train_ds`) was used to fit the model, we use this set to establish a baseline performance for how well the model performs on data it has already seen.\n", + "- The testing dataset (`vm_test_ds`) was never seen during training, and here simulates real-world generalization, or how well the model performs on new, unseen data. 
" + ], + "id": "5295d37b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.model_validation.sklearn.OverfitDiagnosis:xgboost_champion\",\n", + " input_grid={\n", + " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", + " \"model\" : [vm_xgboost]\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "82f824f2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's also conduct *robustness* and *stability* tests.\n", + "\n", + "- Robustness evaluates the model’s ability to maintain consistent performance under varying input conditions.\n", + "- Stability assesses whether the model produces consistent outputs across different data subsets or over time.\n", + "\n", + "Again, we'll use both the training and testing datasets to establish baseline performance and to simulate real-world generalization:" + ], + "id": "88db22ed" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.model_validation.sklearn.RobustnessDiagnosis:xgboost_champion\",\n", + " input_grid={\n", + " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", + " \"model\" : [vm_xgboost]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "b2676197" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_3__'></a>\n", + "\n", + "### Run feature importance tests\n", + "\n", + "We also want to verify the relative influence of different input features on our model's predictions. 
Use `list_tests()` to identify all the feature importance tests for classification and store them in `FI`:" + ], + "id": "9226c6ea" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Store the feature importance tests\n", + "FI = vm.tests.list_tests(tags=[\"feature_importance\"], task=\"classification\",pretty=False)\n", + "FI" + ], + "execution_count": null, + "outputs": [], + "id": "9c8c26e6" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll only use our testing dataset (`vm_test_ds`) here, to provide a realistic, unseen sample that mimics future or production data, as the training dataset has already influenced our model during learning:" + ], + "id": "d36a3544" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Run and log our feature importance tests with the testing dataset\n", + "for test in FI:\n", + " vm.tests.run_test(\n", + " \"\".join((test,':xgboost_champion')),\n", + " inputs={\n", + " \"dataset\": vm_test_ds, \"model\": vm_xgboost\n", + " },\n", + " ).log()" + ], + "execution_count": null, + "outputs": [], + "id": "5a49f550" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## In summary\n", + "\n", + "In this notebook, you learned how to:\n", + "\n", + "- [x] Register a record (model) within the ValidMind Platform\n", + "- [x] Install and initialize the ValidMind Library\n", + "- [x] Preview the validation report template for your model\n", + "- [x] Import a sample dataset and champion model\n", + "- [x] Initialize ValidMind datasets and model objects\n", + "- [x] Assign model predictions to your ValidMind model objects\n", + "- [x] Identify and run various validation tests\n", + "\n", + "In a usual validation workflow, you would wrap up your validation testing by verifying that all the tests provided by the development team were run and reported accurately, and perhaps even propose a challenger, comparing the performance of the 
challenger with the running champion.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>With ValidMind, you can easily:</b></span>\n", + "<ul>\n", + " <li>Specify all the tests you'd like to independently rerun, just like you did in the step <a href=\"#run-data-comparison-tests\" style=\"color: #DE257E;\">Run data comparison tests</a></li>\n", + " <li>Evaluate the performance of a challenger against the champion, just like you did in the steps under <a href=\"#running-model-evaluation-tests\" style=\"color: #DE257E;\">Running model evaluation tests</a></li>\n", + "</ul>\n", + "</div>" + ], + "id": "293bf4ca" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your validation report." + ], + "id": "b7fe1ed3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10_1__'></a>\n", + "\n", + "### Work with your validation report\n", + "\n", + "Now that you've logged all your test results and verified the work done by the development team, head to the ValidMind Platform to wrap up your validation report:\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", + "\n", + "2. In the left sidebar that appears for your model, click **Validation** under Documents.\n", + "\n", + "Include your logged test results as evidence, create risk assessment notes, add artifacts, and assess compliance, then submit your report for review when it's ready. 
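(**Learn more:** [Preparing validation reports](https://docs.validmind.ai/guide/validation/preparing-validation-reports.html))" + ], + "id": "1e30826e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "For a more in-depth introduction to using the ValidMind Library for validation, check out our introductory validation series and the accompanying interactive training:\n", + "\n", + "- **[ValidMind for validation](https://docs.validmind.ai/developer/validmind-library.html#validation)**\n", + "- **[Validator Fundamentals](https://docs.validmind.ai/training/validator-fundamentals/validator-fundamentals-register.html)**\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ], + "id": "8511e2f8" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc11__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "58d2d5da" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "upgrade-show-c0a446ff-f26f-4ad0-839a-e92927711798" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "7e76ca12" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ], + "id": "6d3e2933" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 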
+ ], + "id": "8511e2f8" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc11__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "58d2d5da" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "upgrade-show-c0a446ff-f26f-4ad0-839a-e92927711798" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "7e76ca12" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ], + "id": "6d3e2933" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-2427447e4fe348908b3423e86473bfeb" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb index cba7657c3..5b646da58 100644 --- a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb @@ -1,78 +1,88 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "about-intro", - "metadata": {}, - "source": [ - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." + ], + "id": "about-intro" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "about-begin" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about validating records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "about-signup" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**validation report**: A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. 
A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", + "\n", + "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). 
Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ], + "id": "about-concepts" + } + ], + "metadata": { + "language_info": { + "name": "python" + } }, - { - "cell_type": "markdown", - "id": "about-begin", - "metadata": {}, - "source": [ - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "about-signup", - "metadata": {}, - "source": [ - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about validating records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "about-concepts", - "metadata": {}, - "source": [ - "### Key concepts\n", - "\n", - "**Validation report**: A comprehensive and structured assessment of a model’s development and performance, focusing on verifying its integrity, appropriateness, and alignment with its intended use. It includes analyses of model assumptions, data quality, performance metrics, outcomes of testing procedures, and risk considerations. The validation report supports transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", - "\n", - "**Validation report template**: Serves as a standardized framework for conducting and documenting model validation activities. It outlines the required sections, recommended analyses, and expected validation tests, ensuring consistency and completeness across validation reports. The template helps guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. 
Tests are the building blocks of ValidMind, used to evaluate and document models and datasets.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
- ] - } - ], - "metadata": { - "language_info": { - "name": "python" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb index 6f8d378ce..0f95fc0b4 100644 --- a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb +++ b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb @@ -1,523 +1,533 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "821a881e", - "metadata": {}, - "source": [ - "# ValidMind for validation 1 — Set up the ValidMind Library for validation\n", - "\n", - "Learn how to use ValidMind for your end-to-end validation process based on common scenarios with our series of four introductory notebooks. In this first notebook, set up the ValidMind Library in preparation for validating a champion.\n", - "\n", - "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn by doing</b></span>\n", - "<br></br>\n", - "Our course tailor-made for validators new to ValidMind combines this series of notebooks with more a more in-depth introduction to the ValidMind Platform — <a href=\"https://docs.validmind.ai/training/validator-fundamentals/validator-fundamentals-register.html\" style=\"color: #DE257E;\"><b>Validator Fundamentals</b></a></div>" - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ValidMind for validation 1 — Set up the ValidMind Library for validation\n", + "\n", + "Learn how to use ValidMind for your end-to-end validation process based on common 
scenarios with our series of four introductory notebooks. In this first notebook, set up the ValidMind Library in preparation for validating a champion.\n", + "\n", + "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn by doing</b></span>\n", + "<br></br>\n", + "Our course tailor-made for validators new to ValidMind combines this series of notebooks with a more in-depth introduction to the ValidMind Platform — <a href=\"https://docs.validmind.ai/training/validator-fundamentals/validator-fundamentals-register.html\" style=\"color: #DE257E;\"><b>Validator Fundamentals</b></a></div>" + ], + "id": "821a881e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [Introduction](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Register a sample model](#toc3_1__) \n", + " - [Assign validator credentials](#toc3_1_1__) \n", + " - [Apply documentation template](#toc3_1_2__) \n", + " - [Apply validation report template](#toc3_1_3__) \n", + " - [Install the ValidMind Library](#toc3_2__) \n", + " - [Initialize the ValidMind Library](#toc3_3__) \n", + " - [Get your code snippet](#toc3_3_1__) \n", + "- [Getting to know ValidMind](#toc4__) \n", + " - [Preview the validation report template](#toc4_1__) \n", + " - [View validation report in the ValidMind Platform](#toc4_1_1__) \n", + " - [Explore available tests](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "- [In 
summary](#toc6__) \n", + "- [Next steps](#toc7__) \n", + " - [Start the validation process](#toc7_1__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "19ea797c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## Introduction\n", + "\n", + "Validation aims to independently assess the compliance of *champions* created by developers with regulatory guidance by conducting thorough testing and analysis, potentially including the use of challengers to benchmark performance. Assessments, presented in the form of a validation report, typically include *artifacts (findings)* and recommendations to address those issues.\n", + "\n", + "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", + "\n", + "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", + "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." 
+ ], + "id": "d624f88d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." + ], + "id": "4fb1ef5a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "594f9fd4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "262ed111" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**validation report**: A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. 
A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", + "\n", + "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). 
Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ], + "id": "0eb67fe9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ], + "id": "e0e1cf3d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Register a sample model\n", + "\n", + "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", + "\n", + "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "609fe59b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_1__'></a>\n", + "\n", + "#### Assign validator credentials\n", + "\n", + "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", + "\n", + "1. Remove yourself as an owner:\n", + "\n", + " - Click on the **OWNERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "2. Remove yourself as a developer:\n", + "\n", + " - Click on the **DEVELOPERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "3. Add yourself as a validator:\n", + "\n", + " - Click on the **VALIDATORS** tile.\n", + " - Select your name from the drop-down menu.\n", + " - Click **Save** to apply your changes to that role." + ], + "id": "58e552bb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier for developers.\n", + "\n", + "We'll need this documentation template later for reference as we draft our validation report:\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Documentation**.\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "84251589" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_3__'></a>\n", + "\n", + "#### Apply validation report template\n", + "\n", + "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", + "\n", + " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", + "\n", + "3. Click **Use Template** to apply the template." 
+ ], + "id": "fdfb5dc5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ], + "id": "f656d0d6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "931d8f7f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "1435fd5b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_1__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "b375b341" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"validation-report\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "d5d87e2d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Getting to know ValidMind" + ], + "id": "331e1c07" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preview the validation report template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will attach evidence to this template in the form of risk assessment notes, artifacts, and test results later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library:" + ], + "id": "f6331a98" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "13d34bbb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1_1__'></a>\n", + "\n", + "#### View validation report in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for validation\" series of notebooks.\n", + "\n", + "3. Click **Validation** under Documents for your model and note:\n", + "\n", + " - [x] The risk assessment compliance summary at the top of the report (screenshot below)\n", + " - [x] How the structure of the validation report reflects the previewed template\n", + "\n", + " <img src= \"compliance-summary.png\" alt=\"Screenshot showing the risk assessment compliance summary\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", + " <br><br>" + ], + "id": "20717133" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Explore available tests\n", + "\n", + "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll later narrow down the tests we want to run from this list when we learn to run tests." 
+ ], + "id": "f5d0aaab" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests()" + ], + "execution_count": null, + "outputs": [], + "id": "de6abc2a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "dce47e40" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "10272aa9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "7a0c3cc2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for the changes to take effect." 
+ ], + "id": "2dac11d5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## In summary\n", + "\n", + "In this first notebook, you learned how to:\n", + "\n", + "- [x] Register a record (model) within the ValidMind Platform and assign yourself as the validator\n", + "- [x] Install and initialize the ValidMind Library\n", + "- [x] Preview the validation report template for your model\n", + "- [x] Explore the available tests offered by the ValidMind Library" + ], + "id": "174d2c8d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Start the validation process\n", + "\n", + "Now that the ValidMind Library is connected to your model in the ValidMind Platform with the correct template applied, we can go ahead and start the validation process: **[2 — Start the validation process](2-start_validation_process.ipynb)**" + ], + "id": "d8ffdcf7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-5d7a1c159e4840fca79011d1c0380725" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.10.13" + } }, - { - "cell_type": "markdown", - "id": "19ea797c", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [Introduction](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Register a sample model](#toc3_1__) \n", - " - [Assign validator credentials](#toc3_1_1__) \n", - " - [Apply documentation template](#toc3_1_2__) \n", - " - [Apply validation report template](#toc3_1_3__) \n", - " - [Install the ValidMind Library](#toc3_2__) \n", - " - [Initialize the ValidMind Library](#toc3_3__) \n", - " - [Get your code snippet](#toc3_3_1__) \n", - "- [Getting to know ValidMind](#toc4__) \n", - " - [Preview the validation report template](#toc4_1__) \n", - " - [View validation report in the ValidMind Platform](#toc4_1_1__) \n", - " - [Explore available tests](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "- [In summary](#toc6__) \n", - "- [Next steps](#toc7__) \n", - " - [Start the validation process](#toc7_1__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "d624f88d", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## Introduction\n", - "\n", - "Validation aims to independently assess the compliance of *champions* created by developers with regulatory guidance by conducting thorough testing and analysis, potentially including the use of challengers to benchmark performance. Assessments, presented in the form of a validation report, typically include *artifacts (findings)* and recommendations to address those issues.\n", - "\n", - "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", - "\n", - "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", - "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." - ] - }, - { - "cell_type": "markdown", - "id": "4fb1ef5a", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." 
- ] - }, - { - "cell_type": "markdown", - "id": "594f9fd4", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "262ed111", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "0eb67fe9", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Validation report**: A comprehensive and structured assessment of a model’s development and performance, focusing on verifying its integrity, appropriateness, and alignment with its intended use. It includes analyses of model assumptions, data quality, performance metrics, outcomes of testing procedures, and risk considerations. The validation report supports transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", - "\n", - "**Validation report template**: Serves as a standardized framework for conducting and documenting model validation activities. It outlines the required sections, recommended analyses, and expected validation tests, ensuring consistency and completeness across validation reports. The template helps guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. 
Tests are the building blocks of ValidMind, used to evaluate and document models and datasets.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
- ] - }, - { - "cell_type": "markdown", - "id": "e0e1cf3d", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "609fe59b", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Register a sample model\n", - "\n", - "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", - "\n", - "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "58e552bb", - "metadata": {}, - "source": [ - "<a id='toc3_1_1__'></a>\n", - "\n", - "#### Assign validator credentials\n", - "\n", - "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", - "\n", - "1. 
Remove yourself as an owner:\n", - "\n", - " - Click on the **OWNERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "2. Remove yourself as a developer:\n", - "\n", - " - Click on the **DEVELOPERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "3. Add yourself as a validator:\n", - "\n", - " - Click on the **VALIDATORS** tile.\n", - " - Select your name from the drop-down menu.\n", - " - Click **Save** to apply your changes to that role." - ] - }, - { - "cell_type": "markdown", - "id": "84251589", - "metadata": {}, - "source": [ - "<a id='toc3_1_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier for developers.\n", - "\n", - "We'll need this documentation template later for reference as we draft our validation report:\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Documentation**.\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "fdfb5dc5", - "metadata": {}, - "source": [ - "<a id='toc3_1_3__'></a>\n", - "\n", - "#### Apply validation report template\n", - "\n", - "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", - "\n", - " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "f656d0d6", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "931d8f7f", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "1435fd5b", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "b375b341", - "metadata": {}, - "source": [ - "<a id='toc3_3_1__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. 
On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d5d87e2d", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"validation-report\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "331e1c07", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "f6331a98", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Preview the validation report template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will attach evidence to this template in the form of risk assessment notes, artifacts, and test results later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "13d34bbb", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "20717133", - "metadata": {}, - "source": [ - "<a id='toc4_1_1__'></a>\n", - "\n", - "#### View validation report in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for validation\" series of notebooks.\n", - "\n", - "3. Click **Validation** under Documents for your model and note:\n", - "\n", - " - [x] The risk assessment compliance summary at the top of the report (screenshot below)\n", - " - [x] How the structure of the validation report reflects the previewed template\n", - "\n", - " <img src= \"compliance-summary.png\" alt=\"Screenshot showing the risk assessment compliance summary\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", - " <br><br>" - ] - }, - { - "cell_type": "markdown", - "id": "f5d0aaab", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Explore available tests\n", - "\n", - "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll later narrow down the tests we want to run from this list when we learn to run tests." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "de6abc2a", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests()" - ] - }, - { - "cell_type": "markdown", - "id": "dce47e40", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "10272aa9", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "7a0c3cc2", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "2dac11d5", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." 
- ] - }, - { - "cell_type": "markdown", - "id": "174d2c8d", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## In summary\n", - "\n", - "In this first notebook, you learned how to:\n", - "\n", - "- [x] Register a record (model) within the ValidMind Platform and assign yourself as the validator\n", - "- [x] Install and initialize the ValidMind Library\n", - "- [x] Preview the validation report template for your model\n", - "- [x] Explore the available tests offered by the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "d8ffdcf7", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Start the validation process\n", - "\n", - "Now that the ValidMind Library is connected to your model in the ValidMind Library with the correct template applied, we can go ahead and start the validation process: **[2 — Start the validation process](2-start_validation_process.ipynb)**" - ] - }, - { - "cell_type": "markdown", - "id": "copyright-5d7a1c159e4840fca79011d1c0380725", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/notebooks/use_cases/validation/validate_application_scorecard.ipynb b/notebooks/use_cases/validation/validate_application_scorecard.ipynb index 5eb6fd0b3..cf5216ca3 100644 --- a/notebooks/use_cases/validation/validate_application_scorecard.ipynb +++ b/notebooks/use_cases/validation/validate_application_scorecard.ipynb @@ -1,1883 +1,1893 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Validate an application scorecard model\n", - "\n", - "Learn how to independently assess an application scorecard model developed using the ValidMind Library as a validator. 
You'll evaluate the development of the model by conducting thorough testing and analysis, including the use of challenger models to benchmark performance.\n", - "\n", - "An *application scorecard model* is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant such as credit history, income, employment status, and other relevant financial data.\n", - "\n", - " - This score assists lenders in making informed decisions about whether to approve or reject loan applications, as well as in determining the terms of the loan, including interest rates and credit limits.\n", - " - Effective validation of application scorecard models ensures that lenders can manage risk efficiently while maintaining a fast and transparent loan application process for applicants.\n", - "\n", - "This interactive notebook provides a step-by-step guide for:\n", - "\n", - "- Verifying the data quality steps performed by the development team\n", - "- Independently replicating the champion's results and conducting additional tests to assess performance, stability, and robustness\n", - "- Setting up test inputs and challenger models for comparative analysis\n", - "- Running validation tests, analyzing results, and logging artifacts (findings) to ValidMind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Register a sample model](#toc2_1__) \n", - " - [Assign validator credentials](#toc2_1_1__) \n", - " - [Apply validation report template](#toc2_1_2__) \n", - " - [Install the ValidMind Library](#toc2_2__) \n", - " - [Initialize the ValidMind Library](#toc2_3__) \n", - " - [Get your code 
snippet](#toc2_3_1__) \n", - " - [Importing the champion model](#toc2_4__) \n", - " - [Load the sample dataset](#toc2_5__) \n", - " - [Preprocess the dataset](#toc2_5_1__) \n", - " - [Apply feature engineering to the dataset](#toc2_5_2__) \n", - " - [Split the feature engineered dataset](#toc2_6__) \n", - "- [Developing potential challenger models](#toc3__) \n", - " - [Train potential challenger models](#toc3_1__) \n", - " - [Random forest classification model](#toc3_1_1__) \n", - " - [Logistic regression model](#toc3_1_2__) \n", - " - [Extract predicted probabilities](#toc3_2__) \n", - " - [Compute binary predictions](#toc3_2_1__) \n", - "- [Initializing the ValidMind objects](#toc4__) \n", - " - [Initialize the ValidMind datasets](#toc4_1__) \n", - " - [Initialize the ValidMind models](#toc4_2__) \n", - " - [Assign predictions](#toc4_3__) \n", - " - [Compute credit risk scores](#toc4_4__) \n", - "- [Running data quality tests](#toc5__) \n", - " - [Identify relevant data quality tests](#toc5_1__) \n", - " - [Run and log an individual data quality test](#toc5_2__) \n", - " - [Log multiple data quality tests](#toc5_3__) \n", - " - [Run data quality comparison tests](#toc5_4__) \n", - "- [Running performance tests](#toc6__) \n", - " - [Identify relevant performance tests](#toc6_1__) \n", - " - [Run and log an individual performance test](#toc6_2__) \n", - " - [Log multiple performance tests](#toc6_3__) \n", - " - [Evaluate performance of the champion model](#toc6_4__) \n", - " - [Evaluate performance of challenger models](#toc6_5__) \n", - " - [Enable custom context for test descriptions](#toc6_5_1__) \n", - " - [Run performance comparison tests](#toc6_5_2__) \n", - "- [Adjust a ValidMind test](#toc7__) \n", - "- [Run diagnostic tests](#toc8__) \n", - "- [Run feature importance tests](#toc9__) \n", - "- [Implement a custom test](#toc10__) \n", - "- [Verify test runs](#toc11__) \n", - "- [Next steps](#toc12__) \n", - " - [Work with your validation report](#toc12_1__) 
\n", - " - [Discover more learning resources](#toc12_2__) \n", - "- [Upgrade ValidMind](#toc13__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about validating records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Validation report**: A comprehensive and structured assessment of a model’s development and performance, focusing on verifying its integrity, appropriateness, and alignment with its intended use. It includes analyses of model assumptions, data quality, performance metrics, outcomes of testing procedures, and risk considerations. The validation report supports transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", - "\n", - "**Validation report template**: Serves as a standardized framework for conducting and documenting model validation activities. It outlines the required sections, recommended analyses, and expected validation tests, ensuring consistency and completeness across validation reports. 
The template helps guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. 
Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Register a sample model\n", - "\n", - "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", - "\n", - "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1_1__'></a>\n", - "\n", - "#### Assign validator credentials\n", - "\n", - "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", - "\n", - "1. Remove yourself as an owner:\n", - "\n", - " - Click on the **OWNERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "2. Remove yourself as a developer:\n", - "\n", - " - Click on the **DEVELOPERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "3. Add yourself as a validator:\n", - "\n", - " - Click on the **VALIDATORS** tile.\n", - " - Select your name from the drop-down menu.\n", - " - Click **Save** to apply your changes to that role." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1_2__'></a>\n", - "\n", - "#### Apply validation report template\n", - "\n", - "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", - "\n", - " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3_1__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"validation-report\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Importing the champion model\n", - "\n", - "With the ValidMind Library set up and ready to go, let's go ahead and import the champion submitted by the development team in the format of a `.pkl` file: **[xgb_model_champion.pkl](xgb_model_champion.pkl)**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "\n", - "#Load the saved model\n", - "xgb_model = xgb.XGBClassifier()\n", - "xgb_model.load_model(\"xgb_model_champion.pkl\")\n", - "xgb_model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Ensure that we have to appropriate order in feature names from Champion model and dataset\n", - "cols_when_model_builds = xgb_model.get_booster().feature_names" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_5__'></a>\n", - "\n", - "### Load the sample dataset\n", - "\n", - "Let's next import the public [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) dataset from Kaggle, which was used to develop the dummy champion model.\n", - "\n", - "- 
We'll use this dataset to review steps that should have been conducted during the initial development and documentation of the model to ensure that the model was built correctly.\n", - "- By independently performing steps such as preprocessing and feature engineering, we can confirm whether the model was built using appropriate and properly processed data.\n", - "\n", - "To be able to use the dataset, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.credit_risk import lending_club\n", - "\n", - "df = lending_club.load_data(source=\"offline\")\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_5_1__'></a>\n", - "\n", - "#### Preprocess the dataset\n", - "\n", - "We'll first quickly preprocess the dataset for data quality testing purposes using `lending_club.preprocess`. 
This function performs the following operations:\n", - "\n", - "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", - "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", - "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", - "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = lending_club.preprocess(df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_5_2__'></a>\n", - "\n", - "#### Apply feature engineering to the dataset\n", - "\n", - "Feature engineering improves the dataset's structure to better match what our model expects, and ensures that the model performs optimally by leveraging additional insights from raw data.\n", - "\n", - "We'll apply the following transformations using the `ending_club.feature_engineering()` function to optimize the dataset for predictive modeling in our application scorecard:\n", - "\n", - "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It calculates the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", - "- **Integration of WoE bins**: Ensures that the WoE transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation. 
This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fe_df = lending_club.feature_engineering(preprocess_df)\n", - "fe_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_6__'></a>\n", - "\n", - "### Split the feature engineered dataset\n", - "\n", - "With our dummy model imported and our independently preprocessed and feature engineered dataset ready to go, let's now **spilt our dataset into train and test** to start the validation testing process.\n", - "\n", - "Splitting our dataset into training and testing is essential for proper validation testing, as this helps assess how well the model generalizes to unseen data:\n", - "\n", - "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`).\n", - "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the data\n", - "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", - "\n", - "x_train = train_df.drop(lending_club.target_column, axis=1)\n", - "y_train = train_df[lending_club.target_column]\n", - "\n", - "x_test = test_df.drop(lending_club.target_column, axis=1)\n", - "y_test = test_df[lending_club.target_column]\n", - "\n", - "# Now let's apply the order of features from the champion model construction\n", - "x_train = x_train[cols_when_model_builds]\n", - "x_test = x_test[cols_when_model_builds]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cols_use = ['annual_inc_woe',\n", - " 'verification_status_woe',\n", - " 'emp_length_woe',\n", - " 'installment_woe',\n", - " 'term_woe',\n", - " 'home_ownership_woe',\n", - " 'purpose_woe',\n", - " 'open_acc_woe',\n", - " 'total_acc_woe',\n", - " 'int_rate_woe',\n", - " 'sub_grade_woe',\n", - " 'grade_woe','loan_status']\n", - "\n", - "\n", - "train_df = train_df[cols_use]\n", - "test_df = test_df[cols_use]\n", - "test_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Developing potential challenger models" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Train potential challenger models\n", - "\n", - "We're curious how alternate models compare to our champion model, so let's train two challenger models as basis for our testing.\n", - "\n", - "Our selected options below offer decreased complexity in terms of implementation — such as lessened manual preprocessing — which can reduce the amount of risk for implementation. 
However, model risk is not calculated in isolation from a single factor, but rather in consideration with trade-offs in predictive performance, ease of interpretability, and overall alignment with business objectives." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1_1__'></a>\n", - "\n", - "#### Random forest classification model\n", - "\n", - "A *random forest classification model* is an ensemble machine learning algorithm that uses multiple decision trees to classify data. In ensemble learning, multiple models are combined to improve prediction accuracy and robustness.\n", - "\n", - "Random forest classification models generally have higher accuracy because they capture complex, non-linear relationships, but as a result they lack transparency in their predictions." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the Random Forest Classification model\n", - "from sklearn.ensemble import RandomForestClassifier\n", - "\n", - "# Create the model instance with 50 decision trees\n", - "rf_model = RandomForestClassifier(\n", - " n_estimators=50,\n", - " random_state=42,\n", - ")\n", - "\n", - "# Train the model\n", - "rf_model.fit(x_train, y_train)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1_2__'></a>\n", - "\n", - "#### Logistic regression model\n", - "\n", - "A *logistic regression model* is a statistical machine learning algorithm that uses a linear equation (straight-line relationship between variables) and the logistic function (or sigmoid function, which maps any real-valued number to a range between `0` and `1`) to classify data. 
In statistical modeling, a single equation is used to estimate the probability of an outcome based on input features.\n", - "\n", - "Logistic regression models are simple and interpretable because they provide clear probability estimates and feature coefficients (numerical value that represents the influence of a particular input feature on the model's prediction), but they may struggle with capturing complex, non-linear relationships in the data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the Logistic Regression model\n", - "from sklearn.linear_model import LogisticRegression\n", - "\n", - "# Logistic Regression grid params\n", - "log_reg_params = {\n", - " \"penalty\": [\"l1\", \"l2\"],\n", - " \"C\": [0.001, 0.01, 0.1, 1, 10, 100, 1000],\n", - " \"solver\": [\"liblinear\"],\n", - "}\n", - "\n", - "# Grid search for Logistic Regression\n", - "from sklearn.model_selection import GridSearchCV\n", - "\n", - "grid_log_reg = GridSearchCV(LogisticRegression(), log_reg_params)\n", - "grid_log_reg.fit(x_train, y_train)\n", - "\n", - "# Logistic Regression best estimator\n", - "log_reg = grid_log_reg.best_estimator_\n", - "log_reg" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Extract predicted probabilities\n", - "\n", - "With our challenger models trained, let's extract the predicted probabilities from our three models:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Champion — Application scorecard model\n", - "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", - "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", - "\n", - "# Challenger — Random forest classification model\n", - "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", - "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]\n", - "\n", - "# Challenger — Logistic 
regression model\n", - "train_log_prob = log_reg.predict_proba(x_train)[:, 1]\n", - "test_log_prob = log_reg.predict_proba(x_test)[:, 1]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Compute binary predictions\n", - "\n", - "Next, we'll convert the probability predictions from our three models into a binary, based on a threshold of `0.3`:\n", - "\n", - "- If the probability is greater than `0.3`, the prediction becomes `1` (positive).\n", - "- Otherwise, it becomes `0` (negative)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cut_off_threshold = 0.3\n", - "\n", - "# Champion — Application scorecard model\n", - "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", - "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", - "\n", - "# Challenger — Random forest classification model\n", - "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", - "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)\n", - "\n", - "# Challenger — Logistic regression model\n", - "train_log_binary_predictions = (train_log_prob > cut_off_threshold).astype(int)\n", - "test_log_binary_predictions = (test_log_prob > cut_off_threshold).astype(int)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Initializing the ValidMind objects" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you'll need to connect your data with a ValidMind `Dataset` object. 
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the raw dataset\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "# Initialize the preprocessed dataset\n", - "vm_preprocess_dataset = vm.init_dataset(\n", - " dataset=preprocess_df,\n", - " input_id=\"preprocess_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "# Initialize the feature engineered dataset\n", - "vm_fe_dataset = vm.init_dataset(\n", - " dataset=fe_df,\n", - " input_id=\"fe_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "# Initialize the training dataset\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "# Initialize the test dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, 
- "source": [ - "After initialization, you can pass the ValidMind `Dataset` objects `vm_raw_dataset`, `vm_preprocess_dataset`, `vm_fe_dataset`, `vm_train_ds`, and `vm_test_ds` into any ValidMind tests." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Initialize the ValidMind models\n", - "\n", - "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other functions for analysis and tests on the data for each of our three models.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model objects with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the champion application scorecard model\n", - "vm_xgb_model = vm.init_model(\n", - " xgb_model,\n", - " input_id=\"xgb_model_developer_champion\",\n", - ")\n", - "\n", - "# Initialize the challenger random forest classification model\n", - "vm_rf_model = vm.init_model(\n", - " rf_model,\n", - " input_id=\"rf_model\",\n", - ")\n", - "\n", - "# Initialize the challenger logistic regression model\n", - "vm_log_model = vm.init_model(\n", - " log_reg,\n", - " input_id=\"log_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "With our models registered, we'll move on to assigning both 
the predictive probabilities coming directly from each model's predictions, and the binary prediction after applying the cutoff threshold described in the Compute binary predictions step above.\n", - "\n", - "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset.assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", - "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Champion — Application scorecard model\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=train_xgb_binary_predictions,\n", - " prediction_probabilities=train_xgb_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=test_xgb_binary_predictions,\n", - " prediction_probabilities=test_xgb_prob,\n", - ")\n", - "\n", - "# Challenger — Random forest classification model\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=train_rf_binary_predictions,\n", - " prediction_probabilities=train_rf_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=test_rf_binary_predictions,\n", - " prediction_probabilities=test_rf_prob,\n", - ")\n", - "\n", - "\n", - "# Challenger — Logistic regression model\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_log_model,\n", - " prediction_values=train_log_binary_predictions,\n", - " prediction_probabilities=train_log_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_log_model,\n", - " prediction_values=test_log_binary_predictions,\n", - " prediction_probabilities=test_log_prob,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": 
[ - "<a id='toc4_4__'></a>\n", - "\n", - "### Compute credit risk scores\n", - "\n", - "Finally, we'll translate model predictions into actionable scores using probability estimates generated by our trained model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Compute the scores\n", - "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", - "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", - "train_rf_scores = lending_club.compute_scores(train_rf_prob)\n", - "test_rf_scores = lending_club.compute_scores(test_rf_prob)\n", - "train_log_scores = lending_club.compute_scores(train_log_prob)\n", - "test_log_scores = lending_club.compute_scores(test_log_prob)\n", - "\n", - "# Assign scores to the datasets\n", - "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", - "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)\n", - "vm_train_ds.add_extra_column(\"rf_scores\", train_rf_scores)\n", - "vm_test_ds.add_extra_column(\"rf_scores\", test_rf_scores)\n", - "vm_train_ds.add_extra_column(\"log_scores\", train_log_scores)\n", - "vm_test_ds.add_extra_column(\"log_scores\", test_log_scores)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Running data quality tests\n", - "\n", - "With everything ready to go, let's explore some of ValidMind's available tests. Using ValidMind’s repository of tests streamlines your validation testing, and helps you ensure that your records are being validated appropriately." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Identify relevant data quality tests\n", - "\n", - "We want to narrow down the tests we want to run from the selection provided by ValidMind, so we'll use the [`vm.tests.list_tasks_and_tags()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) to list which `tags` are associated with each `task` type:\n", - "\n", - "- **`tasks`** represent the kind of modeling task associated with a test. Here we'll focus on `classification` tasks.\n", - "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `data_quality` tag." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tasks_and_tags()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Then we'll call [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to list all the data quality tests for classification:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(\n", - " tags=[\"data_quality\"], task=\"classification\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about navigating ValidMind tests?</b></span>\n", - "<br></br>\n", - "Refer to our notebook outlining the utilities available for viewing and understanding available ValidMind tests: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" 
style=\"color: #DE257E;\"><b>Explore tests</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Run and log an individual data quality test\n", - "\n", - "Next, we'll use our previously initialized preprocessed dataset (`vm_preprocess_dataset`) as input to run an individual test, then log the result to the ValidMind Platform.\n", - "\n", - "- You run validation tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module.\n", - "- Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", - "\n", - "Here, we'll use the [`HighPearsonCorrelation` test](https://docs.validmind.ai/tests/data_validation/HighPearsonCorrelation.html) as an example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.HighPearsonCorrelation\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the output returned indicating that a test-driven block doesn't currently exist in your documentation for some test IDs. </b></span>\n", - "<br></br>\n", - "That's expected, as when we run validations tests the results logged need to be manually added to your report as part of your compliance assessment process within the ValidMind Platform. 
You'll continue to see this message throughout this notebook as we run and log more tests.</div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_3__'></a>\n", - "\n", - "### Log multiple data quality tests\n", - "\n", - "Now that we understand how to run a test with ValidMind, we want to run all the tests that were returned for our `classification` tasks focusing on `data_quality`.\n", - "\n", - "We'll store the identified tests in `dq` in preparation for batch running these tests and logging their results to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "dq = vm.tests.list_tests(tags=[\"data_quality\"], task=\"classification\",pretty=False)\n", - "dq" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "With our data quality tests stored, let's run our first batch of tests using the same preprocessed dataset (`vm_preprocess_dataset`) and log their results." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for test in dq:\n", - " vm.tests.run_test(\n", - " test,\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " }\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_4__'></a>\n", - "\n", - "### Run data quality comparison tests\n", - "\n", - "Next, let's reuse the tests in `dq` to perform comparison tests between the raw (`vm_raw_dataset`) and preprocessed (`vm_preprocess_dataset`) dataset, again logging the results to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for test in dq:\n", - " vm.tests.run_test(\n", - " test,\n", - " input_grid={\n", - " \"dataset\": [vm_raw_dataset,vm_preprocess_dataset]\n", - " }\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Running performance tests\n", - "\n", - "We'll also run some performance tests, beginning with independent testing of our champion application scorecard model, then moving on to our potential challenger models." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Identify relevant performance tests\n", - "\n", - "Use `vm.tests.list_tests()` to this time identify all the model performance tests for classification:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "vm.tests.list_tests(tags=[\"model_performance\"], task=\"classification\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Run and log an individual performance test\n", - "\n", - "Before we run our batch of performance tests, we'll use our previously initialized testing dataset (`vm_test_ds`) as input to run an individual test, then log the result to the ValidMind Platform.\n", - "\n", - "When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
We'll append an identifier for our champion model here (`xgboost_champion`):\n", - "\n", - "Here, we'll use the [`ClassifierPerformance` test](https://docs.validmind.ai/tests/model_validation/sklearn/ClassifierPerformance.html) as an example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds, \"model\" : vm_xgb_model\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Log multiple performance tests\n", - "\n", - "We only want to run a few other tests that were returned for our `classification` tasks focusing on `model_performance`, so we'll isolate the specific tests we want to batch run in `mpt`:\n", - "\n", - "- `ClassifierPerformance`\n", - "- [`ConfusionMatrix`](https://docs.validmind.ai/tests/model_validation/sklearn/ConfusionMatrix.html)\n", - "- [`MinimumAccuracy`](https://docs.validmind.ai/tests/model_validation/sklearn/MinimumAccuracy.html)\n", - "- [`MinimumF1Score`](https://docs.validmind.ai/tests/model_validation/sklearn/MinimumF1Score.html)\n", - "- [`ROCCurve`](https://docs.validmind.ai/tests/model_validation/sklearn/ROCCurve.html)\n", - "\n", - "Note the custom `result_id`s appended to the `test_id`s for our champion model (`xgboost_champion`):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "mpt = [\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion\",\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy:xgboost_champion\",\n", - " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion\",\n", - " 
\"validmind.model_validation.sklearn.ROCCurve:xgboost_champion\"\n", - "]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4__'></a>\n", - "\n", - "### Evaluate performance of the champion model\n", - "\n", - "Now, let's run and log our batch of model performance tests using our testing dataset (`vm_test_ds`) for our champion model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for test in mpt:\n", - " vm.tests.run_test(\n", - " test,\n", - " inputs={\n", - " \"dataset\": vm_test_ds, \"model\" : vm_xgb_model\n", - " },\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_5__'></a>\n", - "\n", - "### Evaluate performance of challenger models\n", - "\n", - "We've now conducted similar tests as the development team for our champion, with the aim of verifying their test results.\n", - "\n", - "Next, let's see how our challenger models compare. 
We'll use the same batch of tests here as we did in `mpt`, but append a different `result_id` to indicate that these results should be associated with our challenger models:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "mpt_chall = [\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion_vs_challengers\",\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion_vs_challengers\",\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy:xgboost_champion_vs_challengers\",\n", - " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion_vs_challengers\",\n", - " \"validmind.model_validation.sklearn.ROCCurve:xgboost_champion_vs_challengers\"\n", - "]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_5_1__'></a>\n", - "\n", - "#### Enable custom context for test descriptions" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "When you run ValidMind tests, test descriptions are automatically generated with LLM using the test results, the test name, and the static test definitions provided in the test’s docstring. While this metadata offers valuable high-level overviews of tests, insights produced by the LLM-based descriptions may not always align with your specific use cases or incorporate organizational policy requirements.\n", - "\n", - "Before we run our next batch of tests, we'll include some custom use case context to focus on comparison testing going forward, improving the relevancy, insight, and format of the test descriptions returned. By default, custom context for LLM-generated descriptions is disabled, meaning that the output will not include any additional context. 
To enable custom use case context, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`.\n", - "\n", - "This is a global setting that will affect all tests for your linked model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Enabling use case context allows you to pass in additional context to the LLM-generated text descriptions within `context`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", - "\n", - "context = \"\"\"\n", - "FORMAT FOR THE LLM DESCRIPTIONS: \n", - " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", - " extracted from the test description>.\n", - "\n", - " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", - " Include any relevant formulas or methodologies mentioned in the test description.>\n", - "\n", - " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", - " highlighting what makes it particularly useful for specific scenarios.>\n", - "\n", - " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", - " Include both technical limitations and interpretation challenges. 
\n", - " If the test description includes specific signs of high risk, incorporate these here.>\n", - "\n", - " **Key Insights:**\n", - "\n", - " The test results reveal:\n", - "\n", - " - **<insight title>**: <comprehensive description of one aspect of the results>\n", - " - **<insight title>**: <comprehensive description of another aspect>\n", - " ...\n", - "\n", - " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", - " purpose and provides any final recommendations or considerations.>\n", - "\n", - "ADDITIONAL INSTRUCTIONS:\n", - "\n", - " The champion model as the basis for comparison is called \"xgb_model_developer_champion\" and emphasis should be on the following:\n", - " - The metrics for the champion model compared against the challenger models\n", - " - Which model potentially outperforms the champion model based on the metrics, this should be highlighted and emphasized\n", - "\n", - "\n", - " For each metric in the test results, include in the test overview:\n", - " - The metric's purpose and what it measures\n", - " - Its mathematical formula\n", - " - The range of possible values\n", - " - What constitutes good/bad performance\n", - " - How to interpret different values\n", - "\n", - " Each insight should progressively cover:\n", - " 1. Overall scope and distribution\n", - " 2. Complete breakdown of all elements with specific values\n", - " 3. Natural groupings and patterns\n", - " 4. Comparative analysis between datasets/categories\n", - " 5. Stability and variations\n", - " 6. 
Notable relationships or dependencies\n", - "\n", - " Remember:\n", - " - Champion model (xgb_model_developer_champion) is the selection and challenger models are used to challenge the selection\n", - " - Keep all insights at the same level (no sub-bullets or nested structures)\n", - " - Make each insight complete and self-contained\n", - " - Include specific numerical values and ranges\n", - " - Cover all elements in the results comprehensively\n", - " - Maintain clear, concise language\n", - " - Use only \"- **Title**: Description\" format for insights\n", - " - Progress naturally from general to specific observations\n", - "\n", - "\"\"\".strip()\n", - "\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about setting custom context for LLM-generated test descriptions?</b></span>\n", - "<br></br>\n", - "Refer to our extended walkthrough notebook: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/customize_test_result_descriptions.html\" style=\"color: #DE257E;\"><b>Add context to LLM-generated test descriptions\n", - "</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_5_2__'></a>\n", - "\n", - "#### Run performance comparison tests\n", - "\n", - "With the use case context set, we'll run each test in `mpt_chall` once for each model with the same `vm_test_ds` dataset to compare them:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for test in mpt_chall:\n", - " vm.tests.run_test(\n", - " test,\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds], \"model\" : 
[vm_xgb_model,vm_log_model,vm_rf_model]\n", - " }\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Based on the performance metrics, we can conclude that the random forest classification model is not a viable candidate for our use case and can be disregarded in our tests going forward.</b></span>\n", - "<br></br>\n", - "In the next section, we'll dive a bit deeper into some tests comparing our champion application scorecard model and our remaining challenger logistic regression model, including tests that will allow us to customize parameters and thresholds for performance standards.</div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Adjust a ValidMind test\n", - "\n", - "Let's dig deeper into the `MinimumF1Score` test we ran previously in Run performance tests to ensure that the models maintain a minimum acceptable balance between *precision* and *recall*. 
Precision refers to how many out of the positive predictions made by the model were actually correct, and recall refers to how many out of the actual positive cases did the model correctly identify.\n", - "\n", - "Use `run_test()` with our testing dataset (`vm_test_ds`) to run the test in isolation again for our two remaining models without logging the result to have the output to compare with a subsequent iteration:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion_vs_challengers\",\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds],\n", - " \"model\": [vm_xgb_model, vm_log_model]\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As `MinimumF1Score` allows us to customize parameters and thresholds for performance standards, let's adjust the threshold to see if it improves metrics:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumF1Score:AdjThreshold\",\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds],\n", - " \"model\": [vm_xgb_model, vm_log_model],\n", - " \"params\": {\"min_threshold\": 0.35}\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Run diagnostic tests\n", - "\n", - "Next, we want to inspect the robustness and stability testing comparison between our champion and challenger model.\n", - "\n", - "Use `list_tests()` to list all available diagnosis tests applicable to classification tasks:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(tags=[\"model_diagnosis\"], task=\"classification\")" - ] - }, - { - "cell_type": 
"markdown", - "metadata": {}, - "source": [ - "Let's see if models suffer from any *overfit* potentials and also where there are potential sub-segments of issues with the [`OverfitDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/OverfitDiagnosis.html). \n", - "\n", - "Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but noise and random fluctuations resulting in excellent performance on the training dataset but poor generalization to new, unseen data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.model_validation.sklearn.OverfitDiagnosis:Champion_vs_LogRegression\",\n", - " input_grid={\n", - " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", - " \"model\" : [vm_xgb_model,vm_log_model]\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's also conduct *robustness* and *stability* testing of the two models with the [`RobustnessDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/RobustnessDiagnosis.html).\n", - "\n", - "Robustness refers to a model's ability to maintain consistent performance, and stability refers to a model's ability to produce consistent outputs over time across different data subsets." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.model_validation.sklearn.RobustnessDiagnosis:Champion_vs_LogRegression\",\n", - " input_grid={\n", - " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", - " \"model\" : [vm_xgb_model,vm_log_model]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Run feature importance tests\n", - "\n", - "We also want to verify the relative influence of different input features on our models' predictions, as well as inspect the differences between our champion and challenger model to see if a certain model offers more understandable or logical importance scores for features.\n", - "\n", - "Use `list_tests()` to identify all the feature importance tests for classification:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Store the feature importance tests\n", - "FI = vm.tests.list_tests(tags=[\"feature_importance\"], task=\"classification\",pretty=False)\n", - "FI" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Run and log our feature importance tests for both models for the testing dataset\n", - "for test in FI:\n", - " vm.tests.run_test(\n", - " \"\".join((test,':Champion_vs_LogisticRegression')),\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds], \"model\" : [vm_xgb_model,vm_log_model]\n", - " },\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Implement a custom test\n", - "\n", - "Let's finish up testing by implementing a custom *inline test* that outputs a FICO score-type score. 
An inline test refers to a test written and executed within the same environment as the code being tested — in this case, right in this Jupyter Notebook — without requiring a separate test file or framework.\n", - "\n", - "The [`@vm.test` wrapper](https://docs.validmind.ai/validmind/validmind.html#test) allows you to create a reusable test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "import plotly.graph_objects as go\n", - "\n", - "@vm.test(\"my_custom_tests.ScoreToOdds\")\n", - "def score_to_odds_analysis(dataset, score_column='score', score_bands=[410, 440, 470]):\n", - " \"\"\"\n", - " Analyzes the relationship between score bands and odds (good:bad ratio).\n", - " Good odds = (1 - default_rate) / default_rate\n", - " \n", - " Higher scores should correspond to higher odds of being good.\n", - "\n", - " If there are multiple scores provided through score_column, this means that there are two different models and the scores reflect each model\n", - "\n", - " If there are more scores provided in the score_column then focus the assessment on the differences between the two scores and indicate through evidence which one is preferred.\n", - " \"\"\"\n", - " df = dataset.df\n", - " \n", - " # Create score bands\n", - " df['score_band'] = pd.cut(\n", - " df[score_column],\n", - " bins=[-np.inf] + score_bands + [np.inf],\n", - " labels=[f'<{score_bands[0]}'] + \n", - " [f'{score_bands[i]}-{score_bands[i+1]}' for i in range(len(score_bands)-1)] +\n", - " [f'>{score_bands[-1]}']\n", - " )\n", - " \n", - " # Calculate metrics per band\n", - " results = df.groupby('score_band').agg({\n", - " dataset.target_column: ['mean', 'count']\n", - " })\n", - " \n", - " results.columns = ['Default Rate', 'Total']\n", - " results['Good Count'] = results['Total'] - (results['Default Rate'] * results['Total'])\n", - " results['Bad Count'] = results['Default Rate'] 
* results['Total']\n", - " results['Odds'] = results['Good Count'] / results['Bad Count']\n", - " \n", - " # Create visualization\n", - " fig = go.Figure()\n", - " \n", - " # Add odds bars\n", - " fig.add_trace(go.Bar(\n", - " name='Odds (Good:Bad)',\n", - " x=results.index,\n", - " y=results['Odds'],\n", - " marker_color='blue'\n", - " ))\n", - " \n", - " fig.update_layout(\n", - " title='Score-to-Odds Analysis',\n", - " yaxis=dict(title='Odds Ratio (Good:Bad)'),\n", - " showlegend=False\n", - " )\n", - " \n", - " return fig" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "With the custom test available, run and log the test for our champion and challenger models with our testing dataset (`vm_test_ds`):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"my_custom_tests.ScoreToOdds:Champion_vs_Challenger\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - " param_grid={\n", - " \"score_column\": [\"xgb_scores\",\"log_scores\"],\n", - " \"score_bands\": [[500, 540, 570]],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about custom tests?</b></span>\n", - "<br></br>\n", - "Refer to our in-depth introduction to custom tests: <a href=\"../../how_to/tests/custom_tests/implement_custom_tests.ipynb\" style=\"color: #DE257E;\"><b>Implement custom tests</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc11__'></a>\n", - "\n", - "## Verify test runs\n", - "\n", - "Our final task is to verify that all the tests provided by the development team were run and reported 
accurately. Note the appended `result_ids` to delineate which dataset we ran the test with for the relevant tests.\n", - "\n", - "Here, we'll specify all the tests we'd like to independently rerun in a dictionary called `test_config`. **Note here that `inputs` and `input_grid` expect the `input_id` of the dataset or model as the value rather than the variable name we specified**:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_config = {\n", - " # Run with the raw dataset\n", - " 'validmind.data_validation.DatasetDescription:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'}\n", - " },\n", - " 'validmind.data_validation.DescriptiveStatistics:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'}\n", - " },\n", - " 'validmind.data_validation.MissingValues:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percentage_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.ClassImbalance:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percent_threshold': 10}\n", - " },\n", - " 'validmind.data_validation.Duplicates:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.HighCardinality:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {\n", - " 'num_threshold': 100,\n", - " 'percent_threshold': 0.1,\n", - " 'threshold_type': 'percent'\n", - " }\n", - " },\n", - " 'validmind.data_validation.Skewness:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'max_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.UniqueRows:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percent_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.TooManyZeroValues:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': 
{'max_percent_threshold': 0.03}\n", - " },\n", - " 'validmind.data_validation.IQROutliersTable:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'threshold': 5}\n", - " },\n", - " # Run with the preprocessed dataset\n", - " 'validmind.data_validation.DescriptiveStatistics:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TabularDescriptionTables:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.MissingValues:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'},\n", - " 'params': {'min_percentage_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.TabularNumericalHistograms:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TargetRateBarPlots:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'},\n", - " 'params': {'default_column': 'loan_status'}\n", - " },\n", - " # Run with the training and test datasets\n", - " 'validmind.data_validation.DescriptiveStatistics:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.TabularDescriptionTables:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.ClassImbalance:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_percent_threshold': 10}\n", - " },\n", - " 'validmind.data_validation.UniqueRows:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_percent_threshold': 1}\n", - " },\n", - " 
'validmind.data_validation.TabularNumericalHistograms:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.MutualInformation:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_threshold': 0.01}\n", - " },\n", - " 'validmind.data_validation.PearsonCorrelationMatrix:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.HighPearsonCorrelation:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'max_threshold': 0.3, 'top_n_correlations': 10}\n", - " },\n", - " 'validmind.model_validation.ModelMetadata': {\n", - " 'input_grid': {'model': ['xgb_model_developer_champion', 'rf_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.ModelParameters': {\n", - " 'input_grid': {'model': ['xgb_model_developer_champion', 'rf_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.ROCCurve': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model_developer_champion']}\n", - " },\n", - " 'validmind.model_validation.sklearn.MinimumROCAUCScore': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model_developer_champion']},\n", - " 'params': {'min_threshold': 0.5}\n", - " }\n", - "}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Then batch run and log our tests in `test_config`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for t in test_config:\n", - " print(t)\n", - " try:\n", - " # Check if test has input_grid\n", - " if 'input_grid' in test_config[t]:\n", - " # For tests with input_grid, pass the input_grid configuration\n", - " if 'params' in test_config[t]:\n", - " vm.tests.run_test(t, 
input_grid=test_config[t]['input_grid'], params=test_config[t]['params']).log()\n", - " else:\n", - " vm.tests.run_test(t, input_grid=test_config[t]['input_grid']).log()\n", - " else:\n", - " # Original logic for regular inputs\n", - " if 'params' in test_config[t]:\n", - " vm.tests.run_test(t, inputs=test_config[t]['inputs'], params=test_config[t]['params']).log()\n", - " else:\n", - " vm.tests.run_test(t, inputs=test_config[t]['inputs']).log()\n", - " except Exception as e:\n", - " print(f\"Error running test {t}: {str(e)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc12__'></a>\n", - "\n", - "## Next steps" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc12_1__'></a>\n", - "\n", - "### Work with your validation report\n", - "\n", - "Now that you've logged all your test results and verified the work done by the development team, head to the ValidMind Platform to wrap up your validation report:\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", - "\n", - "2. In the left sidebar that appears for your model, click **Validation** under Documents.\n", - "\n", - "Include your logged test results as evidence, create risk assessment notes, add artifacts, and assess compliance, then submit your report for review when it's ready. 
(**Learn more:** [Preparing validation reports](https://docs.validmind.ai/guide/validation/preparing-validation-reports.html))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc12_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc13__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - 
"You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-7c52ad62bcf7411eaaa00aefbac6c756", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Validate an application scorecard model\n", + "\n", + "Learn how to independently assess an application scorecard model developed using the ValidMind Library as a validator. 
You'll evaluate the development of the model by conducting thorough testing and analysis, including the use of challenger models to benchmark performance.\n", + "\n", + "An *application scorecard model* is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant such as credit history, income, employment status, and other relevant financial data.\n", + "\n", + " - This score assists lenders in making informed decisions about whether to approve or reject loan applications, as well as in determining the terms of the loan, including interest rates and credit limits.\n", + " - Effective validation of application scorecard models ensures that lenders can manage risk efficiently while maintaining a fast and transparent loan application process for applicants.\n", + "\n", + "This interactive notebook provides a step-by-step guide for:\n", + "\n", + "- Verifying the data quality steps performed by the development team\n", + "- Independently replicating the champion's results and conducting additional tests to assess performance, stability, and robustness\n", + "- Setting up test inputs and challenger models for comparative analysis\n", + "- Running validation tests, analyzing results, and logging artifacts (findings) to ValidMind" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Register a sample model](#toc2_1__) \n", + " - [Assign validator credentials](#toc2_1_1__) \n", + " - [Apply validation report template](#toc2_1_2__) \n", + " - [Install the ValidMind Library](#toc2_2__) \n", + " - [Initialize the ValidMind Library](#toc2_3__) \n", + " - [Get your code 
snippet](#toc2_3_1__) \n", + " - [Importing the champion model](#toc2_4__) \n", + " - [Load the sample dataset](#toc2_5__) \n", + " - [Preprocess the dataset](#toc2_5_1__) \n", + " - [Apply feature engineering to the dataset](#toc2_5_2__) \n", + " - [Split the feature engineered dataset](#toc2_6__) \n", + "- [Developing potential challenger models](#toc3__) \n", + " - [Train potential challenger models](#toc3_1__) \n", + " - [Random forest classification model](#toc3_1_1__) \n", + " - [Logistic regression model](#toc3_1_2__) \n", + " - [Extract predicted probabilities](#toc3_2__) \n", + " - [Compute binary predictions](#toc3_2_1__) \n", + "- [Initializing the ValidMind objects](#toc4__) \n", + " - [Initialize the ValidMind datasets](#toc4_1__) \n", + " - [Initialize the ValidMind models](#toc4_2__) \n", + " - [Assign predictions](#toc4_3__) \n", + " - [Compute credit risk scores](#toc4_4__) \n", + "- [Running data quality tests](#toc5__) \n", + " - [Identify relevant data quality tests](#toc5_1__) \n", + " - [Run and log an individual data quality test](#toc5_2__) \n", + " - [Log multiple data quality tests](#toc5_3__) \n", + " - [Run data quality comparison tests](#toc5_4__) \n", + "- [Running performance tests](#toc6__) \n", + " - [Identify relevant performance tests](#toc6_1__) \n", + " - [Run and log an individual performance test](#toc6_2__) \n", + " - [Log multiple performance tests](#toc6_3__) \n", + " - [Evaluate performance of the champion model](#toc6_4__) \n", + " - [Evaluate performance of challenger models](#toc6_5__) \n", + " - [Enable custom context for test descriptions](#toc6_5_1__) \n", + " - [Run performance comparison tests](#toc6_5_2__) \n", + "- [Adjust a ValidMind test](#toc7__) \n", + "- [Run diagnostic tests](#toc8__) \n", + "- [Run feature importance tests](#toc9__) \n", + "- [Implement a custom test](#toc10__) \n", + "- [Verify test runs](#toc11__) \n", + "- [Next steps](#toc12__) \n", + " - [Work with your validation report](#toc12_1__) 
\n", + " - [Discover more learning resources](#toc12_2__) \n", + "- [Upgrade ValidMind](#toc13__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about validating records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**validation report**: A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards, encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). 
By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", + "\n", + "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Register a sample model\n", + "\n", + "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. 
(**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", + "\n", + "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1_1__'></a>\n", + "\n", + "#### Assign validator credentials\n", + "\n", + "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", + "\n", + "1. Remove yourself as an owner:\n", + "\n", + " - Click on the **OWNERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "2. Remove yourself as a developer:\n", + "\n", + " - Click on the **DEVELOPERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "3. 
Add yourself as a validator:\n", + "\n", + " - Click on the **VALIDATORS** tile.\n", + " - Select your name from the drop-down menu.\n", + " - Click **Save** to apply your changes to that role." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1_2__'></a>\n", + "\n", + "#### Apply validation report template\n", + "\n", + "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", + "\n", + " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", + "\n", + "3. Click **Use Template** to apply the template." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3_1__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"validation-report\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Importing the champion model\n", + "\n", + "With the ValidMind Library set up and ready to go, let's go ahead and import the champion submitted by the development team in the format of a `.pkl` file: **[xgb_model_champion.pkl](xgb_model_champion.pkl)**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "\n", + "# Load the saved model\n", + "xgb_model = xgb.XGBClassifier()\n", + "xgb_model.load_model(\"xgb_model_champion.pkl\")\n", + "xgb_model" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Ensure that we have the appropriate order of feature names from the champion model and dataset\n", + "cols_when_model_builds = xgb_model.get_booster().feature_names" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_5__'></a>\n", + "\n", + "### Load the sample dataset\n", + "\n", + "Let's next import the public [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) dataset from Kaggle, which was used to develop the dummy champion model.\n", + "\n", + "- 
We'll use this dataset to review steps that should have been conducted during the initial development and documentation of the model to ensure that the model was built correctly.\n", + "- By independently performing steps such as preprocessing and feature engineering, we can confirm whether the model was built using appropriate and properly processed data.\n", + "\n", + "To be able to use the dataset, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.credit_risk import lending_club\n", + "\n", + "df = lending_club.load_data(source=\"offline\")\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_5_1__'></a>\n", + "\n", + "#### Preprocess the dataset\n", + "\n", + "We'll first quickly preprocess the dataset for data quality testing purposes using `lending_club.preprocess`. 
This function performs the following operations:\n", + "\n", + "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", + "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", + "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", + "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = lending_club.preprocess(df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_5_2__'></a>\n", + "\n", + "#### Apply feature engineering to the dataset\n", + "\n", + "Feature engineering improves the dataset's structure to better match what our model expects, and ensures that the model performs optimally by leveraging additional insights from raw data.\n", + "\n", + "We'll apply the following transformations using the `lending_club.feature_engineering()` function to optimize the dataset for predictive modeling in our application scorecard:\n", + "\n", + "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It is calculated as the natural logarithm of the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", + "- **Integration of WoE bins**: Ensures that the WoE-transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation. 
This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "fe_df = lending_club.feature_engineering(preprocess_df)\n", + "fe_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_6__'></a>\n", + "\n", + "### Split the feature engineered dataset\n", + "\n", + "With our dummy model imported and our independently preprocessed and feature engineered dataset ready to go, let's now **split our dataset into training and testing sets** to start the validation testing process.\n", + "\n", + "Splitting our dataset into training and testing sets is essential for proper validation testing, as this helps assess how well the model generalizes to unseen data:\n", + "\n", + "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`).\n", + "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the data\n", + "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", + "\n", + "x_train = train_df.drop(lending_club.target_column, axis=1)\n", + "y_train = train_df[lending_club.target_column]\n", + "\n", + "x_test = test_df.drop(lending_club.target_column, axis=1)\n", + "y_test = test_df[lending_club.target_column]\n", + "\n", + "# Now let's apply the order of features from the champion model construction\n", + "x_train = x_train[cols_when_model_builds]\n", + "x_test = x_test[cols_when_model_builds]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "cols_use = ['annual_inc_woe',\n", + " 'verification_status_woe',\n", + " 'emp_length_woe',\n", + " 'installment_woe',\n", + " 'term_woe',\n", + " 'home_ownership_woe',\n", + " 'purpose_woe',\n", + " 'open_acc_woe',\n", + " 'total_acc_woe',\n", + " 'int_rate_woe',\n", + " 'sub_grade_woe',\n", + " 'grade_woe','loan_status']\n", + "\n", + "\n", + "train_df = train_df[cols_use]\n", + "test_df = test_df[cols_use]\n", + "test_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Developing potential challenger models" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Train potential challenger models\n", + "\n", + "We're curious how alternate models compare to our champion model, so let's train two challenger models as basis for our testing.\n", + "\n", + "Our selected options below offer decreased complexity in terms of implementation — such as lessened manual preprocessing — which can reduce the amount of risk for implementation. 
However, model risk is not calculated in isolation from a single factor, but rather in consideration with trade-offs in predictive performance, ease of interpretability, and overall alignment with business objectives." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_1__'></a>\n", + "\n", + "#### Random forest classification model\n", + "\n", + "A *random forest classification model* is an ensemble machine learning algorithm that uses multiple decision trees to classify data. In ensemble learning, multiple models are combined to improve prediction accuracy and robustness.\n", + "\n", + "Random forest classification models generally have higher accuracy because they capture complex, non-linear relationships, but as a result they lack transparency in their predictions." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the Random Forest Classification model\n", + "from sklearn.ensemble import RandomForestClassifier\n", + "\n", + "# Create the model instance with 50 decision trees\n", + "rf_model = RandomForestClassifier(\n", + " n_estimators=50,\n", + " random_state=42,\n", + ")\n", + "\n", + "# Train the model\n", + "rf_model.fit(x_train, y_train)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_2__'></a>\n", + "\n", + "#### Logistic regression model\n", + "\n", + "A *logistic regression model* is a statistical machine learning algorithm that uses a linear equation (straight-line relationship between variables) and the logistic function (or sigmoid function, which maps any real-valued number to a range between `0` and `1`) to classify data. 
In statistical modeling, a single equation is used to estimate the probability of an outcome based on input features.\n", + "\n", + "Logistic regression models are simple and interpretable because they provide clear probability estimates and feature coefficients (numerical value that represents the influence of a particular input feature on the model's prediction), but they may struggle with capturing complex, non-linear relationships in the data." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the Logistic Regression model\n", + "from sklearn.linear_model import LogisticRegression\n", + "\n", + "# Logistic Regression grid params\n", + "log_reg_params = {\n", + " \"penalty\": [\"l1\", \"l2\"],\n", + " \"C\": [0.001, 0.01, 0.1, 1, 10, 100, 1000],\n", + " \"solver\": [\"liblinear\"],\n", + "}\n", + "\n", + "# Grid search for Logistic Regression\n", + "from sklearn.model_selection import GridSearchCV\n", + "\n", + "grid_log_reg = GridSearchCV(LogisticRegression(), log_reg_params)\n", + "grid_log_reg.fit(x_train, y_train)\n", + "\n", + "# Logistic Regression best estimator\n", + "log_reg = grid_log_reg.best_estimator_\n", + "log_reg" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Extract predicted probabilities\n", + "\n", + "With our challenger models trained, let's extract the predicted probabilities from our three models:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Champion — Application scorecard model\n", + "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", + "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", + "\n", + "# Challenger — Random forest classification model\n", + "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", + "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]\n", + "\n", + "# Challenger — Logistic regression model\n", + "train_log_prob = 
log_reg.predict_proba(x_train)[:, 1]\n", + "test_log_prob = log_reg.predict_proba(x_test)[:, 1]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Compute binary predictions\n", + "\n", + "Next, we'll convert the probability predictions from our three models into a binary, based on a threshold of `0.3`:\n", + "\n", + "- If the probability is greater than `0.3`, the prediction becomes `1` (positive).\n", + "- Otherwise, it becomes `0` (negative)." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "cut_off_threshold = 0.3\n", + "\n", + "# Champion — Application scorecard model\n", + "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", + "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", + "\n", + "# Challenger — Random forest classification model\n", + "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", + "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)\n", + "\n", + "# Challenger — Logistic regression model\n", + "train_log_binary_predictions = (train_log_prob > cut_off_threshold).astype(int)\n", + "test_log_binary_predictions = (test_log_prob > cut_off_threshold).astype(int)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Initializing the ValidMind objects" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you'll need to connect your data with a ValidMind `Dataset` object. 
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the raw dataset\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "# Initialize the preprocessed dataset\n", + "vm_preprocess_dataset = vm.init_dataset(\n", + " dataset=preprocess_df,\n", + " input_id=\"preprocess_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "# Initialize the feature engineered dataset\n", + "vm_fe_dataset = vm.init_dataset(\n", + " dataset=fe_df,\n", + " input_id=\"fe_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "# Initialize the training dataset\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "# Initialize the test dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, 
+ "source": [ + "After initialization, you can pass the ValidMind `Dataset` objects `vm_raw_dataset`, `vm_preprocess_dataset`, `vm_fe_dataset`, `vm_train_ds`, and `vm_test_ds` into any ValidMind tests." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Initialize the ValidMind models\n", + "\n", + "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other functions for analysis and tests on the data for each of our three models.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model objects with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the champion application scorecard model\n", + "vm_xgb_model = vm.init_model(\n", + " xgb_model,\n", + " input_id=\"xgb_model_developer_champion\",\n", + ")\n", + "\n", + "# Initialize the challenger random forest classification model\n", + "vm_rf_model = vm.init_model(\n", + " rf_model,\n", + " input_id=\"rf_model\",\n", + ")\n", + "\n", + "# Initialize the challenger logistic regression model\n", + "vm_log_model = vm.init_model(\n", + " log_reg,\n", + " input_id=\"log_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "With our models registered, we'll move on to assigning both 
the predictive probabilities coming directly from each model's predictions, and the binary prediction after applying the cutoff threshold described in the Compute binary predictions step above.\n", + "\n", + "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset.assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", + "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Champion — Application scorecard model\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=train_xgb_binary_predictions,\n", + " prediction_probabilities=train_xgb_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=test_xgb_binary_predictions,\n", + " prediction_probabilities=test_xgb_prob,\n", + ")\n", + "\n", + "# Challenger — Random forest classification model\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=train_rf_binary_predictions,\n", + " prediction_probabilities=train_rf_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=test_rf_binary_predictions,\n", + " prediction_probabilities=test_rf_prob,\n", + ")\n", + "\n", + "\n", + "# Challenger — Logistic regression model\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_log_model,\n", + " prediction_values=train_log_binary_predictions,\n", + " prediction_probabilities=train_log_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_log_model,\n", + " prediction_values=test_log_binary_predictions,\n", + " prediction_probabilities=test_log_prob,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": 
[ + "<a id='toc4_4__'></a>\n", + "\n", + "### Compute credit risk scores\n", + "\n", + "Finally, we'll translate model predictions into actionable scores using probability estimates generated by our trained model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Compute the scores\n", + "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", + "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", + "train_rf_scores = lending_club.compute_scores(train_rf_prob)\n", + "test_rf_scores = lending_club.compute_scores(test_rf_prob)\n", + "train_log_scores = lending_club.compute_scores(train_log_prob)\n", + "test_log_scores = lending_club.compute_scores(test_log_prob)\n", + "\n", + "# Assign scores to the datasets\n", + "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", + "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)\n", + "vm_train_ds.add_extra_column(\"rf_scores\", train_rf_scores)\n", + "vm_test_ds.add_extra_column(\"rf_scores\", test_rf_scores)\n", + "vm_train_ds.add_extra_column(\"log_scores\", train_log_scores)\n", + "vm_test_ds.add_extra_column(\"log_scores\", test_log_scores)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Running data quality tests\n", + "\n", + "With everything ready to go, let's explore some of ValidMind's available tests. Using ValidMind’s repository of tests streamlines your validation testing, and helps you ensure that your records are being validated appropriately." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Identify relevant data quality tests\n", + "\n", + "We want to narrow down the tests we want to run from the selection provided by ValidMind, so we'll use the [`vm.tests.list_tasks_and_tags()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) to list which `tags` are associated with each `task` type:\n", + "\n", + "- **`tasks`** represent the kind of modeling task associated with a test. Here we'll focus on `classification` tasks.\n", + "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `data_quality` tag." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tasks_and_tags()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then we'll call [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to list all the data quality tests for classification:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(\n", + " tags=[\"data_quality\"], task=\"classification\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about navigating ValidMind tests?</b></span>\n", + "<br></br>\n", + "Refer to our notebook outlining the utilities available for viewing and understanding available ValidMind tests: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" 
style=\"color: #DE257E;\"><b>Explore tests</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Run and log an individual data quality test\n", + "\n", + "Next, we'll use our previously initialized preprocessed dataset (`vm_preprocess_dataset`) as input to run an individual test, then log the result to the ValidMind Platform.\n", + "\n", + "- You run validation tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module.\n", + "- Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", + "\n", + "Here, we'll use the [`HighPearsonCorrelation` test](https://docs.validmind.ai/tests/data_validation/HighPearsonCorrelation.html) as an example:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.HighPearsonCorrelation\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the output returned indicating that a test-driven block doesn't currently exist in your documentation for some test IDs. </b></span>\n", + "<br></br>\n", + "That's expected: when we run validation tests, the logged results need to be manually added to your report as part of your compliance assessment process within the ValidMind Platform. 
You'll continue to see this message throughout this notebook as we run and log more tests.</div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_3__'></a>\n", + "\n", + "### Log multiple data quality tests\n", + "\n", + "Now that we understand how to run a test with ValidMind, we want to run all the tests that were returned for our `classification` tasks focusing on `data_quality`.\n", + "\n", + "We'll store the identified tests in `dq` in preparation for batch running these tests and logging their results to the ValidMind Platform:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "dq = vm.tests.list_tests(tags=[\"data_quality\"], task=\"classification\",pretty=False)\n", + "dq" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With our data quality tests stored, let's run our first batch of tests using the same preprocessed dataset (`vm_preprocess_dataset`) and log their results." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for test in dq:\n", + " vm.tests.run_test(\n", + " test,\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " }\n", + " ).log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_4__'></a>\n", + "\n", + "### Run data quality comparison tests\n", + "\n", + "Next, let's reuse the tests in `dq` to perform comparison tests between the raw (`vm_raw_dataset`) and preprocessed (`vm_preprocess_dataset`) dataset, again logging the results to the ValidMind Platform:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for test in dq:\n", + " vm.tests.run_test(\n", + " test,\n", + " input_grid={\n", + " \"dataset\": [vm_raw_dataset,vm_preprocess_dataset]\n", + " }\n", + " ).log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Running performance tests\n", + "\n", + "We'll also run some performance tests, beginning with independent testing of our champion application scorecard model, then moving on to our potential challenger models." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Identify relevant performance tests\n", + "\n", + "Use `vm.tests.list_tests()` to this time identify all the model performance tests for classification:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "vm.tests.list_tests(tags=[\"model_performance\"], task=\"classification\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Run and log an individual performance test\n", + "\n", + "Before we run our batch of performance tests, we'll use our previously initialized testing dataset (`vm_test_ds`) as input to run an individual test, then log the result to the ValidMind Platform.\n", + "\n", + "When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
We'll append an identifier for our champion model here (`xgboost_champion`):\n", + "\n", + "Here, we'll use the [`ClassifierPerformance` test](https://docs.validmind.ai/tests/model_validation/sklearn/ClassifierPerformance.html) as an example:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds, \"model\" : vm_xgb_model\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Log multiple performance tests\n", + "\n", + "We only want to run a few other tests that were returned for our `classification` tasks focusing on `model_performance`, so we'll isolate the specific tests we want to batch run in `mpt`:\n", + "\n", + "- `ClassifierPerformance`\n", + "- [`ConfusionMatrix`](https://docs.validmind.ai/tests/model_validation/sklearn/ConfusionMatrix.html)\n", + "- [`MinimumAccuracy`](https://docs.validmind.ai/tests/model_validation/sklearn/MinimumAccuracy.html)\n", + "- [`MinimumF1Score`](https://docs.validmind.ai/tests/model_validation/sklearn/MinimumF1Score.html)\n", + "- [`ROCCurve`](https://docs.validmind.ai/tests/model_validation/sklearn/ROCCurve.html)\n", + "\n", + "Note the custom `result_id`s appended to the `test_id`s for our champion model (`xgboost_champion`):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "mpt = [\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.ROCCurve:xgboost_champion\"\n", + "]" + ], + 
"execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4__'></a>\n", + "\n", + "### Evaluate performance of the champion model\n", + "\n", + "Now, let's run and log our batch of model performance tests using our testing dataset (`vm_test_ds`) for our champion model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for test in mpt:\n", + " vm.tests.run_test(\n", + " test,\n", + " inputs={\n", + " \"dataset\": vm_test_ds, \"model\" : vm_xgb_model\n", + " },\n", + " ).log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_5__'></a>\n", + "\n", + "### Evaluate performance of challenger models\n", + "\n", + "We've now conducted similar tests as the development team for our champion, with the aim of verifying their test results.\n", + "\n", + "Next, let's see how our challenger models compare. We'll use the same batch of tests here as we did in `mpt`, but append a different `result_id` to indicate that these results should be associated with our challenger models:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "mpt_chall = [\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion_vs_challengers\",\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion_vs_challengers\",\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy:xgboost_champion_vs_challengers\",\n", + " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion_vs_challengers\",\n", + " \"validmind.model_validation.sklearn.ROCCurve:xgboost_champion_vs_challengers\"\n", + "]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_5_1__'></a>\n", + "\n", + "#### Enable custom context for test descriptions" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": 
[ + "When you run ValidMind tests, test descriptions are automatically generated by an LLM using the test results, the test name, and the static test definitions provided in the test’s docstring. While this metadata offers valuable high-level overviews of tests, insights produced by the LLM-based descriptions may not always align with your specific use cases or incorporate organizational policy requirements.\n", + "\n", + "Before we run our next batch of tests, we'll include some custom use case context to focus on comparison testing going forward, improving the relevance, insight, and format of the test descriptions returned. By default, custom context for LLM-generated descriptions is disabled, meaning that the output will not include any additional context. To enable custom use case context, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`.\n", + "\n", + "This is a global setting that will affect all tests for your linked model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Enabling use case context allows you to pass in additional context to the LLM-generated text descriptions within `context`:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", + "\n", + "context = \"\"\"\n", + "FORMAT FOR THE LLM DESCRIPTIONS: \n", + " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", + " extracted from the test description>.\n", + "\n", + " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. 
\n", + " Include any relevant formulas or methodologies mentioned in the test description.>\n", + "\n", + " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", + " highlighting what makes it particularly useful for specific scenarios.>\n", + "\n", + " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", + " Include both technical limitations and interpretation challenges. \n", + " If the test description includes specific signs of high risk, incorporate these here.>\n", + "\n", + " **Key Insights:**\n", + "\n", + " The test results reveal:\n", + "\n", + " - **<insight title>**: <comprehensive description of one aspect of the results>\n", + " - **<insight title>**: <comprehensive description of another aspect>\n", + " ...\n", + "\n", + " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", + " purpose and provides any final recommendations or considerations.>\n", + "\n", + "ADDITIONAL INSTRUCTIONS:\n", + "\n", + " The champion model as the basis for comparison is called \"xgb_model_developer_champion\" and emphasis should be on the following:\n", + " - The metrics for the champion model compared against the challenger models\n", + " - Which model potentially outperforms the champion model based on the metrics, this should be highlighted and emphasized\n", + "\n", + "\n", + " For each metric in the test results, include in the test overview:\n", + " - The metric's purpose and what it measures\n", + " - Its mathematical formula\n", + " - The range of possible values\n", + " - What constitutes good/bad performance\n", + " - How to interpret different values\n", + "\n", + " Each insight should progressively cover:\n", + " 1. Overall scope and distribution\n", + " 2. Complete breakdown of all elements with specific values\n", + " 3. Natural groupings and patterns\n", + " 4. 
Comparative analysis between datasets/categories\n", + " 5. Stability and variations\n", + " 6. Notable relationships or dependencies\n", + "\n", + " Remember:\n", + " - Champion model (xgb_model_developer_champion) is the selection and challenger models are used to challenge the selection\n", + " - Keep all insights at the same level (no sub-bullets or nested structures)\n", + " - Make each insight complete and self-contained\n", + " - Include specific numerical values and ranges\n", + " - Cover all elements in the results comprehensively\n", + " - Maintain clear, concise language\n", + " - Use only \"- **Title**: Description\" format for insights\n", + " - Progress naturally from general to specific observations\n", + "\n", + "\"\"\".strip()\n", + "\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about setting custom context for LLM-generated test descriptions?</b></span>\n", + "<br></br>\n", + "Refer to our extended walkthrough notebook: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/customize_test_result_descriptions.html\" style=\"color: #DE257E;\"><b>Add context to LLM-generated test descriptions\n", + "</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_5_2__'></a>\n", + "\n", + "#### Run performance comparison tests\n", + "\n", + "With the use case context set, we'll run each test in `mpt_chall` once for each model with the same `vm_test_ds` dataset to compare them:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for test in mpt_chall:\n", + " 
vm.tests.run_test(\n", + " test,\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds], \"model\" : [vm_xgb_model,vm_log_model,vm_rf_model]\n", + " }\n", + " ).log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Based on the performance metrics, we can conclude that the random forest classification model is not a viable candidate for our use case and can be disregarded in our tests going forward.</b></span>\n", + "<br></br>\n", + "In the next section, we'll dive a bit deeper into some tests comparing our champion application scorecard model and our remaining challenger logistic regression model, including tests that will allow us to customize parameters and thresholds for performance standards.</div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Adjust a ValidMind test\n", + "\n", + "Let's dig deeper into the `MinimumF1Score` test we ran previously in Run performance tests to ensure that the models maintain a minimum acceptable balance between *precision* and *recall*. 
Precision refers to how many of the model's positive predictions were actually correct, and recall refers to how many of the actual positive cases the model correctly identified.\n", + "\n", + "Use `run_test()` with our testing dataset (`vm_test_ds`) to run the test in isolation again for our two remaining models, without logging the result, so that we have a baseline output to compare with a subsequent iteration:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion_vs_challengers\",\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds],\n", + " \"model\": [vm_xgb_model, vm_log_model]\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As `MinimumF1Score` allows us to customize parameters and thresholds for performance standards, let's adjust the threshold to see whether it changes the test results:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumF1Score:AdjThreshold\",\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds],\n", + " \"model\": [vm_xgb_model, vm_log_model],\n", + " },\n", + " params={\"min_threshold\": 0.35},\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Run diagnostic tests\n", + "\n", + "Next, we want to compare the robustness and stability of our champion and challenger models.\n", + "\n", + "Use `list_tests()` to list all available diagnosis tests applicable to classification tasks:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(tags=[\"model_diagnosis\"], task=\"classification\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": 
"markdown", + "metadata": {}, + "source": [ + "Let's check whether the models show signs of *overfitting*, and identify any sub-segments where issues are concentrated, with the [`OverfitDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/OverfitDiagnosis.html). \n", + "\n", + "Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but also noise and random fluctuations, resulting in excellent performance on the training dataset but poor generalization to new, unseen data." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.model_validation.sklearn.OverfitDiagnosis:Champion_vs_LogRegression\",\n", + " input_grid={\n", + " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", + " \"model\" : [vm_xgb_model,vm_log_model]\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's also conduct *robustness* and *stability* testing of the two models with the [`RobustnessDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/RobustnessDiagnosis.html).\n", + "\n", + "Robustness refers to a model's ability to maintain consistent performance when its input data is perturbed or noisy, and stability refers to a model's ability to produce consistent outputs over time and across different data subsets." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.model_validation.sklearn.RobustnessDiagnosis:Champion_vs_LogRegression\",\n", + " input_grid={\n", + " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", + " \"model\" : [vm_xgb_model,vm_log_model]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Run feature importance tests\n", + "\n", + "We also want to verify the relative influence of different input features on our models' predictions, as well as inspect the differences between our champion and challenger model to see if a certain model offers more understandable or logical importance scores for features.\n", + "\n", + "Use `list_tests()` to identify all the feature importance tests for classification:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Store the feature importance tests\n", + "FI = vm.tests.list_tests(tags=[\"feature_importance\"], task=\"classification\",pretty=False)\n", + "FI" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Run and log our feature importance tests for both models for the testing dataset\n", + "for test in FI:\n", + " vm.tests.run_test(\n", + " \"\".join((test,':Champion_vs_LogisticRegression')),\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds], \"model\" : [vm_xgb_model,vm_log_model]\n", + " },\n", + " ).log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Implement a custom test\n", + "\n", + "Let's finish up testing by implementing a custom *inline test* that outputs a FICO score-type score. 
An inline test refers to a test written and executed within the same environment as the code being tested — in this case, right in this Jupyter Notebook — without requiring a separate test file or framework.\n", + "\n", + "The [`@vm.test` wrapper](https://docs.validmind.ai/validmind/validmind.html#test) allows you to create a reusable test:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "import plotly.graph_objects as go\n", + "\n", + "@vm.test(\"my_custom_tests.ScoreToOdds\")\n", + "def score_to_odds_analysis(dataset, score_column='score', score_bands=[410, 440, 470]):\n", + " \"\"\"\n", + " Analyzes the relationship between score bands and odds (good:bad ratio).\n", + " Good odds = (1 - default_rate) / default_rate\n", + " \n", + " Higher scores should correspond to higher odds of being good.\n", + "\n", + " If there are multiple scores provided through score_column, this means that there are two different models and the scores reflect each model\n", + "\n", + " If there are more scores provided in the score_column then focus the assessment on the differences between the two scores and indicate through evidence which one is preferred.\n", + " \"\"\"\n", + " df = dataset.df\n", + " \n", + " # Create score bands\n", + " df['score_band'] = pd.cut(\n", + " df[score_column],\n", + " bins=[-np.inf] + score_bands + [np.inf],\n", + " labels=[f'<{score_bands[0]}'] + \n", + " [f'{score_bands[i]}-{score_bands[i+1]}' for i in range(len(score_bands)-1)] +\n", + " [f'>{score_bands[-1]}']\n", + " )\n", + " \n", + " # Calculate metrics per band\n", + " results = df.groupby('score_band').agg({\n", + " dataset.target_column: ['mean', 'count']\n", + " })\n", + " \n", + " results.columns = ['Default Rate', 'Total']\n", + " results['Good Count'] = results['Total'] - (results['Default Rate'] * results['Total'])\n", + " results['Bad Count'] = results['Default Rate'] * results['Total']\n", + " results['Odds'] = 
results['Good Count'] / results['Bad Count']\n", + " \n", + " # Create visualization\n", + " fig = go.Figure()\n", + " \n", + " # Add odds bars\n", + " fig.add_trace(go.Bar(\n", + " name='Odds (Good:Bad)',\n", + " x=results.index,\n", + " y=results['Odds'],\n", + " marker_color='blue'\n", + " ))\n", + " \n", + " fig.update_layout(\n", + " title='Score-to-Odds Analysis',\n", + " yaxis=dict(title='Odds Ratio (Good:Bad)'),\n", + " showlegend=False\n", + " )\n", + " \n", + " return fig" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With the custom test available, run and log the test for our champion and challenger models with our testing dataset (`vm_test_ds`):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"my_custom_tests.ScoreToOdds:Champion_vs_Challenger\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + " param_grid={\n", + " \"score_column\": [\"xgb_scores\",\"log_scores\"],\n", + " \"score_bands\": [[500, 540, 570]],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about custom tests?</b></span>\n", + "<br></br>\n", + "Refer to our in-depth introduction to custom tests: <a href=\"../../how_to/tests/custom_tests/implement_custom_tests.ipynb\" style=\"color: #DE257E;\"><b>Implement custom tests</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc11__'></a>\n", + "\n", + "## Verify test runs\n", + "\n", + "Our final task is to verify that all the tests provided by the development team were run and reported 
accurately. Note the appended `result_ids` to delineate which dataset we ran the test with for the relevant tests.\n", + "\n", + "Here, we'll specify all the tests we'd like to independently rerun in a dictionary called `test_config`. **Note here that `inputs` and `input_grid` expect the `input_id` of the dataset or model as the value rather than the variable name we specified**:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_config = {\n", + " # Run with the raw dataset\n", + " 'validmind.data_validation.DatasetDescription:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'}\n", + " },\n", + " 'validmind.data_validation.DescriptiveStatistics:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'}\n", + " },\n", + " 'validmind.data_validation.MissingValues:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percentage_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.ClassImbalance:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percent_threshold': 10}\n", + " },\n", + " 'validmind.data_validation.Duplicates:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.HighCardinality:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {\n", + " 'num_threshold': 100,\n", + " 'percent_threshold': 0.1,\n", + " 'threshold_type': 'percent'\n", + " }\n", + " },\n", + " 'validmind.data_validation.Skewness:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'max_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.UniqueRows:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percent_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TooManyZeroValues:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'max_percent_threshold': 0.03}\n", + " },\n", + " 
'validmind.data_validation.IQROutliersTable:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'threshold': 5}\n", + " },\n", + " # Run with the preprocessed dataset\n", + " 'validmind.data_validation.DescriptiveStatistics:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TabularDescriptionTables:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.MissingValues:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'},\n", + " 'params': {'min_percentage_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TabularNumericalHistograms:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TargetRateBarPlots:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'},\n", + " 'params': {'default_column': 'loan_status'}\n", + " },\n", + " # Run with the training and test datasets\n", + " 'validmind.data_validation.DescriptiveStatistics:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.TabularDescriptionTables:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.ClassImbalance:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_percent_threshold': 10}\n", + " },\n", + " 'validmind.data_validation.UniqueRows:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_percent_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TabularNumericalHistograms:development_data': 
{\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.MutualInformation:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_threshold': 0.01}\n", + " },\n", + " 'validmind.data_validation.PearsonCorrelationMatrix:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.HighPearsonCorrelation:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'max_threshold': 0.3, 'top_n_correlations': 10}\n", + " },\n", + " 'validmind.model_validation.ModelMetadata': {\n", + " 'input_grid': {'model': ['xgb_model_developer_champion', 'rf_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.ModelParameters': {\n", + " 'input_grid': {'model': ['xgb_model_developer_champion', 'rf_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.ROCCurve': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model_developer_champion']}\n", + " },\n", + " 'validmind.model_validation.sklearn.MinimumROCAUCScore': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model_developer_champion']},\n", + " 'params': {'min_threshold': 0.5}\n", + " }\n", + "}" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then batch run and log our tests in `test_config`:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for t in test_config:\n", + " print(t)\n", + " try:\n", + " # Check if test has input_grid\n", + " if 'input_grid' in test_config[t]:\n", + " # For tests with input_grid, pass the input_grid configuration\n", + " if 'params' in test_config[t]:\n", + " vm.tests.run_test(t, input_grid=test_config[t]['input_grid'], params=test_config[t]['params']).log()\n", + " else:\n", + 
" vm.tests.run_test(t, input_grid=test_config[t]['input_grid']).log()\n", + " else:\n", + " # Original logic for regular inputs\n", + " if 'params' in test_config[t]:\n", + " vm.tests.run_test(t, inputs=test_config[t]['inputs'], params=test_config[t]['params']).log()\n", + " else:\n", + " vm.tests.run_test(t, inputs=test_config[t]['inputs']).log()\n", + " except Exception as e:\n", + " print(f\"Error running test {t}: {str(e)}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc12__'></a>\n", + "\n", + "## Next steps" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc12_1__'></a>\n", + "\n", + "### Work with your validation report\n", + "\n", + "Now that you've logged all your test results and verified the work done by the development team, head to the ValidMind Platform to wrap up your validation report:\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", + "\n", + "2. In the left sidebar that appears for your model, click **Validation** under Documents.\n", + "\n", + "Include your logged test results as evidence, create risk assessment notes, add artifacts, and assess compliance, then submit your report for review when it's ready. 
(**Learn more:** [Preparing validation reports](https://docs.validmind.ai/guide/validation/preparing-validation-reports.html))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc12_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc13__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + 
"You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-7c52ad62bcf7411eaaa00aefbac6c756" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } From 229efb630f8e008fd52492191eefff7adf8584bc Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Thu, 21 May 2026 11:18:30 -0700 Subject: [PATCH 03/13] Replace monitoring-framing Key concepts across 3 notebooks Replay of 421200b5 on fresh main. Updates the monitoring-framing Key concepts block to the new record/model/ongoing monitoring report terminology across 3 monitoring notebooks (_about-validmind-monitoring, application_scorecard_ongoing_monitoring, quickstart_customer_churn_ongoing_monitoring). Adds the new `ongoing monitoring report` and `monitoring template, monitoring report template` terms. TOC anchors preserved where present. 
Co-authored-by: Cursor <cursoragent@cursor.com> --- .../_about-validmind-monitoring.ipynb | 160 +- ...ication_scorecard_ongoing_monitoring.ipynb | 2788 +++++++++-------- ...rt_customer_churn_ongoing_monitoring.ipynb | 1818 +++++------ 3 files changed, 2392 insertions(+), 2374 deletions(-) diff --git a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb index 18fa2048c..e604d49db 100644 --- a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb @@ -1,80 +1,86 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "about-intro", - "metadata": {}, - "source": [ - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." 
+ ], + "id": "about-intro" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "about-begin" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "about-signup" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the performance and compliance of a record (such as a model) over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite, specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. 
By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ], + "id": "about-concepts" + } + ], + "metadata": { + "language_info": { + "name": "python" + } }, - { - "cell_type": "markdown", - "id": "about-begin", - "metadata": {}, - "source": [ - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
- ] - }, - { - "cell_type": "markdown", - "id": "about-signup", - "metadata": {}, - "source": [ - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "about-concepts", - "metadata": {}, - "source": [ - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. 
Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Model monitoring report**: A comprehensive and structured record of a production model, including key elements such as data sources, inputs, performance metrics, and periodic evaluations. This documentation ensures transparency and visibility of the model's performance in the production environment.\n", - "\n", - "**Monitoring report template**: Similar to documentation template, The monitoring report template functions as a test suite and lays out the structure of model monitoring, segmented into various sections and sub-sections. Monitoring report templates define the structure of your model monitoring report, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
- ] - } - ], - "metadata": { - "language_info": { - "name": "python" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb index a420430b1..78c8a4da4 100644 --- a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb @@ -1,1393 +1,1399 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Ongoing Monitoring for Application Scorecard\n", - "\n", - "In this notebook, you'll learn how to seamlessly monitor your production models using the ValidMind Platform.\n", - "\n", - "We'll walk you through the process of initializing the ValidMind Library, loading a sample dataset and model, and running a monitoring test suite to quickly generate documentation about your new data and model." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply monitoring report template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the monitoring report template](#toc2_3__) \n", - " - [Initialize the Python environment](#toc2_4__) \n", - " - [Preview the monitoring template](#toc2_5__) \n", - "- [Load the reference and monitoring datasets](#toc3__) \n", - "- [Train the model](#toc4__) \n", - " - [Initialize the ValidMind datasets](#toc4_1__) \n", - " - [Initialize the ValidMind model](#toc4_2__) \n", - " - [Assign prediction values and probabilities to the datasets](#toc4_3__) \n", - " - [Compute credit risk scores](#toc4_4__) \n", - " - [Adding custom context to the LLM descriptions](#toc4_5__) \n", - " - [Monitoring data description](#toc4_6__) \n", - " - [Target and feature drift](#toc4_7__) \n", - " - [Classification accuracy](#toc4_8__) \n", - " - [Class discrimination](#toc4_9__) \n", - " - [Scoring](#toc4_10__) \n", - " - [Model insights](#toc4_11__) \n", - " - [Diagnostic monitoring](#toc4_12__) \n", - " - [Robustness monitoring](#toc4_13__) \n", - " - [Performance history](#toc4_14__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation, validation, and monitoring tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Model monitoring report**: A comprehensive and structured record of a production model, including key elements such as data sources, inputs, performance metrics, and periodic evaluations. 
This documentation ensures transparency and visibility of the model's performance in the production environment.\n", - "\n", - "**Monitoring report template**: Similar to documentation template, The monitoring report template functions as a test suite and lays out the structure of model monitoring, segmented into various sections and sub-sections. Monitoring report templates define the structure of your model monitoring report, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. 
See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. 
Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply monitoring report template\n", - "\n", - "Once you've registered your model, let's select a monitoring report template. A template predefines sections for your monitoring report and provides a general outline to follow, making the monitoring process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Monitoring**.\n", - "\n", - " If you cannot locate your Monitoring document, make sure Monitoring type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Ongoing Monitoring for Classification Models`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Monitoring` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"monitoring\",\n", - " monitoring = True,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the monitoring report template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "import numpy as np\n", - "\n", - "from datetime import datetime, timedelta\n", - "\n", - "from validmind.tests import run_test\n", - "from validmind.datasets.credit_risk import lending_club\n", - "from validmind.unit_metrics import list_metrics\n", - "from validmind.unit_metrics import describe_metric\n", - "from validmind.unit_metrics import run_metric\n", - "from validmind.api_client import log_metric\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_5__'></a>\n", - "\n", - "### Preview the monitoring template\n", - "\n", - "A template predefines sections for your monitoring documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "You will upload documentation and test results into this template later on. 
For now, take a look at the structure that the template provides with the `vm.preview_template()` function from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the reference and monitoring datasets\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. For demonstration purposes we'll use the training, test dataset splits as `reference` and `monitoring` datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df = lending_club.load_data(source=\"offline\")\n", - "df.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = lending_club.preprocess(df)\n", - "preprocess_df.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fe_df = lending_club.feature_engineering(preprocess_df)\n", - "fe_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Train the model\n", - "\n", - "In this section, we focus on constructing and refining our predictive model. \n", - "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", - "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the data\n", - "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", - "\n", - "x_train = train_df.drop(lending_club.target_column, axis=1)\n", - "y_train = train_df[lending_club.target_column]\n", - "\n", - "x_test = test_df.drop(lending_club.target_column, axis=1)\n", - "y_test = test_df[lending_club.target_column]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define the XGBoost model\n", - "xgb_model = xgb.XGBClassifier(\n", - " n_estimators=50, \n", - " random_state=42, \n", - " early_stopping_rounds=10\n", - ")\n", - "xgb_model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "\n", - "# Fit the model\n", - "xgb_model.fit(\n", - " x_train, \n", - " y_train,\n", - " eval_set=[(x_test, y_test)],\n", - " verbose=False\n", - ")\n", - "\n", - "# Compute probabilities\n", - "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", - "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", - "\n", - "# Compute binary predictions\n", - "cut_off_threshold = 0.3\n", - "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", - "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — The raw dataset that you want to provide as input to tests.\n", - "- `input_id` - A unique identifier that allows tracking what inputs are used 
when running each individual test.\n", - "- `target_column` — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "\n", - "With all datasets ready, you can now initialize training, reference(test) and monitor datasets (`reference_df` and `monitor_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_reference_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"reference_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_monitoring_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"monitoring_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "You will also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_xgb_model = vm.init_model(\n", - 
" xgb_model,\n", - " input_id=\"xgb_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Assign prediction values and probabilities to the datasets\n", - "\n", - "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. \n", - "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", - "- This method links the model's class prediction values and probabilities to our VM train and test datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_reference_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=train_xgb_binary_predictions,\n", - " prediction_probabilities=train_xgb_prob,\n", - ")\n", - "\n", - "vm_monitoring_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=test_xgb_binary_predictions,\n", - " prediction_probabilities=test_xgb_prob,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### Compute credit risk scores\n", - "\n", - "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", - "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", - "\n", - "# Assign scores to the datasets\n", - "vm_reference_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", - "vm_monitoring_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_5__'></a>\n", - "\n", - "### Adding custom context to the LLM descriptions\n", - "\n", - "To enable the LLM descriptions context, you need to set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`. This will enable the LLM descriptions context, which will be used to provide additional context to the LLM descriptions. This is a global setting that will affect all tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", - "\n", - "context = \"\"\"\n", - "FORMAT FOR THE LLM DESCRIPTIONS: \n", - " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", - " extracted from the test description>.\n", - "\n", - " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", - " Include any relevant formulas or methodologies mentioned in the test description.>\n", - "\n", - " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", - " highlighting what makes it particularly useful for specific scenarios.>\n", - "\n", - " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", - " Include both technical limitations and interpretation challenges. 
\n", - " If the test description includes specific signs of high risk, incorporate these here.>\n", - "\n", - " **Key Insights:**\n", - "\n", - " The test results reveal:\n", - "\n", - " - **<insight title>**: <comprehensive description of one aspect of the results>\n", - " - **<insight title>**: <comprehensive description of another aspect>\n", - " ...\n", - "\n", - " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", - " purpose and provides any final recommendations or considerations.>\n", - "\n", - "ADDITIONAL INSTRUCTIONS:\n", - " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", - "\n", - " For each metric in the test results, include in the test overview:\n", - " - The metric's purpose and what it measures\n", - " - Its mathematical formula\n", - " - The range of possible values\n", - " - What constitutes good/bad performance\n", - " - How to interpret different values\n", - "\n", - " Each insight should progressively cover:\n", - " 1. Overall scope and distribution\n", - " 2. Complete breakdown of all elements with specific values\n", - " 3. Natural groupings and patterns\n", - " 4. Comparative analysis between datasets/categories\n", - " 5. Stability and variations\n", - " 6. 
Notable relationships or dependencies\n", - "\n", - " Remember:\n", - " - Keep all insights at the same level (no sub-bullets or nested structures)\n", - " - Make each insight complete and self-contained\n", - " - Include specific numerical values and ranges\n", - " - Cover all elements in the results comprehensively\n", - " - Maintain clear, concise language\n", - " - Use only \"- **Title**: Description\" format for insights\n", - " - Progress naturally from general to specific observations\n", - "\n", - "\"\"\".strip()\n", - "\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_6__'></a>\n", - "\n", - "### Monitoring data description\n", - "\n", - "The Monitoring Data Description tests aim to provide a comprehensive statistical analysis of the monitoring dataset's characteristics. These tests examine the basic statistical properties, identify any missing data patterns, assess data uniqueness, visualize numerical feature distributions, and evaluate feature relationships through correlation analysis.\n", - "\n", - "The primary objective is to establish a baseline understanding of the monitoring data's structure and quality, enabling the detection of any significant deviations from expected patterns that could impact model performance. Each test is designed to capture different aspects of the data, from univariate statistics to multivariate relationships, providing a foundation for ongoing data quality assessment in the production environment." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.MissingValues:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - " params={\n", - " \"min_percentage_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.UniqueRows:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularNumericalHistograms:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.PearsonCorrelationMatrix:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.HighPearsonCorrelation:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - " params={\n", - " \"feature_columns\": vm_monitoring_ds.feature_columns,\n", - " \"max_threshold\": 0.5,\n", - " \"top_n_correlations\": 10\n", - " }\n", - ").log()" - ] - }, - { 
- "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ClassImbalanceDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 1\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_7__'></a>\n", - "\n", - "### Target and feature drift\n", - "\n", - "Next, the goal is to investigate the distributional characteristics of predictions and features to determine if the underlying data has changed. These tests are crucial for assessing the expected accuracy of the model.\n", - "\n", - "1. **Target drift:** We compare the dataset used for testing (reference data) with the monitoring data. This helps to identify any shifts in the target variable distribution.\n", - "2. **Feature drift:** We compare the training dataset with the monitoring data. Since features were used to train the model, any drift in these features could indicate potential issues, as the underlying patterns that the model was trained on may have changed." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we can examine the correlation between features and predictions. Significant changes in these correlations may trigger a deeper assessment." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.TargetPredictionDistributionPlot\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 5\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we want see difference in correlation pairs between model prediction and features." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.PredictionCorrelation\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 5\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally for target drift, let's plot each prediction value and feature grid side by side." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.PredictionQuantilesAcrossFeatures\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, let's add run a test to investigate how or if the features have drifted. 
In this instance we want to compare the training data with prediction data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.FeatureDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"psi_threshold\": 0.2,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_8__'></a>\n", - "\n", - "### Classification accuracy\n", - "\n", - "We now evaluate the model's predictive performance by comparing its behavior between reference and monitoring datasets. These tests analyze shifts in overall accuracy metrics, examine changes in the confusion matrix to identify specific classification pattern changes, and assess the model's probability calibration across different prediction thresholds. \n", - "\n", - "The primary objective is to detect any degradation in the model's classification performance that might indicate reliability issues in production. The tests provide both aggregate performance metrics and detailed breakdowns of prediction patterns, enabling the identification of specific areas where the model's accuracy might be deteriorating." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ClassificationAccuracyDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 5,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ConfusionMatrixDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 5,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.CalibrationCurveDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"n_bins\": 10,\n", - " \"drift_pct_threshold\": 10,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_9__'></a>\n", - "\n", - "### Class discrimination\n", - "\n", - "The following tests assess the model's ability to effectively separate different classes in both reference and monitoring datasets. These tests analyze the model's discriminative power by examining the separation between class distributions, evaluating changes in the ROC curve characteristics, comparing probability distribution patterns, and assessing cumulative prediction trends. \n", - "\n", - "The primary objective is to identify any deterioration in the model's ability to distinguish between classes, which could indicate a decline in model effectiveness. 
The tests examine both the overall discriminative capability and the granular patterns in prediction distributions, providing insights into whether the model maintains its ability to effectively differentiate between classes in the production environment." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ClassDiscriminationDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 5,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ROCCurveDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 10,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_10__'></a>\n", - "\n", - "### Scoring\n", - "\n", - "Next we analyze the distribution and stability of credit scores across reference and monitoring datasets. 
These tests evaluate shifts in score distributions, examine changes in score band populations, and assess the relationship between scores and default rates. \n", - "\n", - "The primary objective is to identify any significant changes in how the model assigns credit scores, which could indicate drift in risk assessment capabilities. The tests examine both the overall score distribution patterns and the specific performance within defined score bands, providing insights into whether the model maintains consistent and reliable risk segmentation." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ScorecardHistogramDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " \"drift_pct_threshold\": 20,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ScoreBandsDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " \"score_bands\": [500, 540, 570],\n", - " \"drift_pct_threshold\": 20,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_11__'></a>\n", - "\n", - "### Model insights" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", - " input_grid={\n", - " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": [vm_xgb_model]\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - 
"source": [ - "run_test(\n", - " \"validmind.model_validation.FeaturesAUC\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model],\n", - " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model],\n", - " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", - " },\n", - " params={\n", - " \"kernel_explainer_samples\": 10,\n", - " \"tree_or_linear_explainer_samples\": 200,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_12__'></a>\n", - "\n", - "### Diagnostic monitoring" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " },\n", - " params={\n", - " \"cut_off_threshold\": 0.04\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_13__'></a>\n", - "\n", - "### Robustness monitoring" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " 
params={\n", - " \"scaling_factor_std_dev_list\": [\n", - " 0.1,\n", - " 0.2,\n", - " 0.3,\n", - " 0.4,\n", - " 0.5\n", - " ],\n", - " \"performance_decay_threshold\": 0.05\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_14__'></a>\n", - "\n", - "### Performance history\n", - "\n", - "In this section we showcase how to track and visualize the temporal evolution of key model performance metrics, including AUC, F1 score, precision, recall, and accuracy. For demonstration purposes, the section simulates historical performance data by introducing a gradual downward trend and random noise to these metrics over a specified time period. These tests are useful for analyzing the stability and trends in model performance indicators, helping to identify potential degradation or unexpected fluctuations in model behavior over time. \n", - "\n", - "The main goal is to maintain a continuous record of model performance that can be used to detect gradual drift, sudden changes, or cyclical patterns in model effectiveness. This temporal monitoring approach provides early warning signals of potential issues and helps establish whether the model maintains consistent performance within acceptable boundaries throughout its deployment period." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "metrics = [metric for metric in list_metrics() if \"classification\" in metric]\n", - "\n", - "for metric_id in metrics:\n", - " describe_metric(metric_id)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.ROC_AUC\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ")\n", - "auc = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.Accuracy\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ")\n", - "accuracy = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.Recall\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ")\n", - "recall = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "f1 = run_metric(\n", - " \"validmind.unit_metrics.classification.F1\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ")\n", - "f1 = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "precision = run_metric(\n", - " \"validmind.unit_metrics.classification.Precision\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ")\n", - "precision = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": 
null, - "metadata": {}, - "outputs": [], - "source": [ - "NUM_DAYS = 10\n", - "REFERENCE_DATE = datetime(2024, 1, 1) # Fixed date: January 1st, 2024\n", - "base_date = REFERENCE_DATE - timedelta(days=NUM_DAYS)\n", - "\n", - "\n", - "# Initial values\n", - "performance_metrics = {\n", - " \"AUC Score\": auc,\n", - " \"F1 Score\": f1,\n", - " \"Precision Score\": precision,\n", - " \"Recall Score\": recall,\n", - " \"Accuracy Score\": accuracy\n", - "}\n", - "\n", - "# Trend parameters\n", - "trend_factor = 0.98 # Slight downward trend (multiply by 0.98 each step)\n", - "noise_scale = 0.02 # Random fluctuation of ±2%\n", - "\n", - "\n", - "for i in range(NUM_DAYS):\n", - " recorded_at = base_date + timedelta(days=i)\n", - " print(f\"\\nrecorded_at: {recorded_at}\")\n", - "\n", - " # Log each metric with trend and noise\n", - " for metric_name, base_value in performance_metrics.items():\n", - " # Apply trend and add random noise\n", - " trend = base_value * (trend_factor ** i)\n", - " noise = np.random.normal(0, noise_scale * base_value)\n", - " value = max(0, min(1, trend + noise)) # Ensure value stays between 0 and 1\n", - " \n", - " log_metric(\n", - " key=metric_name,\n", - " value=value,\n", - " recorded_at=recorded_at.isoformat()\n", - " )\n", - " \n", - " print(f\"{metric_name:<15}: {value:.4f}\")\n" - ] - }, - { - "cell_type": "markdown", - "id": "copyright-a1aa6fcedbed410099c3b537625ad59b", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Ongoing Monitoring for Application Scorecard\n", + "\n", + "In this notebook, you'll learn how to seamlessly monitor your production models using the ValidMind Platform.\n", + "\n", + "We'll walk you through the process of initializing the ValidMind Library, loading a sample dataset and model, and running a monitoring test suite to quickly generate documentation about your new data and model." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply monitoring report template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the monitoring report template](#toc2_3__) \n", + " - [Initialize the Python environment](#toc2_4__) \n", + " - [Preview the monitoring template](#toc2_5__) \n", + "- [Load the reference and monitoring datasets](#toc3__) \n", + "- [Train the model](#toc4__) \n", + " - [Initialize the ValidMind datasets](#toc4_1__) \n", + " - [Initialize the ValidMind model](#toc4_2__) \n", + " - [Assign prediction values and probabilities to the datasets](#toc4_3__) \n", + " - [Compute credit risk scores](#toc4_4__) \n", + " - [Adding custom context to the LLM descriptions](#toc4_5__) \n", + " - [Monitoring data description](#toc4_6__) \n", + " - [Target and feature drift](#toc4_7__) \n", + " - [Classification accuracy](#toc4_8__) \n", + " - [Class discrimination](#toc4_9__) \n", + " - [Scoring](#toc4_10__) \n", + " - [Model insights](#toc4_11__) \n", + " - [Diagnostic monitoring](#toc4_12__) \n", + " - [Robustness monitoring](#toc4_13__) \n", + " - [Performance history](#toc4_14__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation, validation, and monitoring tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the performance and compliance of a record (such as a model) over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply monitoring report template\n", + "\n", + "Once you've registered your model, let's select a monitoring report template. A template predefines sections for your monitoring report and provides a general outline to follow, making the monitoring process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Monitoring**.\n", + "\n", + " If you cannot locate your Monitoring document, make sure Monitoring type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Ongoing Monitoring for Classification Models`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Monitoring` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"monitoring\",\n", + " monitoring = True,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the monitoring report template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "import numpy as np\n", + "\n", + "from datetime import datetime, timedelta\n", + "\n", + "from validmind.tests import run_test\n", + "from validmind.datasets.credit_risk import lending_club\n", + "from validmind.unit_metrics import list_metrics\n", + "from validmind.unit_metrics import describe_metric\n", + "from validmind.unit_metrics import run_metric\n", + "from validmind.api_client import log_metric\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_5__'></a>\n", + "\n", + "### Preview the monitoring template\n", + "\n", + "A template predefines sections for your monitoring documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "You will upload documentation and test results into this template later on. 
For now, take a look at the structure that the template provides with the `vm.preview_template()` function from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the reference and monitoring datasets\n", + "\n", + "The sample dataset used here is provided by the ValidMind Library. For demonstration purposes, we'll use the training and test dataset splits as the `reference` and `monitoring` datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "df = lending_club.load_data(source=\"offline\")\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = lending_club.preprocess(df)\n", + "preprocess_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "fe_df = lending_club.feature_engineering(preprocess_df)\n", + "fe_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Train the model\n", + "\n", + "In this section, we focus on constructing and refining our predictive model. \n", + "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", + "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the data\n", + "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", + "\n", + "x_train = train_df.drop(lending_club.target_column, axis=1)\n", + "y_train = train_df[lending_club.target_column]\n", + "\n", + "x_test = test_df.drop(lending_club.target_column, axis=1)\n", + "y_test = test_df[lending_club.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define the XGBoost model\n", + "xgb_model = xgb.XGBClassifier(\n", + " n_estimators=50, \n", + " random_state=42, \n", + " early_stopping_rounds=10\n", + ")\n", + "xgb_model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "\n", + "# Fit the model\n", + "xgb_model.fit(\n", + " x_train, \n", + " y_train,\n", + " eval_set=[(x_test, y_test)],\n", + " verbose=False\n", + ")\n", + "\n", + "# Compute probabilities\n", + "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", + "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", + "\n", + "# Compute binary predictions\n", + "cut_off_threshold = 0.3\n", + "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", + "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — The raw dataset that you want to provide as input to tests.\n", + "- `input_id` - A unique identifier that allows tracking what inputs are used 
when running each individual test.\n", + "- `target_column` — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "\n", + "With all datasets ready, you can now initialize the reference (training) and monitoring (test) datasets (`train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_reference_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"reference_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_monitoring_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"monitoring_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "You will also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_xgb_model = vm.init_model(\n", + " xgb_model,\n", + " 
input_id=\"xgb_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Assign prediction values and probabilities to the datasets\n", + "\n", + "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. \n", + "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", + "- This method links the model's class prediction values and probabilities to our VM train and test datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_reference_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=train_xgb_binary_predictions,\n", + " prediction_probabilities=train_xgb_prob,\n", + ")\n", + "\n", + "vm_monitoring_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=test_xgb_binary_predictions,\n", + " prediction_probabilities=test_xgb_prob,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### Compute credit risk scores\n", + "\n", + "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", + "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", + "\n", + "# Assign scores to the datasets\n", + "vm_reference_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", + "vm_monitoring_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_5__'></a>\n", + "\n", + "### Adding custom context to the LLM descriptions\n", + "\n", + "To add custom context to the LLM-generated test descriptions, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1` and provide the context itself in `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT`. This is a global setting that affects all tests." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", + "\n", + "context = \"\"\"\n", + "FORMAT FOR THE LLM DESCRIPTIONS: \n", + " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", + " extracted from the test description>.\n", + "\n", + " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", + " Include any relevant formulas or methodologies mentioned in the test description.>\n", + "\n", + " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", + " highlighting what makes it particularly useful for specific scenarios.>\n", + "\n", + " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", + " Include both technical limitations and interpretation challenges. 
\n", + " If the test description includes specific signs of high risk, incorporate these here.>\n", + "\n", + " **Key Insights:**\n", + "\n", + " The test results reveal:\n", + "\n", + " - **<insight title>**: <comprehensive description of one aspect of the results>\n", + " - **<insight title>**: <comprehensive description of another aspect>\n", + " ...\n", + "\n", + " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", + " purpose and provides any final recommendations or considerations.>\n", + "\n", + "ADDITIONAL INSTRUCTIONS:\n", + " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", + "\n", + " For each metric in the test results, include in the test overview:\n", + " - The metric's purpose and what it measures\n", + " - Its mathematical formula\n", + " - The range of possible values\n", + " - What constitutes good/bad performance\n", + " - How to interpret different values\n", + "\n", + " Each insight should progressively cover:\n", + " 1. Overall scope and distribution\n", + " 2. Complete breakdown of all elements with specific values\n", + " 3. Natural groupings and patterns\n", + " 4. Comparative analysis between datasets/categories\n", + " 5. Stability and variations\n", + " 6. 
Notable relationships or dependencies\n", + "\n", + " Remember:\n", + " - Keep all insights at the same level (no sub-bullets or nested structures)\n", + " - Make each insight complete and self-contained\n", + " - Include specific numerical values and ranges\n", + " - Cover all elements in the results comprehensively\n", + " - Maintain clear, concise language\n", + " - Use only \"- **Title**: Description\" format for insights\n", + " - Progress naturally from general to specific observations\n", + "\n", + "\"\"\".strip()\n", + "\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6__'></a>\n", + "\n", + "### Monitoring data description\n", + "\n", + "The Monitoring Data Description tests aim to provide a comprehensive statistical analysis of the monitoring dataset's characteristics. These tests examine the basic statistical properties, identify any missing data patterns, assess data uniqueness, visualize numerical feature distributions, and evaluate feature relationships through correlation analysis.\n", + "\n", + "The primary objective is to establish a baseline understanding of the monitoring data's structure and quality, enabling the detection of any significant deviations from expected patterns that could impact model performance. Each test is designed to capture different aspects of the data, from univariate statistics to multivariate relationships, providing a foundation for ongoing data quality assessment in the production environment." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.MissingValues:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + " params={\n", + " \"min_percentage_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.UniqueRows:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularNumericalHistograms:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.PearsonCorrelationMatrix:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.HighPearsonCorrelation:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + " params={\n", + " \"feature_columns\": vm_monitoring_ds.feature_columns,\n", + " \"max_threshold\": 0.5,\n", + " \"top_n_correlations\": 10\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { 
+ "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ClassImbalanceDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 1\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_7__'></a>\n", + "\n", + "### Target and feature drift\n", + "\n", + "Next, the goal is to investigate the distributional characteristics of predictions and features to determine if the underlying data has changed. These tests are crucial for assessing the expected accuracy of the model.\n", + "\n", + "1. **Target drift:** We compare the dataset used for testing (reference data) with the monitoring data. This helps to identify any shifts in the target variable distribution.\n", + "2. **Feature drift:** We compare the training dataset with the monitoring data. Since features were used to train the model, any drift in these features could indicate potential issues, as the underlying patterns that the model was trained on may have changed." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we can examine the correlation between features and predictions. Significant changes in these correlations may trigger a deeper assessment." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.TargetPredictionDistributionPlot\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 5\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we want to see the difference in correlation pairs between model predictions and features." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.PredictionCorrelation\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 5\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, for target drift, let's plot each prediction value and feature grid side by side." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.PredictionQuantilesAcrossFeatures\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, let's run a test to investigate whether and how the features have drifted. 
In this instance we want to compare the training data with prediction data." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.FeatureDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"psi_threshold\": 0.2,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_8__'></a>\n", + "\n", + "### Classification accuracy\n", + "\n", + "We now evaluate the model's predictive performance by comparing its behavior between reference and monitoring datasets. These tests analyze shifts in overall accuracy metrics, examine changes in the confusion matrix to identify specific classification pattern changes, and assess the model's probability calibration across different prediction thresholds. \n", + "\n", + "The primary objective is to detect any degradation in the model's classification performance that might indicate reliability issues in production. The tests provide both aggregate performance metrics and detailed breakdowns of prediction patterns, enabling the identification of specific areas where the model's accuracy might be deteriorating." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ClassificationAccuracyDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 5,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ConfusionMatrixDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 5,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.CalibrationCurveDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"n_bins\": 10,\n", + " \"drift_pct_threshold\": 10,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_9__'></a>\n", + "\n", + "### Class discrimination\n", + "\n", + "The following tests assess the model's ability to effectively separate different classes in both reference and monitoring datasets. These tests analyze the model's discriminative power by examining the separation between class distributions, evaluating changes in the ROC curve characteristics, comparing probability distribution patterns, and assessing cumulative prediction trends. \n", + "\n", + "The primary objective is to identify any deterioration in the model's ability to distinguish between classes, which could indicate a decline in model effectiveness. 
The tests examine both the overall discriminative capability and the granular patterns in prediction distributions, providing insights into whether the model maintains its ability to effectively differentiate between classes in the production environment." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ClassDiscriminationDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 5,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ROCCurveDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 10,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_10__'></a>\n", + "\n", + "### Scoring\n", + "\n", + "Next we analyze the distribution and stability of credit scores across reference and monitoring datasets. 
These tests evaluate shifts in score distributions, examine changes in score band populations, and assess the relationship between scores and default rates. \n", + "\n", + "The primary objective is to identify any significant changes in how the model assigns credit scores, which could indicate drift in risk assessment capabilities. The tests examine both the overall score distribution patterns and the specific performance within defined score bands, providing insights into whether the model maintains consistent and reliable risk segmentation." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ScorecardHistogramDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " \"drift_pct_threshold\": 20,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ScoreBandsDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " \"score_bands\": [500, 540, 570],\n", + " \"drift_pct_threshold\": 20,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_11__'></a>\n", + "\n", + "### Model insights" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", + " input_grid={\n", + " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": [vm_xgb_model]\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " 
\"validmind.model_validation.FeaturesAUC\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model],\n", + " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model],\n", + " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", + " },\n", + " params={\n", + " \"kernel_explainer_samples\": 10,\n", + " \"tree_or_linear_explainer_samples\": 200,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_12__'></a>\n", + "\n", + "### Diagnostic monitoring" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " },\n", + " params={\n", + " \"cut_off_threshold\": 0.04\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_13__'></a>\n", + "\n", + "### Robustness monitoring" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " 
\"scaling_factor_std_dev_list\": [\n", + " 0.1,\n", + " 0.2,\n", + " 0.3,\n", + " 0.4,\n", + " 0.5\n", + " ],\n", + " \"performance_decay_threshold\": 0.05\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_14__'></a>\n", + "\n", + "### Performance history\n", + "\n", + "In this section we showcase how to track and visualize the temporal evolution of key model performance metrics, including AUC, F1 score, precision, recall, and accuracy. For demonstration purposes, the section simulates historical performance data by introducing a gradual downward trend and random noise to these metrics over a specified time period. These tests are useful for analyzing the stability and trends in model performance indicators, helping to identify potential degradation or unexpected fluctuations in model behavior over time. \n", + "\n", + "The main goal is to maintain a continuous record of model performance that can be used to detect gradual drift, sudden changes, or cyclical patterns in model effectiveness. This temporal monitoring approach provides early warning signals of potential issues and helps establish whether the model maintains consistent performance within acceptable boundaries throughout its deployment period." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "metrics = [metric for metric in list_metrics() if \"classification\" in metric]\n", + "\n", + "for metric_id in metrics:\n", + " describe_metric(metric_id)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.ROC_AUC\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ")\n", + "auc = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Accuracy\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ")\n", + "accuracy = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Recall\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ")\n", + "recall = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.F1\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ")\n", + "f1 = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Precision\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ")\n", + "precision = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + 
"metadata": {}, + 
"source": [ + "NUM_DAYS = 10\n", + "REFERENCE_DATE = datetime(2024, 1, 1) # Fixed date: January 1st, 2024\n", + "base_date = REFERENCE_DATE - timedelta(days=NUM_DAYS)\n", + "\n", + "\n", + "# Initial values\n", + "performance_metrics = {\n", + " \"AUC Score\": auc,\n", + " \"F1 Score\": f1,\n", + " \"Precision Score\": precision,\n", + " \"Recall Score\": recall,\n", + " \"Accuracy Score\": accuracy\n", + "}\n", + "\n", + "# Trend parameters\n", + "trend_factor = 0.98 # Slight downward trend (multiply by 0.98 each step)\n", + "noise_scale = 0.02 # Random fluctuation of ±2%\n", + "\n", + "\n", + "for i in range(NUM_DAYS):\n", + " recorded_at = base_date + timedelta(days=i)\n", + " print(f\"\\nrecorded_at: {recorded_at}\")\n", + "\n", + " # Log each metric with trend and noise\n", + " for metric_name, base_value in performance_metrics.items():\n", + " # Apply trend and add random noise\n", + " trend = base_value * (trend_factor ** i)\n", + " noise = np.random.normal(0, noise_scale * base_value)\n", + " value = max(0, min(1, trend + noise)) # Ensure value stays between 0 and 1\n", + " \n", + " log_metric(\n", + " key=metric_name,\n", + " value=value,\n", + " recorded_at=recorded_at.isoformat()\n", + " )\n", + " \n", + " print(f\"{metric_name:<15}: {value:.4f}\")\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-a1aa6fcedbed410099c3b537625ad59b" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb index 839da8e5f..a5ea5e9a6 100644 --- a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb @@ -1,908 +1,914 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Quickstart for ongoing monitoring of models with ValidMind\n", - "\n", - "Welcome! In this quickstart guide, you'll learn how to seamlessly monitor your production models using the ValidMind Platform.\n", - "\n", - "We'll walk you through the process of initializing the ValidMind Library, loading a sample dataset and model, and running a monitoring test suite to quickly generate documentation about your new data and model.\n", - "\n", - "This notebook utilizes the [Bank Customer Churn Prediction](https://www.kaggle.com/code/kmalit/bank-customer-churn-prediction/data) dataset from Kaggle to train a simple classification model for demonstration purposes." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply monitoring report template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the monitoring report template](#toc2_4__) \n", - "- [Load the reference and monitoring datasets](#toc3__) \n", - " - [Load the production model](#toc3_1__) \n", - " - [Initialize the ValidMind datasets](#toc3_2__) \n", - " - [Initialize the ValidMind model](#toc3_3__) \n", - " - [Assign predictions to the datasets](#toc3_4__) \n", - " - [Run the ongoing monitoring tests](#toc3_5__) \n", - " - [Conduct target and feature drift testing](#toc3_6__) \n", - " - [Feature drift tests](#toc3_6_1__) \n", - " - [Model performance monitoring tests](#toc3_7__) \n", - "- [Next steps](#toc4__) \n", - " - [Work with your monitoring report](#toc4_1__) \n", - " - [Discover more learning resources](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation, validation, and monitoring tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Model monitoring report**: A comprehensive and structured record of a production model, including key elements such as data sources, inputs, performance metrics, and periodic evaluations. 
This documentation ensures transparency and visibility of the model's performance in the production environment.\n", - "\n", - "**Monitoring report template**: Similar to documentation template, The monitoring report template functions as a test suite and lays out the structure of model monitoring, segmented into various sections and sub-sections. Monitoring report templates define the structure of your model monitoring report, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. 
See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. 
Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply monitoring report template\n", - "\n", - "Once you've registered your model, let's select a monitoring report template. A template predefines sections for your monitoring report and provides a general outline to follow, making the monitoring process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Monitoring**.\n", - "\n", - " If you cannot locate your Monitoring document, make sure Monitoring type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Ongoing Monitoring for Classification Models`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Monitoring` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"monitoring\",\n", - " monitoring = True,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "import validmind as vm\n", - "import pandas as pd\n", - "import numpy as np\n", - "import seaborn as sns\n", - "import matplotlib.pyplot as plt\n", - "from validmind.tests import run_test\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the monitoring report template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the reference and monitoring datasets\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. For demonstration purposes we'll use the training, test and validation dataset splits as `training`, `reference` and `monitoring` datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.classification import customer_churn\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "\n", - "train_df, reference_df, monitor_df = customer_churn.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Load the production model\n", - "\n", - "We will also load a pre-trained model for demonstration purposes. This is a simple XGBoost model trained on the Bank Customer Churn Prediction dataset." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "\n", - "# Load the saved model\n", - "model = xgb.XGBClassifier()\n", - "model.load_model(\"xgboost_model.model\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — The raw dataset that you want to provide as input to tests.\n", - "- `input_id` - A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- `target_column` — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "- `class_labels` — An optional value to map predicted classes to class labels.\n", - "\n", - "With all datasets ready, you can now initialize training, reference(test) and monitor datasets (`train_df`, `reference_df` and `monitor_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_df\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "vm_reference_ds = vm.init_dataset(\n", - " dataset=reference_df,\n", - " input_id=\"reference_df\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "vm_monitor_ds = vm.init_dataset(\n", - " dataset=monitor_df,\n", - " input_id=\"monitor_dataset\",\n", - " 
target_column=customer_churn.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model = vm.init_model(\n", - " model,\n", - " input_id=\"model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_4__'></a>\n", - "\n", - "### Assign predictions to the datasets\n", - "\n", - "We can now use the `assign_predictions()` method from the Dataset object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "\n", - "vm_reference_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "\n", - "vm_monitor_ds.assign_predictions(\n", - " model=vm_model,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_5__'></a>\n", - "\n", - "### Run the ongoing monitoring tests\n", - "\n", - "Before we start the testing procedure, let's take a look at the expected tests that are pre-configured:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_list = vm.get_test_suite().get_default_config()\n", - "for l in test_list:\n", - " print(l)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's run the first test in the list. Note that you can use `vm.tests.describe_test()` to get information about the inputs required for the test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.describe_test(\"validmind.model_validation.ModelMetadata\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As you can see, the `ModelMetadata` only requires a model input. 
Let's run the test and log the results into the monitoring document with the `.log()` method:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_result = vm.tests.run_test(\n", - " \"validmind.model_validation.ModelMetadata\",\n", - " model=vm_model,\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's run the tests needed to determine data quality of the monitoring dataset:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "data_qual = vm.get_test_suite(\n", - " section=\"prediction_data_description\"\n", - ").get_default_config()\n", - "\n", - "# Run all of the necessary data quality checks where the monitoring dataset is the basis\n", - "for l in data_qual:\n", - " vm.tests.run_test(\n", - " l,\n", - " inputs={\"dataset\": vm_monitor_ds},\n", - " show=False,\n", - " ).log()\n", - " print(\"Completed test: {0}\".format(l))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To view the results of the model metadata and data quality tests, select **Monitoring** under Documents in the left sidebar of the model in the ValidMind Platform and click on the following sections:\n", - "\n", - "- 1. Model Monitoring Overview > **1.2. Model Details**\n", - "- 2. Data Quality & Drift Assessment > **2.1. Prediction Data Description**" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, let's run *comparison tests*, which will allow comparing differences between the training dataset and monitoring datasets. To run a test in comparison mode, you only need to pass an `input_grid` parameter to the `run_test()` method instead of `inputs`.\n", - "\n", - "For more information about comparison tests, see this [notebook](../../how_to/tests/run_tests/2-run_comparison_tests.ipynb)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "correlation_tests = [\n", - " \"validmind.data_validation.PearsonCorrelationMatrix:train_vs_test\",\n", - " \"validmind.data_validation.HighPearsonCorrelation:train_vs_test\",\n", - "]\n", - "\n", - "for test in correlation_tests:\n", - " vm.tests.run_test(\n", - " test,\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_monitor_ds],\n", - " \"model\": [vm_model],\n", - " },\n", - " show=False,\n", - " ).log()\n", - " print(\"Completed test {0}\".format(test))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can view these results in the ValidMind Platform in **Ongoing Monitoring** within Documents under the following section:\n", - "\n", - "- 2. Data Quality & Drift Assessment > **2.2. Prediction Data Correlations and Interactions**" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_6__'></a>\n", - "\n", - "### Conduct target and feature drift testing\n", - "\n", - "Next, the goal is to investigate the distributional characteristics of predictions and features to determine if the underlying data has changed. These tests are crucial for assessing the expected accuracy of the model.\n", - "\n", - "1. **Target drift:** We compare the dataset used for testing (reference data) with the monitoring data. This helps to identify any shifts in the target variable distribution.\n", - "2. **Feature drift:** We compare the training dataset with the monitoring data. Since features were used to train the model, any drift in these features could indicate potential issues, as the underlying patterns that the model was trained on may have changed." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the 2. 
Data Quality & Drift Assessment > **2.3 Target Drift** section we can confirm only there is only one pre-configured test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for l in vm.get_test_suite(section=\"comparison_data_target\").get_default_config():\n", - " print(l)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As part of running the rest of the tests, we will directly log the results to a section when calling the `.log()` method." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "First, let's run the *Population Stability Index (PSI)* for predictions. In this case, we want to compare the test data with the monitoring data. (Note: For predictions, the training data is irrelevant.)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": vm_model,\n", - " },\n", - " show=False,\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we can examine the correlation between features and predictions. Significant changes in these correlations may trigger a deeper assessment." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.ongoing_monitoring.TargetPredictionDistributionPlot\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": vm_model,\n", - " },\n", - " show=False,\n", - ").log(section_id=\"comparison_data_target\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we want see difference in correlation pairs between model prediction and features." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.ongoing_monitoring.PredictionCorrelation\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": vm_model,\n", - " },\n", - " show=False,\n", - ").log(section_id=\"comparison_data_target\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally for target drift, let's plot each prediction value and feature grid side by side." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.ongoing_monitoring.PredictionAcrossEachFeature\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": vm_model,\n", - " },\n", - " show=False,\n", - ").log(section_id=\"comparison_data_target\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_6_1__'></a>\n", - "\n", - "#### Feature drift tests\n", - "\n", - "Next, let's add run a test to investigate how or if the features have drifted. In this instance we want to compare the training data with prediction data. These results will be logged in the 2. Data Quality & Drift Assessment > **2.4. Feature Drift** section." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.ongoing_monitoring.FeatureDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": vm_model,\n", - " },\n", - " show=False,\n", - ").log(section_id=\"comparison_data_feature\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_7__'></a>\n", - "\n", - "### Model performance monitoring tests\n", - "\n", - "Let's wrap up by monitoring the model's performance. 
Keep in mind that in some cases, it may not be possible to determine accuracy if the ground truth is unavailable. If this is the case, you can skip this test and instead focus on target and feature drift to inform the model owners.\n", - "\n", - "The pre-configured tests for model performance are:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for l in vm.get_test_suite(section=\"model_performance_monitoring\").get_default_config():\n", - " print(l)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The code below will run the tests and log the results into the monitoring document for each of the tests. Note the use of `input_grid` again, which is required for comparison tests:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Use the reference dataset vs monitoring dataset - the true comparison of accuracy\n", - "for test in vm.get_test_suite(\n", - " section=\"model_performance_monitoring\"\n", - ").get_default_config():\n", - " if test == \"validmind.model_validation.statsmodels.GINITable\":\n", - " vm.tests.run_test(\n", - " \"validmind.model_validation.statsmodels.GINITable\",\n", - " input_grid={\n", - " \"dataset\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": [vm_model],\n", - " },\n", - " show=False,\n", - " ).log()\n", - " else:\n", - " vm.tests.run_test(\n", - " test,\n", - " input_grid={\n", - " \"dataset\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": [vm_model],\n", - " },\n", - " show=False,\n", - " ).log()\n", - " print(\"Completed test: {0}\".format(test))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. 
But there is a better way — use the ValidMind Platform to work with your monitoring report.\n", - "\n", - "<a id='toc4_1__'></a>\n", - "\n", - "### Work with your monitoring report\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Monitoring** under Documents.\n", - "\n", - "What you see is the full draft of your monitoring report in a more easily consumable version. From here, you can make qualitative edits to monitoring reports, view guidelines, review monitoring results, and submit your monitoring report for approval when it's ready. (**Learn more:** [Ongoing monitoring](https://docs.validmind.ai/guide/monitoring/ongoing-monitoring.html))\n", - "\n", - "<a id='toc4_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-06926ffb7c9846eca24d1130049d6316", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "gpuClass": "standard", - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart for ongoing monitoring of models with ValidMind\n", + "\n", + "Welcome! In this quickstart guide, you'll learn how to seamlessly monitor your production models using the ValidMind Platform.\n", + "\n", + "We'll walk you through the process of initializing the ValidMind Library, loading a sample dataset and model, and running a monitoring test suite to quickly generate documentation about your new data and model.\n", + "\n", + "This notebook utilizes the [Bank Customer Churn Prediction](https://www.kaggle.com/code/kmalit/bank-customer-churn-prediction/data) dataset from Kaggle to train a simple classification model for demonstration purposes." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply monitoring report template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the monitoring report template](#toc2_4__) \n", + "- [Load the reference and monitoring datasets](#toc3__) \n", + " - [Load the production model](#toc3_1__) \n", + " - [Initialize the ValidMind datasets](#toc3_2__) \n", + " - [Initialize the ValidMind model](#toc3_3__) \n", + " - [Assign predictions to the datasets](#toc3_4__) \n", + " - [Run the ongoing monitoring tests](#toc3_5__) \n", + " - [Conduct target and feature drift testing](#toc3_6__) \n", + " - [Feature drift tests](#toc3_6_1__) \n", + " - [Model performance monitoring tests](#toc3_7__) \n", + "- [Next steps](#toc4__) \n", + " - [Work with your monitoring report](#toc4_1__) \n", + " - [Discover more learning resources](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation, validation, and monitoring tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the performance and compliance of the record (such as a model) over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply monitoring report template\n", + "\n", + "Once you've registered your model, let's select a monitoring report template. A template predefines sections for your monitoring report and provides a general outline to follow, making the monitoring process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Monitoring**.\n", + "\n", + " If you cannot locate your Monitoring document, make sure Monitoring type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Ongoing Monitoring for Classification Models`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Monitoring` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"monitoring\",\n", + " monitoring = True,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "import validmind as vm\n", + "import pandas as pd\n", + "import numpy as np\n", + "import seaborn as sns\n", + "import matplotlib.pyplot as plt\n", + "from validmind.tests import run_test\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the monitoring report template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the reference and monitoring datasets\n", + "\n", + "The sample dataset used here is provided by the ValidMind library. For demonstration purposes we'll use the training, test and validation dataset splits as `training`, `reference` and `monitoring` datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.classification import customer_churn\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "\n", + "train_df, reference_df, monitor_df = customer_churn.preprocess(raw_df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Load the production model\n", + "\n", + "We will also load a pre-trained model for demonstration purposes. This is a simple XGBoost model trained on the Bank Customer Churn Prediction dataset." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "\n", + "# Load the saved model\n", + "model = xgb.XGBClassifier()\n", + "model.load_model(\"xgboost_model.model\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — The raw dataset that you want to provide as input to tests.\n", + "- `input_id` - A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- `target_column` — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "- `class_labels` — An optional value to map predicted classes to class labels.\n", + "\n", + "With all datasets ready, you can now initialize training, reference(test) and monitor datasets (`train_df`, `reference_df` and `monitor_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_df\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "vm_reference_ds = vm.init_dataset(\n", + " dataset=reference_df,\n", + " input_id=\"reference_df\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "vm_monitor_ds = vm.init_dataset(\n", + " dataset=monitor_df,\n", + " input_id=\"monitor_dataset\",\n", + " target_column=customer_churn.target_column,\n", + 
")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model = vm.init_model(\n", + " model,\n", + " input_id=\"model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_4__'></a>\n", + "\n", + "### Assign predictions to the datasets\n", + "\n", + "We can now use the `assign_predictions()` method from the Dataset object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "\n", + "vm_reference_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "\n", + "vm_monitor_ds.assign_predictions(\n", + " model=vm_model,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_5__'></a>\n", + "\n", + "### Run the ongoing monitoring tests\n", + "\n", + "Before we start the testing procedure, let's take a look at the expected tests that are pre-configured:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_list = vm.get_test_suite().get_default_config()\n", + "for l in test_list:\n", + " print(l)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's run the first test in the list. Note that you can use `vm.tests.describe_test()` to get information about the inputs required for the test:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.describe_test(\"validmind.model_validation.ModelMetadata\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As you can see, the `ModelMetadata` only requires a model input. 
Let's run the test and log the results into the monitoring document with the `.log()` method:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_result = vm.tests.run_test(\n", + " \"validmind.model_validation.ModelMetadata\",\n", + " model=vm_model,\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's run the tests needed to determine data quality of the monitoring dataset:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "data_qual = vm.get_test_suite(\n", + " section=\"prediction_data_description\"\n", + ").get_default_config()\n", + "\n", + "# Run all of the necessary data quality checks where the monitoring dataset is the basis\n", + "for l in data_qual:\n", + " vm.tests.run_test(\n", + " l,\n", + " inputs={\"dataset\": vm_monitor_ds},\n", + " show=False,\n", + " ).log()\n", + " print(\"Completed test: {0}\".format(l))" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To view the results of the model metadata and data quality tests, select **Monitoring** under Documents in the left sidebar of the model in the ValidMind Platform and click on the following sections:\n", + "\n", + "- 1. Model Monitoring Overview > **1.2. Model Details**\n", + "- 2. Data Quality & Drift Assessment > **2.1. Prediction Data Description**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, let's run *comparison tests*, which will allow comparing differences between the training dataset and monitoring datasets. To run a test in comparison mode, you only need to pass an `input_grid` parameter to the `run_test()` method instead of `inputs`.\n", + "\n", + "For more information about comparison tests, see this [notebook](../../how_to/tests/run_tests/2-run_comparison_tests.ipynb)." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "correlation_tests = [\n", + " \"validmind.data_validation.PearsonCorrelationMatrix:train_vs_test\",\n", + " \"validmind.data_validation.HighPearsonCorrelation:train_vs_test\",\n", + "]\n", + "\n", + "for test in correlation_tests:\n", + " vm.tests.run_test(\n", + " test,\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_monitor_ds],\n", + " \"model\": [vm_model],\n", + " },\n", + " show=False,\n", + " ).log()\n", + " print(\"Completed test {0}\".format(test))" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can view these results in the ValidMind Platform in **Ongoing Monitoring** within Documents under the following section:\n", + "\n", + "- 2. Data Quality & Drift Assessment > **2.2. Prediction Data Correlations and Interactions**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_6__'></a>\n", + "\n", + "### Conduct target and feature drift testing\n", + "\n", + "Next, the goal is to investigate the distributional characteristics of predictions and features to determine if the underlying data has changed. These tests are crucial for assessing the expected accuracy of the model.\n", + "\n", + "1. **Target drift:** We compare the dataset used for testing (reference data) with the monitoring data. This helps to identify any shifts in the target variable distribution.\n", + "2. **Feature drift:** We compare the training dataset with the monitoring data. Since features were used to train the model, any drift in these features could indicate potential issues, as the underlying patterns that the model was trained on may have changed." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the 2. 
Data Quality & Drift Assessment > **2.3 Target Drift** section, we can confirm that there is only one pre-configured test:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for l in vm.get_test_suite(section=\"comparison_data_target\").get_default_config():\n", + " print(l)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As part of running the rest of the tests, we will directly log the results to a section when calling the `.log()` method." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, let's run the *Population Stability Index (PSI)* for predictions. In this case, we want to compare the test data with the monitoring data. (Note: For predictions, the training data is irrelevant.)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": vm_model,\n", + " },\n", + " show=False,\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we can examine the correlation between features and predictions. Significant changes in these correlations may trigger a deeper assessment." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.ongoing_monitoring.TargetPredictionDistributionPlot\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": vm_model,\n", + " },\n", + " show=False,\n", + ").log(section_id=\"comparison_data_target\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we want to see the difference in correlation pairs between model predictions and features." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.ongoing_monitoring.PredictionCorrelation\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": vm_model,\n", + " },\n", + " show=False,\n", + ").log(section_id=\"comparison_data_target\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, for target drift, let's plot each prediction value and feature grid side by side." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.ongoing_monitoring.PredictionAcrossEachFeature\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": vm_model,\n", + " },\n", + " show=False,\n", + ").log(section_id=\"comparison_data_target\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_6_1__'></a>\n", + "\n", + "#### Feature drift tests\n", + "\n", + "Next, let's run a test to investigate whether and how the features have drifted. In this instance we want to compare the training data with prediction data. These results will be logged in the 2. Data Quality & Drift Assessment > **2.4. Feature Drift** section." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.ongoing_monitoring.FeatureDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": vm_model,\n", + " },\n", + " show=False,\n", + ").log(section_id=\"comparison_data_feature\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_7__'></a>\n", + "\n", + "### Model performance monitoring tests\n", + "\n", + "Let's wrap up by monitoring the model's performance. 
Keep in mind that in some cases, it may not be possible to determine accuracy if the ground truth is unavailable. If this is the case, you can skip this test and instead focus on target and feature drift to inform the model owners.\n", + "\n", + "The pre-configured tests for model performance are:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for l in vm.get_test_suite(section=\"model_performance_monitoring\").get_default_config():\n", + " print(l)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The code below will run the tests and log the results into the monitoring document for each of the tests. Note the use of `input_grid` again, which is required for comparison tests:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Use the reference dataset vs monitoring dataset - the true comparison of accuracy\n", + "for test in vm.get_test_suite(\n", + " section=\"model_performance_monitoring\"\n", + ").get_default_config():\n", + " if test == \"validmind.model_validation.statsmodels.GINITable\":\n", + " vm.tests.run_test(\n", + " \"validmind.model_validation.statsmodels.GINITable\",\n", + " input_grid={\n", + " \"dataset\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": [vm_model],\n", + " },\n", + " show=False,\n", + " ).log()\n", + " else:\n", + " vm.tests.run_test(\n", + " test,\n", + " input_grid={\n", + " \"dataset\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": [vm_model],\n", + " },\n", + " show=False,\n", + " ).log()\n", + " print(\"Completed test: {0}\".format(test))" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. 
But there is a better way — use the ValidMind Platform to work with your monitoring report.\n", + "\n", + "<a id='toc4_1__'></a>\n", + "\n", + "### Work with your monitoring report\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Monitoring** under Documents.\n", + "\n", + "What you see is the full draft of your monitoring report in a more easily consumable version. From here, you can make qualitative edits to monitoring reports, view guidelines, review monitoring results, and submit your monitoring report for approval when it's ready. (**Learn more:** [Ongoing monitoring](https://docs.validmind.ai/guide/monitoring/ongoing-monitoring.html))\n", + "\n", + "<a id='toc4_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-06926ffb7c9846eca24d1130049d6316" + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 4 } From 7799fb5f68c1ce849b7870e900ad9fdac1dedff0 Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Thu, 21 May 2026 11:28:31 -0700 Subject: [PATCH 04/13] Edit for beck/sc-15992/documentation-primary-record-types-glossary --- .../4-finalize_validation_reporting.ipynb | 194 +++++++++--------- 1 file changed, 97 insertions(+), 97 deletions(-) diff --git a/notebooks/tutorials/validation/4-finalize_validation_reporting.ipynb b/notebooks/tutorials/validation/4-finalize_validation_reporting.ipynb index 768c569b2..f91c428a4 100644 --- a/notebooks/tutorials/validation/4-finalize_validation_reporting.ipynb +++ b/notebooks/tutorials/validation/4-finalize_validation_reporting.ipynb @@ -121,7 +121,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Make sure the ValidMind Library is installed\n", "\n", @@ -143,9 +145,7 @@ " # model=\"...\",\n", " document=\"validation-report\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -160,7 +160,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Load the sample dataset\n", "from 
validmind.datasets.classification import customer_churn as demo_dataset\n", @@ -170,13 +172,13 @@ ")\n", "\n", "raw_df = demo_dataset.load_data()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Initialize the raw dataset for use in ValidMind tests\n", "vm_raw_dataset = vm.init_dataset(\n", @@ -184,13 +186,13 @@ " input_id=\"raw_dataset\",\n", " target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "import pandas as pd\n", "\n", @@ -202,9 +204,7 @@ "\n", "balanced_raw_df = pd.concat([exited_df, not_exited_df])\n", "balanced_raw_df = balanced_raw_df.sample(frac=1, random_state=42)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -215,7 +215,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Register new data and now 'balanced_raw_dataset' is the new dataset object of interest\n", "vm_balanced_raw_dataset = vm.init_dataset(\n", @@ -223,13 +225,13 @@ " input_id=\"balanced_raw_dataset\",\n", " target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Run HighPearsonCorrelation test with our balanced dataset as input and return a result object\n", "corr_result = vm.tests.run_test(\n", @@ -237,46 +239,46 @@ " params={\"max_threshold\": 0.3},\n", " inputs={\"dataset\": vm_balanced_raw_dataset},\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# From result object, extract table from `corr_result.tables`\n", "features_df = corr_result.tables[0].data\n", "features_df" - ], - "execution_count": null, - "outputs": [] + ] 
}, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Extract list of features that failed the test\n", "high_correlation_features = features_df[features_df[\"Pass/Fail\"] == \"Fail\"][\"Columns\"].tolist()\n", "high_correlation_features" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Extract feature names from the list of strings\n", "high_correlation_features = [feature.split(\",\")[0].strip(\"()\") for feature in high_correlation_features]\n", "high_correlation_features" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Remove the highly correlated features from the dataset\n", "balanced_raw_no_age_df = balanced_raw_df.drop(columns=high_correlation_features)\n", @@ -287,13 +289,13 @@ " input_id=\"raw_dataset_preprocessed\",\n", " target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Re-run the test with the reduced feature set\n", "corr_result = vm.tests.run_test(\n", @@ -301,9 +303,7 @@ " params={\"max_threshold\": 0.3},\n", " inputs={\"dataset\": vm_raw_dataset_preprocessed},\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -318,20 +318,22 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Encode categorical features in the dataset\n", "balanced_raw_no_age_df = pd.get_dummies(\n", " balanced_raw_no_age_df, columns=[\"Geography\", \"Gender\"], drop_first=True\n", ")\n", "balanced_raw_no_age_df.head()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "from sklearn.model_selection import 
train_test_split\n", "\n", @@ -342,13 +344,13 @@ "y_train = train_df[\"Exited\"]\n", "X_test = test_df.drop(\"Exited\", axis=1)\n", "y_test = test_df[\"Exited\"]" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Initialize the split datasets\n", "vm_train_ds = vm.init_dataset(\n", @@ -362,9 +364,7 @@ " dataset=test_df,\n", " target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -379,16 +379,16 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Import the champion model\n", "import pickle as pkl\n", "\n", "with open(\"lr_model_champion.pkl\", \"rb\") as f:\n", " log_reg = pkl.load(f)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -403,7 +403,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Import the Random Forest Classification model\n", "from sklearn.ensemble import RandomForestClassifier\n", @@ -416,9 +418,7 @@ "\n", "# Train the model\n", "rf_model.fit(X_train, y_train)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -433,7 +433,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Initialize the champion logistic regression model\n", "vm_log_model = vm.init_model(\n", @@ -446,13 +448,13 @@ " rf_model,\n", " input_id=\"rf_model\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Assign predictions to Champion — Logistic regression model\n", "vm_train_ds.assign_predictions(model=vm_log_model)\n", @@ -461,9 +463,7 @@ "# Assign predictions to Challenger — Random forest classification model\n", "vm_train_ds.assign_predictions(model=vm_rf_model)\n", 
"vm_test_ds.assign_predictions(model=vm_rf_model)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -509,7 +509,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "from sklearn import metrics\n", @@ -523,9 +525,7 @@ " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", ")\n", "cm_display.plot()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -544,7 +544,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", "def confusion_matrix(dataset, model):\n", @@ -572,9 +574,7 @@ " plt.close() # close the plot to avoid displaying it\n", "\n", " return cm_display.figure_ # return the figure object itself" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -585,7 +585,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Champion train and test\n", "vm.tests.run_test(\n", @@ -595,13 +597,13 @@ " \"model\" : [vm_log_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Challenger train and test\n", "vm.tests.run_test(\n", @@ -611,9 +613,7 @@ " \"model\" : [vm_rf_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -637,7 +637,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", "def confusion_matrix(dataset, model, normalize=False):\n", @@ -668,9 +670,7 @@ " plt.close() # close the plot to avoid displaying it\n", "\n", " return cm_display.figure_ # return the figure object itself" - ], - "execution_count": null, - "outputs": [] + ] }, { 
"cell_type": "markdown", @@ -690,7 +690,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Champion with test dataset and normalize=True\n", "vm.tests.run_test(\n", @@ -701,13 +703,13 @@ " },\n", " params={\"normalize\": True}\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Challenger with test dataset and normalize=True\n", "vm.tests.run_test(\n", @@ -718,9 +720,7 @@ " },\n", " params={\"normalize\": True}\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -756,7 +756,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "tests_folder = \"my_tests\"\n", "\n", @@ -770,9 +772,7 @@ " # remove files and pycache\n", " if f.endswith(\".py\") or f == \"__pycache__\":\n", " os.system(f\"rm -rf {tests_folder}/{f}\")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -809,16 +809,16 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "confusion_matrix.save(\n", " # Save it to the custom tests folder we created\n", " tests_folder,\n", " imports=[\"import matplotlib.pyplot as plt\", \"from sklearn import metrics\"],\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -873,7 +873,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "from validmind.tests import LocalTestProvider\n", "\n", @@ -886,9 +888,7 @@ ")\n", "# `my_test_provider.load_test()` will be called for any test ID that starts with `my_test_provider`\n", "# e.g. 
`my_test_provider.ConfusionMatrix` will look for a function named `ConfusionMatrix` in `my_tests/ConfusionMatrix.py` file" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -906,7 +906,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Champion with test dataset and test provider custom test\n", "vm.tests.run_test(\n", @@ -916,13 +918,13 @@ " \"model\" : [vm_log_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Challenger with test dataset and test provider custom test\n", "vm.tests.run_test(\n", @@ -932,9 +934,7 @@ " \"model\" : [vm_rf_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -951,7 +951,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "test_config = {\n", " # Run with the raw dataset\n", @@ -1061,9 +1063,7 @@ " 'params': {'min_threshold': 0.5}\n", " }\n", "}" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -1074,7 +1074,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "for t in test_config:\n", " print(t)\n", @@ -1094,9 +1096,7 @@ " vm.tests.run_test(t, inputs=test_config[t]['inputs']).log()\n", " except Exception as e:\n", " print(f\"Error running test {t}: {str(e)}\")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -1149,7 +1149,7 @@ "\n", "- **Adding risk assessment notes:** Click under **Risk Assessment Notes** in any validation report section to access the text editor and content editing toolbar, including an option to generate a draft with AI. Once generated, edit your ValidMind-generated test descriptions to adhere to your organization's requirements. 
(Learn more: [Work with content blocks](https://docs.validmind.ai/guide/documentation/work-with-content-blocks.html#content-editing-toolbar))\n", "\n", - "- **Assessing compliance:** Under the Guideline for any validation report section, click **ASSESSMENT** and select the compliance status from the drop-down menu. (Learn more: [Provide compliance assessments](https://docs.validmind.ai/guide/validation/assess-compliance.html#provide-compliance-assessments))\n", + "- **Assessing compliance:** Under the Guideline for any validation report section, click **ASSESSMENT** and select the compliance status from the drop-down menu. (Learn more: [Assess compliance](https://docs.validmind.ai/guide/validation/assess-compliance.html#assign-compliance-assessments))\n", "\n", "- **Collaborate with other stakeholders:** Use the ValidMind Platform's real-time collaborative features to work seamlessly together with the rest of your organization, including developers. Propose suggested changes in the documentation, work with versioned history, and use comments to discuss specific portions of the documentation. 
(Learn more: [Collaborate with others](https://docs.validmind.ai/guide/documentation/collaborate-with-others.html))\n", "\n", From 5591d3f357bf9fb25643f665f52a199f81b3ee84 Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Thu, 21 May 2026 12:14:40 -0700 Subject: [PATCH 05/13] oops --- .../qualitative_text_generation.ipynb | 334 +++++++++--------- 1 file changed, 167 insertions(+), 167 deletions(-) diff --git a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb index 7c86798e7..4b9572b5e 100644 --- a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb +++ b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb @@ -2,6 +2,7 @@ "cells": [ { "cell_type": "markdown", + "id": "9a900020", "metadata": {}, "source": [ "# Generate qualitative text with the ValidMind library\n", @@ -9,11 +10,11 @@ "This notebook shows how to generate qualitative documentation content directly from the ValidMind library using both `vm.run_text_generation()` and `vm.generate_documentation_text()`. Instead of switching to the UI to write text manually or trigger generation one section at a time, you can generate content for documentation text blocks programmatically from within a notebook and log it back to the corresponding sections of the model document.\n", "\n", "After building an example model and documenting its quantitative results, we’ll show how to generate text for individual content blocks, customize the output with prompts, control the context used for generation, and use a configuration-driven workflow to populate multiple qualitative sections across the document. By the end, you’ll have an end-to-end example of how quantitative test results and AI-generated qualitative content can work together to populate a full model document from Python, giving you a more automated documentation workflow directly in the library." 
- ], - "id": "9a900020" + ] }, { "cell_type": "markdown", + "id": "cd48db57", "metadata": {}, "source": [ "::: {.content-hidden when-format=\"html\"}\n", @@ -59,11 +60,11 @@ "\tmaxLevel=4\n", "\t/jn-toc-notebook-config -->\n", "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ], - "id": "cd48db57" + ] }, { "cell_type": "markdown", + "id": "a67217b3", "metadata": {}, "source": [ "<a id='toc1__'></a>\n", @@ -73,11 +74,11 @@ "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", "\n", "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ], - "id": "a67217b3" + ] }, { "cell_type": "markdown", + "id": "281cfb86", "metadata": {}, "source": [ "<a id='toc1_1__'></a>\n", @@ -87,11 +88,11 @@ "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", "\n", "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
- ], - "id": "281cfb86" + ] }, { "cell_type": "markdown", + "id": "51c11b52", "metadata": {}, "source": [ "<a id='toc1_2__'></a>\n", @@ -103,11 +104,11 @@ "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", "<br></br>\n", "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ], - "id": "51c11b52" + ] }, { "cell_type": "markdown", + "id": "9103cd45", "metadata": {}, "source": [ "<a id='toc1_3__'></a>\n", @@ -126,7 +127,7 @@ "\n", "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", - "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [test_suites](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", @@ -142,21 +143,21 @@ "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", "\n", "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ], - "id": "9103cd45" + ] }, { "cell_type": "markdown", + "id": "23020a1b", "metadata": {}, "source": [ "<a id='toc2__'></a>\n", "\n", "## Setting up" - ], - "id": "23020a1b" + ] }, { "cell_type": "markdown", + "id": "6202d6dc", "metadata": {}, "source": [ "<a id='toc2_1__'></a>\n", @@ -168,31 +169,31 @@ "Python 3.8 <= x <= 3.14</div>\n", "\n", "To install the library:" - ], - "id": "6202d6dc" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "045b05a6", "metadata": {}, + "outputs": [], "source": [ "%pip install -q validmind" - ], - "execution_count": null, - "outputs": [], - "id": "045b05a6" + ] }, { "cell_type": "markdown", + "id": "b3231d8e", "metadata": {}, "source": [ "<a id='toc2_2__'></a>\n", "\n", "### Initialize the ValidMind Library" - ], - "id": "b3231d8e" + ] }, { "cell_type": "markdown", + "id": "56592217", "metadata": {}, "source": [ "<a id='toc2_2_1__'></a>\n", @@ -212,11 +213,11 @@ "5. Select your own name under the **RECORD OWNER** drop-down.\n", "\n", "6. Click **Register Model** to add the model to your inventory." - ], - "id": "56592217" + ] }, { "cell_type": "markdown", + "id": "43ed3d0c", "metadata": {}, "source": [ "<a id='toc2_2_2__'></a>\n", @@ -232,11 +233,11 @@ "2. Under **TEMPLATE**, select `Binary classification`.\n", "\n", "3. Click **Use Template** to apply the template." 
- ], - "id": "43ed3d0c" + ] }, { "cell_type": "markdown", + "id": "9b9203be", "metadata": {}, "source": [ "<a id='toc2_2_3__'></a>\n", @@ -250,12 +251,14 @@ "2. Click **Copy snippet to clipboard**.\n", "\n", "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ], - "id": "9b9203be" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "690dc368", "metadata": {}, + "outputs": [], "source": [ "# Load your model identifier credentials from an `.env` file\n", "\n", @@ -273,13 +276,11 @@ " document=\"documentation\", # requires library >=2.12.0\n", " model=\"..\",\n", ")" - ], - "execution_count": null, - "outputs": [], - "id": "690dc368" + ] }, { "cell_type": "markdown", + "id": "a68f6031", "metadata": {}, "source": [ "<a id='toc2_3__'></a>\n", @@ -290,33 +291,33 @@ "\n", "- Import **Extreme Gradient Boosting** (XGBoost) with an alias so that we can reference its functions in later calls. XGBoost is a powerful machine learning library designed for speed and performance, especially in handling structured or tabular data.\n", "- Enable **`matplotlib`**, a plotting library used for visualizing data. Ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window." 
- ], - "id": "a68f6031" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "3fa2d9de", "metadata": {}, + "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import xgboost as xgb" - ], - "execution_count": null, - "outputs": [], - "id": "3fa2d9de" + ] }, { "cell_type": "markdown", + "id": "69a37995", "metadata": {}, "source": [ "<a id='toc3__'></a>\n", "\n", "## Getting to know ValidMind" - ], - "id": "69a37995" + ] }, { "cell_type": "markdown", + "id": "40c9eb24", "metadata": {}, "source": [ "<a id='toc3_1__'></a>\n", @@ -326,21 +327,21 @@ "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", "\n", "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ], - "id": "40c9eb24" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "62842e84", "metadata": {}, + "outputs": [], "source": [ "vm.preview_template()" - ], - "execution_count": null, - "outputs": [], - "id": "62842e84" + ] }, { "cell_type": "markdown", + "id": "6fab1c1c", "metadata": {}, "source": [ "<a id='toc3_2__'></a>\n", @@ -354,21 +355,21 @@ "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", "\n", "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." 
- ], - "id": "6fab1c1c" + ] }, { "cell_type": "markdown", + "id": "606d932b", "metadata": {}, "source": [ "<a id='toc4__'></a>\n", "\n", "## Build the example model" - ], - "id": "606d932b" + ] }, { "cell_type": "markdown", + "id": "3d7ad25a", "metadata": {}, "source": [ "<a id='toc4_1__'></a>\n", @@ -381,12 +382,14 @@ "\n", "- The target column, `Exited` has a value of `1` when a customer has churned and `0` otherwise.\n", "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas Dataframe is a two-dimensional tabular data structure that makes use of rows and columns." - ], - "id": "3d7ad25a" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "8ea8188e", "metadata": {}, + "outputs": [], "source": [ "from validmind.datasets.classification import customer_churn\n", "\n", @@ -396,13 +399,11 @@ "\n", "raw_df = customer_churn.load_data()\n", "raw_df.head()" - ], - "execution_count": null, - "outputs": [], - "id": "8ea8188e" + ] }, { "cell_type": "markdown", + "id": "a5ceef72", "metadata": {}, "source": [ "<a id='toc4_2__'></a>\n", @@ -410,12 +411,14 @@ "### Preprocessing the raw dataset\n", "\n", "In this section, we preprocess the raw dataset so it is ready for model training and validation. This includes splitting the data into training, validation, and test subsets to support both model fitting and evaluation on unseen data, and then separating each subset into input features and target labels so the model can learn from customer attributes and predict whether a customer churned." 
- ], - "id": "a5ceef72" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "9d2bec58", "metadata": {}, + "outputs": [], "source": [ "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", "\n", @@ -423,13 +426,11 @@ "y_train = train_df[customer_churn.target_column]\n", "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", "y_val = validation_df[customer_churn.target_column]" - ], - "execution_count": null, - "outputs": [], - "id": "9d2bec58" + ] }, { "cell_type": "markdown", + "id": "3b9edacf", "metadata": {}, "source": [ "<a id='toc4_3__'></a>\n", @@ -437,12 +438,14 @@ "### Training an XGBoost classifier model\n", "\n", "In this section, we train an XGBoost classifier to predict customer churn, using early stopping to halt training if performance does not improve after 10 rounds and reduce unnecessary fitting. We configure the model to evaluate performance with three complementary metrics: error for incorrect predictions, logloss for prediction confidence, and auc for class separation. The model is trained on the training split and evaluated against the validation split during fitting, while verbose=False keeps the training output concise." - ], - "id": "3b9edacf" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "658447fc", "metadata": {}, + "outputs": [], "source": [ "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", "\n", @@ -456,13 +459,11 @@ " eval_set=[(x_val, y_val)],\n", " verbose=False,\n", ")" - ], - "execution_count": null, - "outputs": [], - "id": "658447fc" + ] }, { "cell_type": "markdown", + "id": "c2a6b492", "metadata": {}, "source": [ "<a id='toc5__'></a>\n", @@ -470,12 +471,14 @@ "## Initialize the ValidMind inputs\n", "\n", "We begin by registering the datasets and trained model as ValidMind inputs so they can be referenced consistently throughout the documentation workflow. 
For the datasets, this means creating ValidMind Dataset objects for the raw, training, and testing data, each with a unique `input_id` for traceability. Where needed, we also provide supporting metadata such as the target column and class labels so tests can interpret the data correctly." - ], - "id": "c2a6b492" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "081548ae", "metadata": {}, + "outputs": [], "source": [ "# Initialize the raw dataset\n", "vm_raw_dataset = vm.init_dataset(\n", @@ -498,13 +501,11 @@ " input_id=\"test_dataset\",\n", " target_column=customer_churn.target_column\n", ")" - ], - "execution_count": null, - "outputs": [], - "id": "081548ae" + ] }, { "cell_type": "markdown", + "id": "1ebfda19", "metadata": {}, "source": [ "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", @@ -513,34 +514,36 @@ "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", "\n", "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ], - "id": "1ebfda19" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "6cc5aff8", "metadata": {}, + "outputs": [], "source": [ "# Initialize the model\n", "vm_model = vm.init_model(\n", " model,\n", " input_id=\"model\",\n", ")" - ], - "execution_count": null, - "outputs": [], - "id": "6cc5aff8" + ] }, { "cell_type": "markdown", + "id": "48d23cf8", "metadata": {}, "source": [ "Finally, we assign predictions from the trained model to the training and testing datasets. 
The `assign_predictions()` method links predicted classes and probabilities to each dataset, and can also compute predictions automatically if they are not passed explicitly. This step is what allows ValidMind to run performance and diagnostic tests using the model outputs." - ], - "id": "48d23cf8" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "922baa9d", "metadata": {}, + "outputs": [], "source": [ "vm_train_ds.assign_predictions(\n", " model=vm_model,\n", @@ -548,13 +551,11 @@ "vm_test_ds.assign_predictions(\n", " model=vm_model,\n", ")" - ], - "execution_count": null, - "outputs": [], - "id": "922baa9d" + ] }, { "cell_type": "markdown", + "id": "7c9a174d", "metadata": {}, "source": [ "<a id='toc6__'></a>\n", @@ -564,42 +565,42 @@ "In this section, we run the documentation tests defined by the applied template to populate the quantitative parts of the model documentation. The `vm.run_documentation_tests()` function discovers each test-driven block in the template, executes the corresponding tests, and uploads the resulting artifacts to the ValidMind Platform.\n", "\n", "To run the full suite successfully, ValidMind needs to know which model and dataset inputs should be used for each test. This can be done with a shared `inputs` argument when all tests use the same objects, or with a `config` dictionary when individual tests require specific inputs or parameters. In this example, we use the default test parameters and provide the input configuration needed for the demo model." 
- ], - "id": "7c9a174d" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "47f7e709", "metadata": {}, + "outputs": [], "source": [ "from validmind.utils import preview_test_config\n", "\n", "test_config = customer_churn.get_demo_test_config()\n", "preview_test_config(test_config)" - ], - "execution_count": null, - "outputs": [], - "id": "47f7e709" + ] }, { "cell_type": "markdown", + "id": "3f22d37b", "metadata": {}, "source": [ "Once the configuration is prepared, we pass it to `vm.run_documentation_tests()` and execute the full suite. The returned `full_suite` object contains the test results and represents the quantitative documentation that has been generated for the model." - ], - "id": "3f22d37b" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "999be7fe", "metadata": {}, + "outputs": [], "source": [ "full_suite = vm.run_documentation_tests(config=test_config)" - ], - "execution_count": null, - "outputs": [], - "id": "999be7fe" + ] }, { "cell_type": "markdown", + "id": "5d531744", "metadata": {}, "source": [ "<a id='toc7__'></a>\n", @@ -609,11 +610,11 @@ "In addition to documenting quantitative results through tests, ValidMind now supports programmatic generation of qualitative content for the text blocks in a model documentation template through `vm.run_text_generation()`. This function allows you to generate AI-assisted text for a specific content block directly from a notebook and then log it back to the corresponding section of the document. As a result, you can populate qualitative sections without switching to the UI to write text manually or trigger generation one section at a time.\n", "\n", "In the next sections, we’ll walk through the main ways to use this functionality. 
We’ll start by generating text for a single content block with the default behavior, then show how to customize the output with a prompt, how to control the context used for generation by selecting specific sections, and finally how to scale the same pattern across all text blocks in the document." - ], - "id": "5d531744" + ] }, { "cell_type": "markdown", + "id": "899c8553", "metadata": {}, "source": [ "<a id='toc7_1__'></a>\n", @@ -621,33 +622,33 @@ "### Generate text for a single content block\n", "\n", "First, we’ll use `vm.run_text_generation()` to generate qualitative text for a single documentation block. By providing a `content_id`, you can target the exact text placeholder you want to populate and let ValidMind generate content using the current document context. The helper `vm.get_content_ids()` is useful for inspecting which content blocks are available in the active template, making it easier to identify the IDs you can use when generating and logging text programmatically." - ], - "id": "899c8553" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "85cc552f", "metadata": {}, + "outputs": [], "source": [ "vm.get_content_ids()" - ], - "execution_count": null, - "outputs": [], - "id": "85cc552f" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "26fcddf9", "metadata": {}, + "outputs": [], "source": [ "vm.run_text_generation(\n", " content_id=\"dataset_summary_text\",\n", ").log()" - ], - "execution_count": null, - "outputs": [], - "id": "26fcddf9" + ] }, { "cell_type": "markdown", + "id": "caff6490", "metadata": {}, "source": [ "<a id='toc7_2__'></a>\n", @@ -655,12 +656,14 @@ "### Customize the prompt\n", "\n", "Next, we’ll customize the generated output by passing a `prompt` to `vm.run_text_generation()`. This makes it possible to guide not just the subject of the generated text, but also its structure, tone, level of detail, and presentation format. 
In practice, this allows you to tailor the output for different documentation needs, such as producing a short narrative summary, a more structured section, or content written for a specific audience, while still relying on the same underlying document context for generation." - ], - "id": "caff6490" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "52165b98", "metadata": {}, + "outputs": [], "source": [ "prompt = \"\"\"\n", "Use exactly this structure:\n", @@ -684,26 +687,24 @@ "<h3>Overall Assessment</h3>\n", "<p>End with a short balanced conclusion on the dataset's suitability for model development and evaluation.</p>\n", "\"\"\"" - ], - "execution_count": null, - "outputs": [], - "id": "52165b98" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "fbf10ad9", "metadata": {}, + "outputs": [], "source": [ "vm.run_text_generation(\n", " content_id=\"dataset_summary_text\",\n", " prompt=prompt,\n", ").log()" - ], - "execution_count": null, - "outputs": [], - "id": "fbf10ad9" + ] }, { "cell_type": "markdown", + "id": "99a0740e", "metadata": {}, "source": [ "<a id='toc7_3__'></a>\n", @@ -711,34 +712,34 @@ "### Pass section-specific context\n", "\n", "Then, we’ll control the `context` used for generation by passing a selected set of content IDs to `vm.run_text_generation()`. Rather than relying on the full document, this lets you focus the model on the most relevant parts of the documentation for a given text block. In practice, that means you can generate more targeted qualitative content by choosing which existing test and text blocks should inform the output." 
- ], - "id": "99a0740e" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "43cf0e7d", "metadata": {}, + "outputs": [], "source": [ "vm.get_content_ids(\"data_description\")" - ], - "execution_count": null, - "outputs": [], - "id": "43cf0e7d" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "1e1a919e", "metadata": {}, + "outputs": [], "source": [ "vm.run_text_generation(\n", " content_id=\"dataset_summary_text\",\n", " context={\"content_ids\": vm.get_content_ids(\"data_description\")},\n", ").log()" - ], - "execution_count": null, - "outputs": [], - "id": "1e1a919e" + ] }, { "cell_type": "markdown", + "id": "701a0323", "metadata": {}, "source": [ "<a id='toc7_4__'></a>\n", @@ -746,24 +747,24 @@ "### Append a new text block to a section\n", "\n", "Sometimes you may want to generate text for a `content_id` that is not already defined in the template. In that case, you can still generate the text with `vm.run_text_generation()` and then use `.log(section_id=...)` to tell ValidMind where that new text block should be placed in the document. " - ], - "id": "701a0323" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "6a9ba924", "metadata": {}, + "outputs": [], "source": [ "vm.run_text_generation(\n", " content_id=\"intended_use\",\n", " section_id=\"intended_use\",\n", ").log()" - ], - "execution_count": null, - "outputs": [], - "id": "6a9ba924" + ] }, { "cell_type": "markdown", + "id": "6e032b79", "metadata": {}, "source": [ "<a id='toc7_5__'></a>\n", @@ -791,32 +792,32 @@ " ```\n", "\n", " Each `<content-id>` represents a documentation text block to populate. Use `section_id` when the block should be inserted into a specific section, `prompt` when you want to shape the output more explicitly, and `context.content_ids` when you want the generation step to focus on selected parts of the document. 
In this notebook, `text_config` comes from `customer_churn.get_demo_text_config()`, which provides the demo setup for the customer churn example." - ], - "id": "6e032b79" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "a97bb129", "metadata": {}, + "outputs": [], "source": [ "text_config = customer_churn.get_demo_text_config()\n", "preview_test_config(text_config)" - ], - "execution_count": null, - "outputs": [], - "id": "a97bb129" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "aff42702", "metadata": {}, + "outputs": [], "source": [ "results = vm.generate_documentation_text(config=text_config)" - ], - "execution_count": null, - "outputs": [], - "id": "aff42702" + ] }, { "cell_type": "markdown", + "id": "03b6b875", "metadata": {}, "source": [ "<a id='toc8__'></a>\n", @@ -831,11 +832,11 @@ "- [x] Customize generated output by passing a prompt\n", "- [x] Control generation context by selecting specific sections of the document\n", "- [x] Use a configuration-driven workflow to generate qualitative content across the document with `vm.generate_documentation_text()`" - ], - "id": "03b6b875" + ] }, { "cell_type": "markdown", + "id": "3db3c328", "metadata": {}, "source": [ "<a id='toc9__'></a>\n", @@ -843,11 +844,11 @@ "## Next steps\n", "\n", "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." - ], - "id": "3db3c328" + ] }, { "cell_type": "markdown", + "id": "d7bd8df8", "metadata": {}, "source": [ "<a id='toc9_1__'></a>\n", @@ -859,11 +860,11 @@ "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", "\n", "What you see is the full draft of your documentation in a more easily consumable version. 
From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" - ], - "id": "d7bd8df8" + ] }, { "cell_type": "markdown", + "id": "c0951457", "metadata": {}, "source": [ "<a id='toc9_2__'></a>\n", @@ -882,11 +883,11 @@ "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", "\n", "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ], - "id": "c0951457" + ] }, { "cell_type": "markdown", + "id": "24532182", "metadata": {}, "source": [ "<a id='toc10__'></a>\n", @@ -896,21 +897,21 @@ "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", "\n", "Retrieve the information for the currently installed version of ValidMind:" - ], - "id": "24532182" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "2e796c43", "metadata": {}, + "outputs": [], "source": [ "%pip show validmind" - ], - "execution_count": null, - "outputs": [], - "id": "2e796c43" + ] }, { "cell_type": "markdown", + "id": "713a6722", "metadata": {}, "source": [ "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", @@ -918,19 +919,19 @@ "```bash\n", "%pip install --upgrade validmind\n", "```" - ], - "id": "713a6722" + ] }, { "cell_type": "markdown", + "id": "84a65def", "metadata": {}, "source": [ "You may need to restart 
your kernel after running the upgrade package for changes to be applied." - ], - "id": "84a65def" + ] }, { "cell_type": "markdown", + "id": "copyright-18d82030e09942c4953248e9bf432249", "metadata": {}, "source": [ "<!-- VALIDMIND COPYRIGHT -->\n", @@ -942,8 +943,7 @@ "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ], - "id": "copyright-18d82030e09942c4953248e9bf432249" + ] } ], "metadata": { From e5caed25ad477ec666fd3058f12287479a890e17 Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Mon, 25 May 2026 10:48:33 -0700 Subject: [PATCH 06/13] Updating key concepts for SR 26-2 --- notebooks/code_sharing/r/r_custom_tests.Rmd | 2 +- .../dataset_inputs/configure_dataset_features.ipynb | 2 +- .../dataset_inputs/load_datasets_predictions.ipynb | 2 +- notebooks/how_to/metrics/log_metrics_over_time.ipynb | 2 +- .../how_to/qualitative_text/qualitative_text_generation.ipynb | 2 +- .../how_to/tests/custom_tests/implement_custom_tests.ipynb | 2 +- notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb | 2 +- notebooks/how_to/tests/explore_tests/explore_tests.ipynb | 2 +- .../how_to/tests/run_tests/1-run_dataset-based_tests.ipynb | 2 +- notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb | 2 +- .../tests/run_tests/configure_tests/enable_pii_detection.ipynb | 2 +- .../run_tests_that_require_multiple_datasets.ipynb | 2 +- .../document_multiple_results_for_the_same_test.ipynb | 2 +- .../documentation_tests/run_documentation_sections.ipynb | 2 +- .../run_documentation_tests_with_config.ipynb | 2 +- notebooks/quickstart/quickstart_documentation.ipynb | 2 +- notebooks/quickstart/quickstart_validation.ipynb | 2 +- .../templates/about-validmind/_about-validmind-developers.ipynb | 2 +- 
.../templates/about-validmind/_about-validmind-monitoring.ipynb | 2 +- .../templates/about-validmind/_about-validmind-validators.ipynb | 2 +- notebooks/tutorials/development/1-set_up_validmind.ipynb | 2 +- .../validation/1-set_up_validmind_for_validation.ipynb | 2 +- notebooks/use_cases/agents/document_agentic_ai.ipynb | 2 +- .../capital_markets/quickstart_option_pricing_models.ipynb | 2 +- .../quickstart_option_pricing_models_quantlib.ipynb | 2 +- .../code_explainer/quickstart_code_explainer_demo.ipynb | 2 +- .../use_cases/credit_risk/application_scorecard_executive.ipynb | 2 +- .../credit_risk/application_scorecard_full_suite.ipynb | 2 +- .../use_cases/credit_risk/application_scorecard_with_bias.ipynb | 2 +- .../use_cases/credit_risk/application_scorecard_with_ml.ipynb | 2 +- .../credit_risk/document_excel_application_scorecard.ipynb | 2 +- notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb | 2 +- .../application_scorecard_ongoing_monitoring.ipynb | 2 +- .../quickstart_customer_churn_ongoing_monitoring.ipynb | 2 +- .../time_series/quickstart_time_series_full_suite.ipynb | 2 +- .../time_series/quickstart_time_series_high_code.ipynb | 2 +- .../use_cases/validation/validate_application_scorecard.ipynb | 2 +- 37 files changed, 37 insertions(+), 37 deletions(-) diff --git a/notebooks/code_sharing/r/r_custom_tests.Rmd b/notebooks/code_sharing/r/r_custom_tests.Rmd index cb09c28f9..4ba19e3c0 100644 --- a/notebooks/code_sharing/r/r_custom_tests.Rmd +++ b/notebooks/code_sharing/r/r_custom_tests.Rmd @@ -43,7 +43,7 @@ Signing up is FREE — <a href="https://docs.validmind.ai/guide/access/register- **record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management. 
-**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a "quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates." Within ValidMind, a model is type of record tracked in the inventory. +**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a "complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates." Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory. **documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application. diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb index ede3bdfb7..79ff3f066 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb @@ -82,7 +82,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb index 02f339f24..99f792370 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb @@ -102,7 +102,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/metrics/log_metrics_over_time.ipynb b/notebooks/how_to/metrics/log_metrics_over_time.ipynb index c1465096b..0de529828 100644 --- a/notebooks/how_to/metrics/log_metrics_over_time.ipynb +++ b/notebooks/how_to/metrics/log_metrics_over_time.ipynb @@ -99,7 +99,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, 
economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb index 4b9572b5e..1791a4caa 100644 --- a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb +++ b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb @@ -117,7 +117,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb index 694296d08..b1ec3ee5a 100644 --- a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb +++ b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb @@ -94,7 +94,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb index 61591131e..1402a144b 100644 --- a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb @@ -74,7 +74,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, 
or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb index 8df3dd849..ba4214154 100644 --- a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb @@ -75,7 +75,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb index 2baaa881d..7b10d32e3 100644 --- a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb @@ -107,7 +107,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb index 6b6e01a44..07107e7e8 100644 --- a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb @@ -119,7 +119,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, 
or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb index 210daf5df..62eebe37e 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb @@ -110,7 +110,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb index d6d1fa2da..2acdef252 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb @@ -92,7 +92,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb index 6cd3967a7..3b168aae4 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb @@ -99,7 +99,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb index 4724da3ab..65c989162 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb @@ -90,7 +90,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb index bd01a1439..513b9ddcf 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb @@ -94,7 +94,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/quickstart/quickstart_documentation.ipynb b/notebooks/quickstart/quickstart_documentation.ipynb index 223f29671..0e4fc4879 100644 --- a/notebooks/quickstart/quickstart_documentation.ipynb +++ b/notebooks/quickstart/quickstart_documentation.ipynb @@ -141,7 +141,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/quickstart/quickstart_validation.ipynb b/notebooks/quickstart/quickstart_validation.ipynb index 0f262dc39..b5406154d 100644 --- a/notebooks/quickstart/quickstart_validation.ipynb +++ b/notebooks/quickstart/quickstart_validation.ipynb @@ -147,7 +147,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or 
financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb index 5bed58220..1ccc6d26b 100644 --- a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb @@ -46,7 +46,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb index e604d49db..60b227ad9 100644 --- a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb @@ -46,7 +46,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record (such as a model)'s performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb index 5b646da58..6662bb51b 100644 --- a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb @@ -46,7 +46,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or 
financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", diff --git a/notebooks/tutorials/development/1-set_up_validmind.ipynb b/notebooks/tutorials/development/1-set_up_validmind.ipynb index 2396ea5cf..20c994c2e 100644 --- a/notebooks/tutorials/development/1-set_up_validmind.ipynb +++ b/notebooks/tutorials/development/1-set_up_validmind.ipynb @@ -125,7 +125,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb index 0f95fc0b4..cbf31d91a 100644 --- a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb +++ b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb @@ -127,7 +127,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", diff --git a/notebooks/use_cases/agents/document_agentic_ai.ipynb b/notebooks/use_cases/agents/document_agentic_ai.ipynb index ebce87e8b..3f7f5341c 100644 --- a/notebooks/use_cases/agents/document_agentic_ai.ipynb +++ b/notebooks/use_cases/agents/document_agentic_ai.ipynb @@ -148,7 +148,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb index a007317e7..c806980d1 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb @@ -105,7 +105,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb index 2057d819c..919ccc1c0 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb @@ -144,7 +144,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb index e5f487771..8019da97a 100644 --- a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb +++ b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb @@ -108,7 +108,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb index 6584ceb06..381274c00 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb @@ -87,7 +87,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb index bd4ade621..55b3e89fa 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb @@ -101,7 +101,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb index 75b2d030a..1480e3077 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb @@ -102,7 +102,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb index 909cb355e..ad3bccddd 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb @@ -114,7 +114,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb index c8854999b..729cd4555 100644 --- a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb +++ b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb @@ -98,7 +98,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb index a7f7f2256..4207cf0c3 100644 --- a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb +++ b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb @@ -83,7 +83,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb index 78c8a4da4..740cd0a82 100644 --- a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb @@ -94,7 +94,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record (such as a model)'s performance and compliance over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb index a5ea5e9a6..b92c3c45f 100644 --- a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb @@ -92,7 +92,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record (such as a model)'s performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb index 5aa9cee23..a311b525b 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb @@ -93,7 +93,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, 
economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb index e261808c1..834337921 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb @@ -94,7 +94,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/validation/validate_application_scorecard.ipynb b/notebooks/use_cases/validation/validate_application_scorecard.ipynb index d9f0bd1e6..76bc6b3d8 100644 --- a/notebooks/use_cases/validation/validate_application_scorecard.ipynb +++ b/notebooks/use_cases/validation/validate_application_scorecard.ipynb @@ -139,7 +139,7 @@ "\n", "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 11-7](https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm) defines a model as a \"quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.\" Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", From 52801405f4bd1173146a246d77dd667bf13e732d Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Tue, 26 May 2026 11:38:50 -0700 Subject: [PATCH 07/13] Edit #1 --- notebooks/code_sharing/r/r_custom_tests.Rmd | 4 +- .../configure_dataset_features.ipynb | 50 +++++++++---------- .../load_datasets_predictions.ipynb | 4 +- .../metrics/log_metrics_over_time.ipynb | 4 +- .../qualitative_text_generation.ipynb | 4 +- .../custom_tests/implement_custom_tests.ipynb | 4 +- .../explore_tests/explore_test_suites.ipynb | 4 +- .../tests/explore_tests/explore_tests.ipynb | 4 +- .../run_tests/1-run_dataset-based_tests.ipynb | 4 +- .../run_tests/2-run_comparison_tests.ipynb | 4 +- .../enable_pii_detection.ipynb | 4 +- ...tests_that_require_multiple_datasets.ipynb | 4 +- ...t_multiple_results_for_the_same_test.ipynb | 4 +- .../run_documentation_sections.ipynb | 4 +- .../run_documentation_tests_with_config.ipynb | 4 +- .../quickstart/quickstart_documentation.ipynb | 4 +- .../quickstart/quickstart_validation.ipynb | 4 +- .../_about-validmind-developers.ipynb | 4 +- .../_about-validmind-monitoring.ipynb | 4 +- .../_about-validmind-validators.ipynb | 4 +- .../development/1-set_up_validmind.ipynb | 4 +- .../1-set_up_validmind_for_validation.ipynb | 4 +- .../agents/document_agentic_ai.ipynb | 4 +- 
.../quickstart_option_pricing_models.ipynb | 4 +- ...start_option_pricing_models_quantlib.ipynb | 4 +- .../quickstart_code_explainer_demo.ipynb | 4 +- .../application_scorecard_executive.ipynb | 4 +- .../application_scorecard_full_suite.ipynb | 4 +- .../application_scorecard_with_bias.ipynb | 4 +- .../application_scorecard_with_ml.ipynb | 4 +- ...document_excel_application_scorecard.ipynb | 4 +- .../nlp_and_llm/prompt_validation_demo.ipynb | 4 +- ...ication_scorecard_ongoing_monitoring.ipynb | 4 +- ...rt_customer_churn_ongoing_monitoring.ipynb | 4 +- .../quickstart_time_series_full_suite.ipynb | 4 +- .../quickstart_time_series_high_code.ipynb | 4 +- .../validate_application_scorecard.ipynb | 4 +- 37 files changed, 97 insertions(+), 97 deletions(-) diff --git a/notebooks/code_sharing/r/r_custom_tests.Rmd b/notebooks/code_sharing/r/r_custom_tests.Rmd index 4ba19e3c0..91a74a2ae 100644 --- a/notebooks/code_sharing/r/r_custom_tests.Rmd +++ b/notebooks/code_sharing/r/r_custom_tests.Rmd @@ -41,9 +41,9 @@ Signing up is FREE — <a href="https://docs.validmind.ai/guide/access/register- ### Key concepts -**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management. +**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management. 
-**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a "complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates." Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory. +**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a "complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates." Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory. **documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application. 
diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb index 79ff3f066..ff579a44d 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb @@ -80,9 +80,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", @@ -132,12 +132,12 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "%pip install -q validmind" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -209,7 +209,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Load your model identifier credentials from an `.env` file\n", "\n", @@ -227,9 +229,7 @@ " # model=\"...\",\n", " document=\"documentation\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -242,7 +242,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "%matplotlib inline\n", "\n", @@ -254,9 +256,7 @@ "# from validmind.datasets.classification import taiwan_credit as 
demo_dataset\n", "\n", "df = demo_dataset.load_data()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -278,7 +278,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "feature_columns = [\n", " \"CreditScore\",\n", @@ -297,9 +299,7 @@ " target_column=demo_dataset.target_column,\n", " feature_columns=feature_columns,\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -328,7 +328,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "vm_dataset = vm.init_dataset(\n", " dataset=df,\n", @@ -340,9 +342,7 @@ " test_id=\"validmind.data_validation.DescriptiveStatistics\",\n", " inputs={\"dataset\": vm_dataset},\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -353,7 +353,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "vm_dataset = vm.init_dataset(\n", " dataset=df,\n", @@ -366,9 +368,7 @@ " test_id=\"validmind.data_validation.DescriptiveStatistics\",\n", " inputs={\"dataset\": vm_dataset},\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -418,12 +418,12 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "%pip show validmind" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -445,6 +445,7 @@ }, { "cell_type": "markdown", + "id": "copyright-32870f8bce7f4ed0903136a69d02b421", "metadata": {}, "source": [ "<!-- VALIDMIND COPYRIGHT -->\n", @@ -456,8 +457,7 @@ "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ], - "id": "copyright-32870f8bce7f4ed0903136a69d02b421" + ] } ], "metadata": { diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb index 99f792370..a8ffd38c4 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb @@ -100,9 +100,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/metrics/log_metrics_over_time.ipynb b/notebooks/how_to/metrics/log_metrics_over_time.ipynb index 0de529828..92d90bdb4 100644 --- a/notebooks/how_to/metrics/log_metrics_over_time.ipynb +++ b/notebooks/how_to/metrics/log_metrics_over_time.ipynb @@ -97,9 +97,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb index 1791a4caa..85e581b1e 100644 --- a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb +++ b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb @@ -115,9 +115,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb index b1ec3ee5a..473549bad 100644 --- a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb +++ b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb @@ -92,9 +92,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb index 1402a144b..712d52f2c 100644 --- a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb @@ -72,9 +72,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb index ba4214154..70d481f9e 100644 --- a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb @@ -73,9 +73,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb index 7b10d32e3..0f4b30e5a 100644 --- a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb @@ -105,9 +105,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb index 07107e7e8..da1c0683c 100644 --- a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb @@ -117,9 +117,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb index 62eebe37e..8b4be0edf 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb @@ -108,9 +108,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb index 2acdef252..74c23f684 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb @@ -90,9 +90,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb index 3b168aae4..cebe8ce51 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb @@ -97,9 +97,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb index 65c989162..21782fbab 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb @@ -88,9 +88,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb index 513b9ddcf..38c9c842d 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb @@ -92,9 +92,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/quickstart/quickstart_documentation.ipynb b/notebooks/quickstart/quickstart_documentation.ipynb index 0e4fc4879..7f1e18448 100644 --- a/notebooks/quickstart/quickstart_documentation.ipynb +++ b/notebooks/quickstart/quickstart_documentation.ipynb @@ -139,9 +139,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/quickstart/quickstart_validation.ipynb b/notebooks/quickstart/quickstart_validation.ipynb index b5406154d..d0e4690e5 100644 --- a/notebooks/quickstart/quickstart_validation.ipynb +++ b/notebooks/quickstart/quickstart_validation.ipynb @@ -145,9 +145,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb index 1ccc6d26b..de8d55e43 100644 --- a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb @@ -44,9 +44,9 @@ "source": [ "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb index 60b227ad9..ac901be6e 100644 --- a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb @@ -44,9 +44,9 @@ "source": [ "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record (such as a model)'s performance and compliance over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb index 6662bb51b..ff8540af5 100644 --- a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb @@ -44,9 +44,9 @@ "source": [ "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", diff --git a/notebooks/tutorials/development/1-set_up_validmind.ipynb b/notebooks/tutorials/development/1-set_up_validmind.ipynb index 20c994c2e..1a49bf8de 100644 --- a/notebooks/tutorials/development/1-set_up_validmind.ipynb +++ b/notebooks/tutorials/development/1-set_up_validmind.ipynb @@ -123,9 +123,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb index cbf31d91a..dc0239e87 100644 --- a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb +++ b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb @@ -125,9 +125,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", diff --git a/notebooks/use_cases/agents/document_agentic_ai.ipynb b/notebooks/use_cases/agents/document_agentic_ai.ipynb index 3f7f5341c..2b96788f7 100644 --- a/notebooks/use_cases/agents/document_agentic_ai.ipynb +++ b/notebooks/use_cases/agents/document_agentic_ai.ipynb @@ -146,9 +146,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb index c806980d1..f89ae9488 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb @@ -103,9 +103,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb index 919ccc1c0..9faf578af 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb @@ -142,9 +142,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb index 8019da97a..6fd8a2a23 100644 --- a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb +++ b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb @@ -106,9 +106,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb index 381274c00..6303ad2de 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb @@ -85,9 +85,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb index 55b3e89fa..846c83778 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb @@ -99,9 +99,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb index 1480e3077..c5abce597 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb @@ -100,9 +100,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb index ad3bccddd..be8c56510 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb @@ -112,9 +112,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb index 729cd4555..d5352cba9 100644 --- a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb +++ b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb @@ -96,9 +96,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb index 4207cf0c3..adde1f2cb 100644 --- a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb +++ b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb @@ -81,9 +81,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb index 740cd0a82..fa68c0bbb 100644 --- a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb @@ -92,9 +92,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record (such as a model)'s performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb index b92c3c45f..0a96ce80c 100644 --- a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb @@ -90,9 +90,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record (such as a model)'s performance and compliance over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb index a311b525b..25e6c6ce2 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb @@ -91,9 +91,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb index 834337921..e0ee9e1b5 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb @@ -92,9 +92,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", diff --git a/notebooks/use_cases/validation/validate_application_scorecard.ipynb b/notebooks/use_cases/validation/validate_application_scorecard.ipynb index 76bc6b3d8..9c65d1161 100644 --- a/notebooks/use_cases/validation/validate_application_scorecard.ipynb +++ b/notebooks/use_cases/validation/validate_application_scorecard.ipynb @@ -137,9 +137,9 @@ "\n", "### Key concepts\n", "\n", - "**record**: Tools tracked in the ValidMind inventory, such as models. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", "\n", - "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is type of record tracked in the inventory.\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", From 6b5b368577fddf434aa219898bcdaf16811aca0b Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Tue, 26 May 2026 11:51:32 -0700 Subject: [PATCH 08/13] Edit #2 --- notebooks/code_sharing/r/r_custom_tests.Rmd | 10 +++++----- .../dataset_inputs/configure_dataset_features.ipynb | 10 +++++----- .../dataset_inputs/load_datasets_predictions.ipynb | 10 +++++----- .../data_and_datasets/use_dataset_model_objects.ipynb | 6 +++--- notebooks/how_to/metrics/log_metrics_over_time.ipynb | 10 +++++----- .../qualitative_text/qualitative_text_generation.ipynb | 10 +++++----- .../tests/custom_tests/implement_custom_tests.ipynb | 10 +++++----- .../tests/explore_tests/explore_test_suites.ipynb | 10 +++++----- 
.../how_to/tests/explore_tests/explore_tests.ipynb | 10 +++++----- .../tests/run_tests/1-run_dataset-based_tests.ipynb | 10 +++++----- .../tests/run_tests/2-run_comparison_tests.ipynb | 10 +++++----- .../configure_tests/enable_pii_detection.ipynb | 10 +++++----- .../run_tests_that_require_multiple_datasets.ipynb | 10 +++++----- .../document_multiple_results_for_the_same_test.ipynb | 10 +++++----- .../run_documentation_sections.ipynb | 10 +++++----- .../run_documentation_tests_with_config.ipynb | 10 +++++----- notebooks/quickstart/quickstart_documentation.ipynb | 10 +++++----- notebooks/quickstart/quickstart_validation.ipynb | 6 +++--- .../about-validmind/_about-validmind-developers.ipynb | 10 +++++----- .../about-validmind/_about-validmind-monitoring.ipynb | 6 +++--- .../about-validmind/_about-validmind-validators.ipynb | 6 +++--- .../tutorials/development/1-set_up_validmind.ipynb | 10 +++++----- .../validation/1-set_up_validmind_for_validation.ipynb | 6 +++--- notebooks/use_cases/agents/document_agentic_ai.ipynb | 10 +++++----- .../quickstart_option_pricing_models.ipynb | 10 +++++----- .../quickstart_option_pricing_models_quantlib.ipynb | 10 +++++----- .../quickstart_code_explainer_demo.ipynb | 10 +++++----- .../credit_risk/application_scorecard_executive.ipynb | 10 +++++----- .../credit_risk/application_scorecard_full_suite.ipynb | 10 +++++----- .../credit_risk/application_scorecard_with_bias.ipynb | 10 +++++----- .../credit_risk/application_scorecard_with_ml.ipynb | 10 +++++----- .../document_excel_application_scorecard.ipynb | 10 +++++----- .../use_cases/nlp_and_llm/prompt_validation_demo.ipynb | 10 +++++----- .../application_scorecard_ongoing_monitoring.ipynb | 6 +++--- .../quickstart_customer_churn_ongoing_monitoring.ipynb | 6 +++--- .../quickstart_time_series_full_suite.ipynb | 10 +++++----- .../time_series/quickstart_time_series_high_code.ipynb | 10 +++++----- .../validation/validate_application_scorecard.ipynb | 6 +++--- 38 files changed, 174 
insertions(+), 174 deletions(-) diff --git a/notebooks/code_sharing/r/r_custom_tests.Rmd b/notebooks/code_sharing/r/r_custom_tests.Rmd index 91a74a2ae..c8fca9896 100644 --- a/notebooks/code_sharing/r/r_custom_tests.Rmd +++ b/notebooks/code_sharing/r/r_custom_tests.Rmd @@ -45,23 +45,23 @@ Signing up is FREE — <a href="https://docs.validmind.ai/guide/access/register- **model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a "complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates." Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory. -**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application. +**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application. 
**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types. -**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes. +**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes. -**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates. 
+**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates. **test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html)) **metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts. -**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform. +**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform. **inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following: - - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind. + - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind. 
- **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset). - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests. - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html)) diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb index ff579a44d..ca35af7eb 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb @@ -84,23 +84,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb index a8ffd38c4..e9d3f7494 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb @@ -104,23 +104,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb b/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb index 190f8a50a..3dfee8bbb 100644 --- a/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb +++ b/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb @@ -91,7 +91,7 @@ "\n", "### Key concepts\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", " - **dataset-based test**\n", "\n", @@ -107,11 +107,11 @@ "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/metrics/log_metrics_over_time.ipynb b/notebooks/how_to/metrics/log_metrics_over_time.ipynb index 92d90bdb4..0e78992a5 100644 --- a/notebooks/how_to/metrics/log_metrics_over_time.ipynb +++ b/notebooks/how_to/metrics/log_metrics_over_time.ipynb @@ -101,23 +101,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb index 85e581b1e..be0389bb2 100644 --- a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb +++ b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb @@ -119,23 +119,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb index 473549bad..603186020 100644 --- a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb +++ b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb @@ -96,23 +96,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb index 712d52f2c..2cc6bee49 100644 --- a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb @@ -76,23 +76,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb index 70d481f9e..ed96d41d2 100644 --- a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb @@ -77,23 +77,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb index 0f4b30e5a..4cb4b27af 100644 --- a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb @@ -109,23 +109,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb index da1c0683c..e50f6cd44 100644 --- a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb @@ -121,23 +121,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb index 8b4be0edf..e9be161fe 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb @@ -112,23 +112,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb index 74c23f684..074ef0a66 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb @@ -94,23 +94,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb index cebe8ce51..3bd92374f 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb @@ -101,23 +101,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb index 21782fbab..ae542e2ff 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb @@ -92,23 +92,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb index 38c9c842d..d1d206f3a 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb @@ -96,23 +96,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/quickstart/quickstart_documentation.ipynb b/notebooks/quickstart/quickstart_documentation.ipynb index 7f1e18448..a658650c9 100644 --- a/notebooks/quickstart/quickstart_documentation.ipynb +++ b/notebooks/quickstart/quickstart_documentation.ipynb @@ -143,23 +143,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/quickstart/quickstart_validation.ipynb b/notebooks/quickstart/quickstart_validation.ipynb index d0e4690e5..26fdb1b66 100644 --- a/notebooks/quickstart/quickstart_validation.ipynb +++ b/notebooks/quickstart/quickstart_validation.ipynb @@ -157,17 +157,17 @@ "\n", "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. 
Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb index de8d55e43..6a48fbec9 100644 --- a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb @@ -48,23 +48,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb index ac901be6e..0929aea4e 100644 --- a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb @@ -54,17 +54,17 @@ "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. 
By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb index ff8540af5..a8c50fe1c 100644 --- a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb @@ -56,17 +56,17 @@ "\n", "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. 
Custom artifact types can be created to track other categories relevant to your organization.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/tutorials/development/1-set_up_validmind.ipynb b/notebooks/tutorials/development/1-set_up_validmind.ipynb index 1a49bf8de..da1f59c24 100644 --- a/notebooks/tutorials/development/1-set_up_validmind.ipynb +++ b/notebooks/tutorials/development/1-set_up_validmind.ipynb @@ -127,23 +127,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb index dc0239e87..e8797be51 100644 --- a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb +++ b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb @@ -137,17 +137,17 @@ "\n", "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/agents/document_agentic_ai.ipynb b/notebooks/use_cases/agents/document_agentic_ai.ipynb index 2b96788f7..ee1a247c8 100644 --- a/notebooks/use_cases/agents/document_agentic_ai.ipynb +++ b/notebooks/use_cases/agents/document_agentic_ai.ipynb @@ -150,23 +150,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb index f89ae9488..a8333a2c7 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb @@ -107,23 +107,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb index 9faf578af..4ce1ace78 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb @@ -146,23 +146,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb index 6fd8a2a23..3dd777057 100644 --- a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb +++ b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb @@ -110,23 +110,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb index 6303ad2de..e72882570 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb @@ -89,23 +89,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb index 846c83778..0a0b266c5 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb @@ -103,23 +103,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb index c5abce597..ca089e069 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb @@ -104,23 +104,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb index be8c56510..4dc5b89a4 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb @@ -116,23 +116,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb index d5352cba9..9c32c138c 100644 --- a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb +++ b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb @@ -100,23 +100,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb index adde1f2cb..91aab5e6b 100644 --- a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb +++ b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb @@ -85,23 +85,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb index fa68c0bbb..f0a37cd21 100644 --- a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb @@ -102,17 +102,17 @@ "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb index 0a96ce80c..f37c3f7d4 100644 --- a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb @@ -100,17 +100,17 @@ "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb index 25e6c6ce2..503d4a922 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb @@ -95,23 +95,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb index e0ee9e1b5..a22357ab9 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb @@ -96,23 +96,23 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**documentation, model documentation**: A structured and detailed record pertaining to a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", - "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", diff --git a/notebooks/use_cases/validation/validate_application_scorecard.ipynb b/notebooks/use_cases/validation/validate_application_scorecard.ipynb index 9c65d1161..6f821afee 100644 --- a/notebooks/use_cases/validation/validate_application_scorecard.ipynb +++ b/notebooks/use_cases/validation/validate_application_scorecard.ipynb @@ -149,17 +149,17 @@ "\n", "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization.\n", "\n", - "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record (such as a model). Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", "\n", "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", "\n", "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", "\n", - "**custom test**: Functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", "\n", "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", - " - **model**: A single record (such as a model) that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", From 7db43b69455bfc0b4493143cd80323ad8f63a46b Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Tue, 26 May 2026 11:55:25 -0700 Subject: [PATCH 09/13] Edit #3 --- .../templates/about-validmind/_about-validmind-monitoring.ipynb | 2 +- .../application_scorecard_ongoing_monitoring.ipynb | 2 +- .../quickstart_customer_churn_ongoing_monitoring.ipynb | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb index 0929aea4e..1ec766bee 100644 --- a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb @@ -48,7 +48,7 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their 
design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record (such as a model)'s performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", + "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb index f0a37cd21..124244318 100644 --- a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb @@ -96,7 +96,7 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record (such as a model)'s performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", + "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb index f37c3f7d4..dc5995422 100644 --- a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb @@ -94,7 +94,7 @@ "\n", "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", "\n", - "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record (such as a model)'s performance and compliance over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", + "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", From 6d16b57a2dd88063532e7b889d4cb8d07f7912d7 Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Tue, 26 May 2026 11:56:26 -0700 Subject: [PATCH 10/13] ... 
--- .../tests/explore_tests/explore_tests.ipynb | 158 ++++++++++-------- 1 file changed, 88 insertions(+), 70 deletions(-) diff --git a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb index ed96d41d2..592abd8b2 100644 --- a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb @@ -120,12 +120,12 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "%pip install -q validmind" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -145,7 +145,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "from validmind.tests import (\n", " list_tests,\n", @@ -153,9 +155,7 @@ " list_tags,\n", " list_tasks_and_tags,\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -172,14 +172,10 @@ }, { "cell_type": "code", - "metadata": {}, - "source": [ - "list_tests()" - ], "execution_count": null, + "metadata": {}, "outputs": [ { - "output_type": "execute_result", "data": { "text/html": [ "<style type=\"text/css\">\n", @@ -2356,8 +2352,14 @@ "text/plain": [ "<pandas.io.formats.style.Styler at 0x38000a670>" ] - } + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" } + ], + "source": [ + "list_tests()" ] }, { @@ -2370,7 +2372,7 @@ "\n", "Use [list_tasks()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks) to view all unique task types used to classify tests in the ValidMind Library.\n", "\n", - "Understanding `task` types helps you filter tests that match your record's (such as a model) objective. For example:\n", + "Understanding `task` types helps you filter tests that match your record's objective. 
For example:\n", "\n", "- **classification:** Works with Classification Models and Datasets.\n", "- **regression:** Works with Regression Models and Datasets.\n", @@ -2380,14 +2382,10 @@ }, { "cell_type": "code", - "metadata": {}, - "source": [ - "list_tasks()" - ], "execution_count": 3, + "metadata": {}, "outputs": [ { - "output_type": "execute_result", "data": { "text/plain": [ "['text_qa',\n", @@ -2405,8 +2403,14 @@ " 'monitoring',\n", " 'text_generation']" ] - } + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" } + ], + "source": [ + "list_tasks()" ] }, { @@ -2426,14 +2430,10 @@ }, { "cell_type": "code", - "metadata": {}, - "source": [ - "list_tags()" - ], "execution_count": 4, + "metadata": {}, "outputs": [ { - "output_type": "execute_result", "data": { "text/plain": [ "['senstivity_analysis',\n", @@ -2497,8 +2497,14 @@ " 'categorical_data',\n", " 'data_analysis']" ] - } + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" } + ], + "source": [ + "list_tags()" ] }, { @@ -2510,14 +2516,10 @@ }, { "cell_type": "code", - "metadata": {}, - "source": [ - "list_tasks_and_tags()" - ], "execution_count": null, + "metadata": {}, "outputs": [ { - "output_type": "execute_result", "data": { "text/html": [ "<style type=\"text/css\">\n", @@ -2598,8 +2600,14 @@ "text/plain": [ "<pandas.io.formats.style.Styler at 0x38000adc0>" ] - } + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" } + ], + "source": [ + "list_tasks_and_tags()" ] }, { @@ -2617,14 +2625,10 @@ }, { "cell_type": "code", - "metadata": {}, - "source": [ - "list_tests(filter=\"sklearn\")" - ], "execution_count": 6, + "metadata": {}, "outputs": [ { - "output_type": "execute_result", "data": { "text/html": [ "<style type=\"text/css\">\n", @@ -3129,8 +3133,14 @@ "text/plain": [ "<pandas.io.formats.style.Styler at 0x1052e6790>" ] - } + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" } + ], 
+ "source": [ + "list_tests(filter=\"sklearn\")" ] }, { @@ -3142,14 +3152,10 @@ }, { "cell_type": "code", - "metadata": {}, - "source": [ - "list_tests(task=\"classification\")" - ], "execution_count": 7, + "metadata": {}, "outputs": [ { - "output_type": "execute_result", "data": { "text/html": [ "<style type=\"text/css\">\n", @@ -4017,8 +4023,14 @@ "text/plain": [ "<pandas.io.formats.style.Styler at 0x10516c880>" ] - } + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" } + ], + "source": [ + "list_tests(task=\"classification\")" ] }, { @@ -4030,14 +4042,10 @@ }, { "cell_type": "code", - "metadata": {}, - "source": [ - "list_tests(tags=[\"model_performance\", \"visualization\"])" - ], "execution_count": null, + "metadata": {}, "outputs": [ { - "output_type": "execute_result", "data": { "text/html": [ "<style type=\"text/css\">\n", @@ -4146,8 +4154,14 @@ "text/plain": [ "<pandas.io.formats.style.Styler at 0x36a280f40>" ] - } + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" } + ], + "source": [ + "list_tests(tags=[\"model_performance\", \"visualization\"])" ] }, { @@ -4161,16 +4175,10 @@ }, { "cell_type": "code", - "metadata": {}, - "source": [ - "list_tests(filter=\"sklearn\",\n", - " tags=[\"model_performance\", \"visualization\"], task=\"classification\"\n", - ")" - ], "execution_count": null, + "metadata": {}, "outputs": [ { - "output_type": "execute_result", "data": { "text/html": [ "<style type=\"text/css\">\n", @@ -4268,8 +4276,16 @@ "text/plain": [ "<pandas.io.formats.style.Styler at 0x380009c40>" ] - } + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" } + ], + "source": [ + "list_tests(filter=\"sklearn\",\n", + " tags=[\"model_performance\", \"visualization\"], task=\"classification\"\n", + ")" ] }, { @@ -4287,15 +4303,10 @@ }, { "cell_type": "code", - "metadata": {}, - "source": [ - "text_summarization_tests = list_tests(task=\"text_summarization\", 
pretty=False)\n", - "text_summarization_tests" - ], "execution_count": null, + "metadata": {}, "outputs": [ { - "output_type": "execute_result", "data": { "text/plain": [ "['validmind.data_validation.DatasetDescription',\n", @@ -4339,8 +4350,15 @@ " 'validmind.prompt_validation.Robustness',\n", " 'validmind.prompt_validation.Specificity']" ] - } + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" } + ], + "source": [ + "text_summarization_tests = list_tests(task=\"text_summarization\", pretty=False)\n", + "text_summarization_tests" ] }, { @@ -4385,12 +4403,12 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "%pip show validmind" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -4412,6 +4430,7 @@ }, { "cell_type": "markdown", + "id": "copyright-fb6994d364c54669b356f7a2278d6480", "metadata": {}, "source": [ "<!-- VALIDMIND COPYRIGHT -->\n", @@ -4423,8 +4442,7 @@ "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ], - "id": "copyright-fb6994d364c54669b356f7a2278d6480" + ] } ], "metadata": { @@ -4448,4 +4466,4 @@ }, "nbformat": 4, "nbformat_minor": 4 -} \ No newline at end of file +} From 562f3abcf6e3f5b7d6bfc425bd407937e2360a6a Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Tue, 26 May 2026 12:16:42 -0700 Subject: [PATCH 11/13] hdfkg --- notebooks/code_sharing/r/r_custom_tests.Rmd | 4 +- .../configure_dataset_features.ipynb | 4 +- .../load_datasets_predictions.ipynb | 4 +- .../use_dataset_model_objects.ipynb | 2 +- .../metrics/log_metrics_over_time.ipynb | 4 +- .../qualitative_text_generation.ipynb | 4 +- .../custom_tests/implement_custom_tests.ipynb | 4 +- .../explore_tests/explore_test_suites.ipynb | 4 +- .../tests/explore_tests/explore_tests.ipynb | 4 +- .../run_tests/1-run_dataset-based_tests.ipynb | 4 +- .../run_tests/2-run_comparison_tests.ipynb | 4 +- .../enable_pii_detection.ipynb | 4 +- ...tests_that_require_multiple_datasets.ipynb | 4 +- ...t_multiple_results_for_the_same_test.ipynb | 4 +- .../run_documentation_sections.ipynb | 4 +- .../run_documentation_tests_with_config.ipynb | 4 +- .../quickstart/quickstart_documentation.ipynb | 292 +++++++++--------- .../quickstart/quickstart_validation.ipynb | 4 +- .../_about-validmind-developers.ipynb | 4 +- .../_about-validmind-monitoring.ipynb | 4 +- .../_about-validmind-validators.ipynb | 4 +- .../development/1-set_up_validmind.ipynb | 4 +- .../1-set_up_validmind_for_validation.ipynb | 4 +- .../agents/document_agentic_ai.ipynb | 4 +- .../quickstart_option_pricing_models.ipynb | 4 +- ...start_option_pricing_models_quantlib.ipynb | 4 +- .../quickstart_code_explainer_demo.ipynb | 4 +- .../application_scorecard_executive.ipynb | 4 +- .../application_scorecard_full_suite.ipynb | 4 
+- .../application_scorecard_with_bias.ipynb | 4 +- .../application_scorecard_with_ml.ipynb | 4 +- ...document_excel_application_scorecard.ipynb | 4 +- .../nlp_and_llm/prompt_validation_demo.ipynb | 4 +- ...ication_scorecard_ongoing_monitoring.ipynb | 4 +- ...rt_customer_churn_ongoing_monitoring.ipynb | 4 +- .../quickstart_time_series_full_suite.ipynb | 4 +- .../quickstart_time_series_high_code.ipynb | 4 +- .../validate_application_scorecard.ipynb | 4 +- 38 files changed, 219 insertions(+), 219 deletions(-) diff --git a/notebooks/code_sharing/r/r_custom_tests.Rmd b/notebooks/code_sharing/r/r_custom_tests.Rmd index c8fca9896..90d2ec4c6 100644 --- a/notebooks/code_sharing/r/r_custom_tests.Rmd +++ b/notebooks/code_sharing/r/r_custom_tests.Rmd @@ -47,7 +47,7 @@ Signing up is FREE — <a href="https://docs.validmind.ai/guide/access/register- **documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application. -**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types. +**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types. **documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes. @@ -62,7 +62,7 @@ Signing up is FREE — <a href="https://docs.validmind.ai/guide/access/register- **inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following: - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind. - - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset). + - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset). - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests. - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html)) diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb index ca35af7eb..7751e9ef6 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb @@ -86,7 +86,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -101,7 +101,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb index e9d3f7494..d1dcfa6b3 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb @@ -106,7 +106,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -121,7 +121,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb b/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb index 3dfee8bbb..7102ad6de 100644 --- a/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb +++ b/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb @@ -112,7 +112,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/metrics/log_metrics_over_time.ipynb b/notebooks/how_to/metrics/log_metrics_over_time.ipynb index 0e78992a5..660859b70 100644 --- a/notebooks/how_to/metrics/log_metrics_over_time.ipynb +++ b/notebooks/how_to/metrics/log_metrics_over_time.ipynb @@ -103,7 +103,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -118,7 +118,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb index be0389bb2..581a1e58d 100644 --- a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb +++ b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb @@ -121,7 +121,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -136,7 +136,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb index 603186020..8cd46fba8 100644 --- a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb +++ b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb @@ -98,7 +98,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -113,7 +113,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb index 2cc6bee49..b7b4a7fb9 100644 --- a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb @@ -78,7 +78,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -93,7 +93,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb index 592abd8b2..374734d04 100644 --- a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb @@ -79,7 +79,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -94,7 +94,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb index 4cb4b27af..620b4a80f 100644 --- a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb @@ -111,7 +111,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -126,7 +126,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb index e50f6cd44..de5fc3434 100644 --- a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb @@ -123,7 +123,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -138,7 +138,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb index e9be161fe..313152c51 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb @@ -114,7 +114,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -129,7 +129,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb index 074ef0a66..0031dfbbb 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb @@ -96,7 +96,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -111,7 +111,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb index 3bd92374f..f27a82691 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb @@ -103,7 +103,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -118,7 +118,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb index ae542e2ff..995d61a9f 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb @@ -94,7 +94,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -109,7 +109,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb index d1d206f3a..8a20a36e8 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb @@ -98,7 +98,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -113,7 +113,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/quickstart/quickstart_documentation.ipynb b/notebooks/quickstart/quickstart_documentation.ipynb index a658650c9..81750a13b 100644 --- a/notebooks/quickstart/quickstart_documentation.ipynb +++ b/notebooks/quickstart/quickstart_documentation.ipynb @@ -2,6 +2,7 @@ "cells": [ { "cell_type": "markdown", + "id": "7b021b0d", "metadata": {}, "source": [ "# Quickstart for documentation\n", @@ -14,11 +15,11 @@ "2. Split the datasets and initialize them for use with ValidMind\n", "3. Initialize a ValidMind model object for use with testing\n", "4. Run a full suite of tests as defined by our documentation template, which will send the results of those tests to the ValidMind Platform" - ], - "id": "7b021b0d" + ] }, { "cell_type": "markdown", + "id": "167aef58", "metadata": {}, "source": [ "::: {.content-hidden when-format=\"html\"}\n", @@ -67,11 +68,11 @@ "\tmaxLevel=4\n", "\t/jn-toc-notebook-config -->\n", "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ], - "id": "167aef58" + ] }, { "cell_type": "markdown", + "id": "1cce526f", "metadata": {}, "source": [ "<a id='toc1__'></a>\n", @@ -84,11 +85,11 @@ "\n", "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." 
- ], - "id": "1cce526f" + ] }, { "cell_type": "markdown", + "id": "f9b5eac2", "metadata": {}, "source": [ "<a id='toc2__'></a>\n", @@ -98,11 +99,11 @@ "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", "\n", "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ], - "id": "f9b5eac2" + ] }, { "cell_type": "markdown", + "id": "650236de", "metadata": {}, "source": [ "<a id='toc2_1__'></a>\n", @@ -112,11 +113,11 @@ "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", "\n", "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
- ], - "id": "650236de" + ] }, { "cell_type": "markdown", + "id": "b9d9d4cf", "metadata": {}, "source": [ "<a id='toc2_2__'></a>\n", @@ -128,11 +129,11 @@ "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", "<br></br>\n", "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ], - "id": "b9d9d4cf" + ] }, { "cell_type": "markdown", + "id": "59b308f7", "metadata": {}, "source": [ "<a id='toc2_3__'></a>\n", @@ -145,7 +146,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -160,28 +161,28 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", "\n", "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ], - "id": "59b308f7" + ] }, { "cell_type": "markdown", + "id": "61b5cbeb", "metadata": {}, "source": [ "<a id='toc3__'></a>\n", "\n", "## Setting up" - ], - "id": "61b5cbeb" + ] }, { "cell_type": "markdown", + "id": "0f08166e", "metadata": {}, "source": [ "<a id='toc3_1__'></a>\n", @@ -193,31 +194,31 @@ "Python 3.8 <= x <= 3.14</div>\n", "\n", "To install the library:" - ], - "id": "0f08166e" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "d1f6dbed", "metadata": {}, + "outputs": [], "source": [ "%pip install -q validmind" - ], - "execution_count": null, - "outputs": [], - "id": "d1f6dbed" + ] }, { "cell_type": "markdown", + "id": "1bf4e4cb", "metadata": {}, "source": [ "<a id='toc3_2__'></a>\n", "\n", "### Initialize the ValidMind Library" - ], - "id": "1bf4e4cb" + ] }, { "cell_type": "markdown", + "id": "cb6e369b", "metadata": {}, "source": [ "<a id='toc3_2_1__'></a>\n", @@ -237,11 +238,11 @@ "5. Select your own name under the **RECORD OWNER** drop-down.\n", "\n", "6. Click **Register Model** to add the model to your inventory." - ], - "id": "cb6e369b" + ] }, { "cell_type": "markdown", + "id": "7167d002", "metadata": {}, "source": [ "<a id='toc3_2_2__'></a>\n", @@ -257,11 +258,11 @@ "2. Under **TEMPLATE**, select `Binary classification`.\n", "\n", "3. Click **Use Template** to apply the template." 
- ], - "id": "7167d002" + ] }, { "cell_type": "markdown", + "id": "43037f46", "metadata": {}, "source": [ "<a id='toc3_2_3__'></a>\n", @@ -275,12 +276,14 @@ "2. Click **Copy snippet to clipboard**.\n", "\n", "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ], - "id": "43037f46" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "e2c1dd22", "metadata": {}, + "outputs": [], "source": [ "# Load your model identifier credentials from an `.env` file\n", "\n", @@ -298,13 +301,11 @@ " # model=\"...\",\n", " document=\"documentation\",\n", ")" - ], - "execution_count": null, - "outputs": [], - "id": "e2c1dd22" + ] }, { "cell_type": "markdown", + "id": "1a6933d3", "metadata": {}, "source": [ "<a id='toc3_3__'></a>\n", @@ -315,33 +316,33 @@ "\n", "- Import **Extreme Gradient Boosting** (XGBoost) with an alias so that we can reference its functions in later calls. XGBoost is a powerful machine learning library designed for speed and performance, especially in handling structured or tabular data.\n", "- Enable **`matplotlib`**, a plotting library used for visualizing data. Ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window." 
- ], - "id": "1a6933d3" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "62d7c2c1", "metadata": {}, + "outputs": [], "source": [ "import xgboost as xgb\n", "\n", "%matplotlib inline" - ], - "execution_count": null, - "outputs": [], - "id": "62d7c2c1" + ] }, { "cell_type": "markdown", + "id": "fafe8fc2", "metadata": {}, "source": [ "<a id='toc4__'></a>\n", "\n", "## Getting to know ValidMind" - ], - "id": "fafe8fc2" + ] }, { "cell_type": "markdown", + "id": "d7ee565f", "metadata": {}, "source": [ "<a id='toc4_1__'></a>\n", @@ -351,21 +352,21 @@ "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", "\n", "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ], - "id": "d7ee565f" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "b2bce375", "metadata": {}, + "outputs": [], "source": [ "vm.preview_template()" - ], - "execution_count": null, - "outputs": [], - "id": "b2bce375" + ] }, { "cell_type": "markdown", + "id": "fa0e43cb", "metadata": {}, "source": [ "<a id='toc4_2__'></a>\n", @@ -379,31 +380,31 @@ "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", "\n", "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." 
- ], - "id": "fa0e43cb" + ] }, { "cell_type": "markdown", + "id": "9d0d1005", "metadata": {}, "source": [ "<a id='toc5__'></a>\n", "\n", "## Working with ValidMind datasets" - ], - "id": "9d0d1005" + ] }, { "cell_type": "markdown", + "id": "1b94e39f", "metadata": {}, "source": [ "<a id='toc5_1__'></a>\n", "\n", "### Prepare the sample dataset" - ], - "id": "1b94e39f" + ] }, { "cell_type": "markdown", + "id": "6fc79fc1", "metadata": {}, "source": [ "<a id='toc5_1_1__'></a>\n", @@ -416,12 +417,14 @@ "\n", "- The target column, `Exited` has a value of `1` when a customer has churned and `0` otherwise.\n", "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas Dataframe is a two-dimensional tabular data structure that makes use of rows and columns." - ], - "id": "6fc79fc1" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "58d1c94b", "metadata": {}, + "outputs": [], "source": [ "from validmind.datasets.classification import customer_churn\n", "\n", @@ -431,13 +434,11 @@ "\n", "raw_df = customer_churn.load_data()\n", "raw_df.head()" - ], - "execution_count": null, - "outputs": [], - "id": "58d1c94b" + ] }, { "cell_type": "markdown", + "id": "4fe0f216", "metadata": {}, "source": [ "<a id='toc5_1_2__'></a>\n", @@ -445,11 +446,11 @@ "#### Preprocess the raw dataset\n", "\n", "Before running tests with ValidMind, we'll need to preprocess our imported dataset. This involves splitting the data and separating the features (inputs) from the targets (outputs)." - ], - "id": "4fe0f216" + ] }, { "cell_type": "markdown", + "id": "9f690a04", "metadata": {}, "source": [ "<a id='toc5_1_3__'></a>\n", @@ -463,21 +464,21 @@ "1. **train_df** — Used to train the model.\n", "2. **validation_df** — Used to evaluate the model's performance during training.\n", "3. **test_df** — Used later on to asses the model's performance on new, unseen data." 
- ], - "id": "9f690a04" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "418cb5aa", "metadata": {}, + "outputs": [], "source": [ "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" - ], - "execution_count": null, - "outputs": [], - "id": "418cb5aa" + ] }, { "cell_type": "markdown", + "id": "a9ad2104", "metadata": {}, "source": [ "<a id='toc5_1_4__'></a>\n", @@ -490,24 +491,24 @@ "2. **Outputs (Expected answers/labels)** — in our case, we would like to know whether the customer churned or not.\n", "\n", "Here, we'll use `x_train` and `x_val` to hold the input data (features), and `y_train` and `y_val` to hold the answers (the target we want to predict):" - ], - "id": "a9ad2104" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "6fd365fd", "metadata": {}, + "outputs": [], "source": [ "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", "y_train = train_df[customer_churn.target_column]\n", "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", "y_val = validation_df[customer_churn.target_column]" - ], - "execution_count": null, - "outputs": [], - "id": "6fd365fd" + ] }, { "cell_type": "markdown", + "id": "73d767d7", "metadata": {}, "source": [ "<a id='toc5_2__'></a>\n", @@ -522,12 +523,14 @@ "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", "- **`class_labels`** — An optional value to map predicted classes to class labels." 
- ], - "id": "73d767d7" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "bb6ad06a", "metadata": {}, + "outputs": [], "source": [ "# Initialize the raw dataset\n", "vm_raw_dataset = vm.init_dataset(\n", @@ -550,23 +553,21 @@ " input_id=\"test_dataset\",\n", " target_column=customer_churn.target_column\n", ")" - ], - "execution_count": null, - "outputs": [], - "id": "bb6ad06a" + ] }, { "cell_type": "markdown", + "id": "0b33afca", "metadata": {}, "source": [ "<a id='toc6__'></a>\n", "\n", "## Working with ValidMind models" - ], - "id": "0b33afca" + ] }, { "cell_type": "markdown", + "id": "5962362c", "metadata": {}, "source": [ "<a id='toc6_1__'></a>\n", @@ -576,21 +577,21 @@ "Next, let's create an XGBoost classifier model that will automatically stop training if it doesn’t improve after 10 tries.\n", "\n", "Setting a threshold avoids wasting time and helps prevent overfitting by stopping training when further improvement isn’t happening." - ], - "id": "5962362c" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "3296cac6", "metadata": {}, + "outputs": [], "source": [ "model = xgb.XGBClassifier(early_stopping_rounds=10)" - ], - "execution_count": null, - "outputs": [], - "id": "3296cac6" + ] }, { "cell_type": "markdown", + "id": "33cafbcf", "metadata": {}, "source": [ "<a id='toc6_1_1__'></a>\n", @@ -604,23 +605,23 @@ "3. **auc** — Evaluates how well the model distinguishes between churn and not churn.\n", "\n", "Using multiple metrics gives a more complete picture of how good (or bad) the model is." 
- ], - "id": "33cafbcf" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "32d3c3f4", "metadata": {}, + "outputs": [], "source": [ "model.set_params(\n", " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", ")" - ], - "execution_count": null, - "outputs": [], - "id": "32d3c3f4" + ] }, { "cell_type": "markdown", + "id": "47d84a80", "metadata": {}, "source": [ "<a id='toc6_1_2__'></a>\n", @@ -631,12 +632,14 @@ "\n", "- The model is trained on `x_train` and `y_train`, and evaluates its performance using `x_val` and `y_val` to check if it’s learning well.\n", "- To turn off printed output while training, we'll set `verbose` to `False`." - ], - "id": "47d84a80" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "3fb95ce4", "metadata": {}, + "outputs": [], "source": [ "model.fit(\n", " x_train,\n", @@ -644,13 +647,11 @@ " eval_set=[(x_val, y_val)],\n", " verbose=False,\n", ")" - ], - "execution_count": null, - "outputs": [], - "id": "3fb95ce4" + ] }, { "cell_type": "markdown", + "id": "23bccb27", "metadata": {}, "source": [ "<a id='toc6_2__'></a>\n", @@ -663,24 +664,24 @@ "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", "\n", "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ], - "id": "23bccb27" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "0e44eebd", "metadata": {}, + "outputs": [], "source": [ "vm_model = vm.init_model(\n", " model,\n", " input_id=\"model\",\n", ")" - ], - "execution_count": null, - "outputs": [], - "id": "0e44eebd" + ] }, { "cell_type": "markdown", + "id": "20c008bf", "metadata": {}, "source": [ "<a id='toc6_3__'></a>\n", @@ -693,12 +694,14 @@ "- This method links the 
model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", "\n", "If no prediction values are passed, the method will compute predictions automatically:" - ], - "id": "20c008bf" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "62bd94fc", "metadata": {}, + "outputs": [], "source": [ "vm_train_ds.assign_predictions(\n", " model=vm_model,\n", @@ -707,13 +710,11 @@ "vm_test_ds.assign_predictions(\n", " model=vm_model,\n", ")" - ], - "execution_count": null, - "outputs": [], - "id": "62bd94fc" + ] }, { "cell_type": "markdown", + "id": "0e66a7cd", "metadata": {}, "source": [ "<a id='toc7__'></a>\n", @@ -746,44 +747,44 @@ " ```\n", "\n", " Each `<test-id>` above corresponds to the test driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The method `get_demo_test_config()` below constructs the default input configuration for our demo." 
- ], - "id": "0e66a7cd" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "b3d6741b", "metadata": {}, + "outputs": [], "source": [ "from validmind.utils import preview_test_config\n", "\n", "test_config = customer_churn.get_demo_test_config()\n", "preview_test_config(test_config)" - ], - "execution_count": null, - "outputs": [], - "id": "b3d6741b" + ] }, { "cell_type": "markdown", + "id": "7eebd40f", "metadata": {}, "source": [ "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests.\n", "\n", "The variable `full_suite` then holds the result of these tests:" - ], - "id": "7eebd40f" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "ae3accf7", "metadata": {}, + "outputs": [], "source": [ "full_suite = vm.run_documentation_tests(config=test_config)" - ], - "execution_count": null, - "outputs": [], - "id": "ae3accf7" + ] }, { "cell_type": "markdown", + "id": "ed61fa23", "metadata": {}, "source": [ "<a id='toc8__'></a>\n", @@ -799,11 +800,11 @@ "- [x] Initialize ValidMind datasets and model objects\n", "- [x] Assign model predictions to your ValidMind model objects\n", "- [x] Run a full suite of documentation tests" - ], - "id": "ed61fa23" + ] }, { "cell_type": "markdown", + "id": "68803cd9", "metadata": {}, "source": [ "<a id='toc9__'></a>\n", @@ -811,11 +812,11 @@ "## Next steps\n", "\n", "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." - ], - "id": "68803cd9" + ] }, { "cell_type": "markdown", + "id": "ba38b729", "metadata": {}, "source": [ "<a id='toc9_1__'></a>\n", @@ -827,11 +828,11 @@ "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", "\n", "What you see is the full draft of your documentation in a more easily consumable version. 
From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" - ], - "id": "ba38b729" + ] }, { "cell_type": "markdown", + "id": "ae046dc4", "metadata": {}, "source": [ "<a id='toc9_2__'></a>\n", @@ -850,11 +851,11 @@ "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", "\n", "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ], - "id": "ae046dc4" + ] }, { "cell_type": "markdown", + "id": "4ce38015", "metadata": {}, "source": [ "<a id='toc10__'></a>\n", @@ -864,21 +865,21 @@ "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", "\n", "Retrieve the information for the currently installed version of ValidMind:" - ], - "id": "4ce38015" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "35955b6b", "metadata": {}, + "outputs": [], "source": [ "%pip show validmind" - ], - "execution_count": null, - "outputs": [], - "id": "35955b6b" + ] }, { "cell_type": "markdown", + "id": "f865e64e", "metadata": {}, "source": [ "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", @@ -886,19 +887,19 @@ "```bash\n", "%pip install --upgrade validmind\n", "```" - ], - "id": "f865e64e" + ] }, { "cell_type": "markdown", + "id": "65b36aa7", "metadata": {}, "source": [ "You may need to restart 
your kernel after running the upgrade package for changes to be applied." - ], - "id": "65b36aa7" + ] }, { "cell_type": "markdown", + "id": "copyright-bd87da591b88473997979690dbffcfa5", "metadata": {}, "source": [ "<!-- VALIDMIND COPYRIGHT -->\n", @@ -910,8 +911,7 @@ "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ], - "id": "copyright-bd87da591b88473997979690dbffcfa5" + ] } ], "metadata": { diff --git a/notebooks/quickstart/quickstart_validation.ipynb b/notebooks/quickstart/quickstart_validation.ipynb index 26fdb1b66..6d70032d1 100644 --- a/notebooks/quickstart/quickstart_validation.ipynb +++ b/notebooks/quickstart/quickstart_validation.ipynb @@ -151,7 +151,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n", @@ -168,7 +168,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb index 6a48fbec9..0c3246490 100644 --- a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb @@ -50,7 +50,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -65,7 +65,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb index 1ec766bee..c48fe45ea 100644 --- a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb @@ -50,7 +50,7 @@ "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", @@ -65,7 +65,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb index a8c50fe1c..c68efd338 100644 --- a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb @@ -50,7 +50,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. 
A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n", @@ -67,7 +67,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/tutorials/development/1-set_up_validmind.ipynb b/notebooks/tutorials/development/1-set_up_validmind.ipynb index da1f59c24..64662bf8c 100644 --- a/notebooks/tutorials/development/1-set_up_validmind.ipynb +++ b/notebooks/tutorials/development/1-set_up_validmind.ipynb @@ -129,7 +129,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -144,7 +144,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb index e8797be51..8ec773ec3 100644 --- a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb +++ b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb @@ -131,7 +131,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. 
A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n", @@ -148,7 +148,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/agents/document_agentic_ai.ipynb b/notebooks/use_cases/agents/document_agentic_ai.ipynb index ee1a247c8..299a4a586 100644 --- a/notebooks/use_cases/agents/document_agentic_ai.ipynb +++ b/notebooks/use_cases/agents/document_agentic_ai.ipynb @@ -152,7 +152,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -167,7 +167,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb index a8333a2c7..c298af095 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb @@ -109,7 +109,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -124,7 +124,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb index 4ce1ace78..3603ff206 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb @@ -148,7 +148,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -163,7 +163,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb index 3dd777057..b751cd622 100644 --- a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb +++ b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb @@ -112,7 +112,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -127,7 +127,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb index e72882570..2f87a9fda 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb @@ -91,7 +91,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -106,7 +106,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb index 0a0b266c5..a29bf027c 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb @@ -105,7 +105,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -120,7 +120,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb index ca089e069..ec37d78c5 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb @@ -106,7 +106,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -121,7 +121,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb index 4dc5b89a4..fe89d90a9 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb @@ -118,7 +118,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -133,7 +133,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb index 9c32c138c..25987e1f5 100644 --- a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb +++ b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb @@ -102,7 +102,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -117,7 +117,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb index 91aab5e6b..c77f2e120 100644 --- a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb +++ b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb @@ -87,7 +87,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -102,7 +102,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb index 124244318..81f7310db 100644 --- a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb @@ -98,7 +98,7 @@ "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", @@ -113,7 +113,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb index dc5995422..4fdc7375d 100644 --- a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb @@ -96,7 +96,7 @@ "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", @@ -111,7 +111,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb index 503d4a922..98a20c0eb 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb @@ -97,7 +97,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -112,7 +112,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb index a22357ab9..37d5b8c99 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb @@ -98,7 +98,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -113,7 +113,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", diff --git a/notebooks/use_cases/validation/validate_application_scorecard.ipynb b/notebooks/use_cases/validation/validate_application_scorecard.ipynb index 6f821afee..1e62759dc 100644 --- a/notebooks/use_cases/validation/validate_application_scorecard.ipynb +++ b/notebooks/use_cases/validation/validate_application_scorecard.ipynb @@ -143,7 +143,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. 
A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and function as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n", @@ -160,7 +160,7 @@ "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", "\n", " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", "\n", From 1a130d0c44a81e7af93a6e685dcd1daa1283efb6 Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Tue, 26 May 2026 12:21:57 -0700 Subject: [PATCH 12/13] esfhjks --- notebooks/code_sharing/r/r_custom_tests.Rmd | 2 +- .../dataset_inputs/configure_dataset_features.ipynb | 2 +- .../dataset_inputs/load_datasets_predictions.ipynb | 2 +- notebooks/how_to/metrics/log_metrics_over_time.ipynb | 2 +- .../how_to/qualitative_text/qualitative_text_generation.ipynb | 2 +- .../how_to/tests/custom_tests/implement_custom_tests.ipynb | 2 +- notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb | 2 +- notebooks/how_to/tests/explore_tests/explore_tests.ipynb | 2 +- .../how_to/tests/run_tests/1-run_dataset-based_tests.ipynb | 2 +- notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb | 2 +- 
.../tests/run_tests/configure_tests/enable_pii_detection.ipynb | 2 +- .../run_tests_that_require_multiple_datasets.ipynb | 2 +- .../document_multiple_results_for_the_same_test.ipynb | 2 +- .../documentation_tests/run_documentation_sections.ipynb | 2 +- .../run_documentation_tests_with_config.ipynb | 2 +- notebooks/quickstart/quickstart_documentation.ipynb | 2 +- notebooks/quickstart/quickstart_validation.ipynb | 2 +- .../templates/about-validmind/_about-validmind-developers.ipynb | 2 +- .../templates/about-validmind/_about-validmind-monitoring.ipynb | 2 +- .../templates/about-validmind/_about-validmind-validators.ipynb | 2 +- notebooks/tutorials/development/1-set_up_validmind.ipynb | 2 +- .../validation/1-set_up_validmind_for_validation.ipynb | 2 +- notebooks/use_cases/agents/document_agentic_ai.ipynb | 2 +- .../capital_markets/quickstart_option_pricing_models.ipynb | 2 +- .../quickstart_option_pricing_models_quantlib.ipynb | 2 +- .../code_explainer/quickstart_code_explainer_demo.ipynb | 2 +- .../use_cases/credit_risk/application_scorecard_executive.ipynb | 2 +- .../credit_risk/application_scorecard_full_suite.ipynb | 2 +- .../use_cases/credit_risk/application_scorecard_with_bias.ipynb | 2 +- .../use_cases/credit_risk/application_scorecard_with_ml.ipynb | 2 +- .../credit_risk/document_excel_application_scorecard.ipynb | 2 +- notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb | 2 +- .../application_scorecard_ongoing_monitoring.ipynb | 2 +- .../quickstart_customer_churn_ongoing_monitoring.ipynb | 2 +- .../time_series/quickstart_time_series_full_suite.ipynb | 2 +- .../time_series/quickstart_time_series_high_code.ipynb | 2 +- .../use_cases/validation/validate_application_scorecard.ipynb | 2 +- 37 files changed, 37 insertions(+), 37 deletions(-) diff --git a/notebooks/code_sharing/r/r_custom_tests.Rmd b/notebooks/code_sharing/r/r_custom_tests.Rmd index 90d2ec4c6..ce674695e 100644 --- a/notebooks/code_sharing/r/r_custom_tests.Rmd +++ 
b/notebooks/code_sharing/r/r_custom_tests.Rmd @@ -47,7 +47,7 @@ Signing up is FREE — <a href="https://docs.validmind.ai/guide/access/register- **documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application. -**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types. +**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types. **documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes. diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb index 7751e9ef6..ee635df97 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb @@ -86,7 +86,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb index d1dcfa6b3..4e4a6d4af 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb @@ -106,7 +106,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/metrics/log_metrics_over_time.ipynb b/notebooks/how_to/metrics/log_metrics_over_time.ipynb index 660859b70..436c43413 100644 --- a/notebooks/how_to/metrics/log_metrics_over_time.ipynb +++ b/notebooks/how_to/metrics/log_metrics_over_time.ipynb @@ -103,7 +103,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb index 581a1e58d..29735259a 100644 --- a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb +++ b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb @@ -121,7 +121,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb index 8cd46fba8..a7e816066 100644 --- a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb +++ b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb @@ -98,7 +98,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb index b7b4a7fb9..0f4020785 100644 --- a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb @@ -78,7 +78,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb index 374734d04..c5dca3241 100644 --- a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb @@ -79,7 +79,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb index 620b4a80f..0979b97b1 100644 --- a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb @@ -111,7 +111,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb index de5fc3434..314436a70 100644 --- a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb @@ -123,7 +123,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb index 313152c51..0e3d49ae2 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb @@ -114,7 +114,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb index 0031dfbbb..81ead2e09 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb @@ -96,7 +96,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb index f27a82691..0c69d6413 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb @@ -103,7 +103,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb index 995d61a9f..f2b02c802 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb @@ -94,7 +94,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb index 8a20a36e8..b0c13b2f9 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb @@ -98,7 +98,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/quickstart/quickstart_documentation.ipynb b/notebooks/quickstart/quickstart_documentation.ipynb index 81750a13b..0867cf1fd 100644 --- a/notebooks/quickstart/quickstart_documentation.ipynb +++ b/notebooks/quickstart/quickstart_documentation.ipynb @@ -146,7 +146,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/quickstart/quickstart_validation.ipynb b/notebooks/quickstart/quickstart_validation.ipynb index 6d70032d1..3a3542e16 100644 --- a/notebooks/quickstart/quickstart_validation.ipynb +++ b/notebooks/quickstart/quickstart_validation.ipynb @@ -151,7 +151,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb index 0c3246490..fe479f4bf 100644 --- a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb @@ -50,7 +50,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb index c48fe45ea..f5e930d7e 100644 --- a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb @@ -50,7 +50,7 @@ "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. 
By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb index c68efd338..3428afbdd 100644 --- a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb @@ -50,7 +50,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n", diff --git a/notebooks/tutorials/development/1-set_up_validmind.ipynb b/notebooks/tutorials/development/1-set_up_validmind.ipynb index 64662bf8c..fe00b4c40 100644 --- a/notebooks/tutorials/development/1-set_up_validmind.ipynb +++ b/notebooks/tutorials/development/1-set_up_validmind.ipynb @@ -129,7 +129,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb index 8ec773ec3..230605c5f 100644 --- a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb +++ b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb @@ -131,7 +131,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. 
A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). 
By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n", diff --git a/notebooks/use_cases/agents/document_agentic_ai.ipynb b/notebooks/use_cases/agents/document_agentic_ai.ipynb index 299a4a586..44df39f41 100644 --- a/notebooks/use_cases/agents/document_agentic_ai.ipynb +++ b/notebooks/use_cases/agents/document_agentic_ai.ipynb @@ -152,7 +152,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb index c298af095..53eebc7f4 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb @@ -109,7 +109,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb index 3603ff206..b43072555 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb @@ -148,7 +148,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb index b751cd622..9b3548521 100644 --- a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb +++ b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb @@ -112,7 +112,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb index 2f87a9fda..61a4643b7 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb @@ -91,7 +91,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb index a29bf027c..d2e25f034 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb @@ -105,7 +105,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb index ec37d78c5..5f04b0a54 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb @@ -106,7 +106,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb index fe89d90a9..4d56ab8a6 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb @@ -118,7 +118,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb index 25987e1f5..01497c866 100644 --- a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb +++ b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb @@ -102,7 +102,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb index c77f2e120..e572ad2c8 100644 --- a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb +++ b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb @@ -87,7 +87,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb index 81f7310db..599f59edd 100644 --- a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb @@ -98,7 +98,7 @@ "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb index 4fdc7375d..6a6954b9a 100644 --- a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb @@ -96,7 +96,7 @@ "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. 
By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb index 98a20c0eb..e61e3a3db 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb @@ -97,7 +97,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb index 37d5b8c99..002d43995 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb @@ -98,7 +98,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/validation/validate_application_scorecard.ipynb b/notebooks/use_cases/validation/validate_application_scorecard.ipynb index 1e62759dc..e22cc392f 100644 --- a/notebooks/use_cases/validation/validate_application_scorecard.ipynb +++ b/notebooks/use_cases/validation/validate_application_scorecard.ipynb @@ -143,7 +143,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. 
A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as test suites specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). 
By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n", From c66248b4b85e01fac1cc226373ddc02616fb4a61 Mon Sep 17 00:00:00 2001 From: Beck <164545837+validbeck@users.noreply.github.com> Date: Tue, 26 May 2026 12:24:09 -0700 Subject: [PATCH 13/13] omg --- notebooks/code_sharing/r/r_custom_tests.Rmd | 2 +- .../configure_dataset_features.ipynb | 2 +- .../load_datasets_predictions.ipynb | 2 +- .../how_to/metrics/log_metrics_over_time.ipynb | 2 +- .../qualitative_text_generation.ipynb | 2 +- .../custom_tests/implement_custom_tests.ipynb | 2 +- .../explore_tests/explore_test_suites.ipynb | 2 +- .../tests/explore_tests/explore_tests.ipynb | 2 +- .../run_tests/1-run_dataset-based_tests.ipynb | 2 +- .../run_tests/2-run_comparison_tests.ipynb | 2 +- .../configure_tests/enable_pii_detection.ipynb | 2 +- ..._tests_that_require_multiple_datasets.ipynb | 2 +- ...nt_multiple_results_for_the_same_test.ipynb | 2 +- .../run_documentation_sections.ipynb | 2 +- .../run_documentation_tests_with_config.ipynb | 2 +- .../quickstart/quickstart_documentation.ipynb | 2 +- .../quickstart/quickstart_validation.ipynb | 2 +- .../_about-validmind-developers.ipynb | 18 +++++++++--------- .../_about-validmind-monitoring.ipynb | 2 +- .../_about-validmind-validators.ipynb | 2 +- .../development/1-set_up_validmind.ipynb | 2 +- .../1-set_up_validmind_for_validation.ipynb | 2 +- .../use_cases/agents/document_agentic_ai.ipynb | 2 +- .../quickstart_option_pricing_models.ipynb | 2 +- ...kstart_option_pricing_models_quantlib.ipynb | 2 +- .../quickstart_code_explainer_demo.ipynb | 2 +- .../application_scorecard_executive.ipynb | 2 +- .../application_scorecard_full_suite.ipynb | 2 +- .../application_scorecard_with_bias.ipynb | 2 +- 
.../application_scorecard_with_ml.ipynb | 2 +- .../document_excel_application_scorecard.ipynb | 2 +- .../nlp_and_llm/prompt_validation_demo.ipynb | 2 +- ...lication_scorecard_ongoing_monitoring.ipynb | 2 +- ...art_customer_churn_ongoing_monitoring.ipynb | 2 +- .../quickstart_time_series_full_suite.ipynb | 2 +- .../quickstart_time_series_high_code.ipynb | 2 +- .../validate_application_scorecard.ipynb | 2 +- 37 files changed, 45 insertions(+), 45 deletions(-) diff --git a/notebooks/code_sharing/r/r_custom_tests.Rmd b/notebooks/code_sharing/r/r_custom_tests.Rmd index ce674695e..9c1db6ed5 100644 --- a/notebooks/code_sharing/r/r_custom_tests.Rmd +++ b/notebooks/code_sharing/r/r_custom_tests.Rmd @@ -47,7 +47,7 @@ Signing up is FREE — <a href="https://docs.validmind.ai/guide/access/register- **documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application. -**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types. +**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types. **documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes. diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb index ee635df97..1b9ab41da 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb @@ -86,7 +86,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb index 4e4a6d4af..a98ff348b 100644 --- a/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb +++ b/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb @@ -106,7 +106,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/metrics/log_metrics_over_time.ipynb b/notebooks/how_to/metrics/log_metrics_over_time.ipynb index 436c43413..7e8d1faef 100644 --- a/notebooks/how_to/metrics/log_metrics_over_time.ipynb +++ b/notebooks/how_to/metrics/log_metrics_over_time.ipynb @@ -103,7 +103,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb index 29735259a..f2c72ce7b 100644 --- a/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb +++ b/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb @@ -121,7 +121,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb index a7e816066..38ad4a308 100644 --- a/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb +++ b/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb @@ -98,7 +98,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb index 0f4020785..2191dbd98 100644 --- a/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb @@ -78,7 +78,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb index c5dca3241..048459ea7 100644 --- a/notebooks/how_to/tests/explore_tests/explore_tests.ipynb +++ b/notebooks/how_to/tests/explore_tests/explore_tests.ipynb @@ -79,7 +79,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb index 0979b97b1..ae4b200aa 100644 --- a/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb @@ -111,7 +111,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb index 314436a70..1766a413f 100644 --- a/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb +++ b/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb @@ -123,7 +123,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb index 0e3d49ae2..bc07a3cff 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb @@ -114,7 +114,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb index 81ead2e09..6a4b81ba2 100644 --- a/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb +++ b/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb @@ -96,7 +96,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb index 0c69d6413..fc7446d03 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb @@ -103,7 +103,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb index f2b02c802..42ef742f8 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb @@ -94,7 +94,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb index b0c13b2f9..48a3e439d 100644 --- a/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb +++ b/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb @@ -98,7 +98,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/quickstart/quickstart_documentation.ipynb b/notebooks/quickstart/quickstart_documentation.ipynb index 0867cf1fd..033e02345 100644 --- a/notebooks/quickstart/quickstart_documentation.ipynb +++ b/notebooks/quickstart/quickstart_documentation.ipynb @@ -146,7 +146,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/quickstart/quickstart_validation.ipynb b/notebooks/quickstart/quickstart_validation.ipynb index 3a3542e16..4d871e122 100644 --- a/notebooks/quickstart/quickstart_validation.ipynb +++ b/notebooks/quickstart/quickstart_validation.ipynb @@ -151,7 +151,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb index fe479f4bf..3259c50f6 100644 --- a/notebooks/templates/about-validmind/_about-validmind-developers.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-developers.ipynb @@ -2,6 +2,7 @@ "cells": [ { "cell_type": "markdown", + "id": "about-intro", "metadata": {}, "source": [ "## About ValidMind\n", @@ -9,11 +10,11 @@ "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", "\n", "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ], - "id": "about-intro" + ] }, { "cell_type": "markdown", + "id": "about-begin", "metadata": {}, "source": [ "### Before you begin\n", @@ -21,11 +22,11 @@ "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", "\n", "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ], - "id": "about-begin" + ] }, { "cell_type": "markdown", + "id": "about-signup", "metadata": {}, "source": [ "### New to ValidMind?\n", @@ -35,11 +36,11 @@ "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", "<br></br>\n", "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ], - "id": "about-signup" + ] }, { "cell_type": "markdown", + "id": "about-concepts", "metadata": {}, "source": [ "### Key concepts\n", @@ -50,7 +51,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", @@ -72,8 +73,7 @@ "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", "\n", "**outputs**: Custom tests can return elements like tables or plots. 
Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ], - "id": "about-concepts" + ] } ], "metadata": { diff --git a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb index f5e930d7e..5fdb4d43c 100644 --- a/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-monitoring.ipynb @@ -50,7 +50,7 @@ "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", diff --git a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb index 3428afbdd..79bf0adc8 100644 --- a/notebooks/templates/about-validmind/_about-validmind-validators.ipynb +++ b/notebooks/templates/about-validmind/_about-validmind-validators.ipynb @@ -50,7 +50,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n", diff --git a/notebooks/tutorials/development/1-set_up_validmind.ipynb b/notebooks/tutorials/development/1-set_up_validmind.ipynb index fe00b4c40..9ba543104 100644 --- a/notebooks/tutorials/development/1-set_up_validmind.ipynb +++ b/notebooks/tutorials/development/1-set_up_validmind.ipynb @@ -129,7 +129,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb index 230605c5f..feda59a35 100644 --- a/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb +++ b/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb @@ -131,7 +131,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n", diff --git a/notebooks/use_cases/agents/document_agentic_ai.ipynb b/notebooks/use_cases/agents/document_agentic_ai.ipynb index 44df39f41..621fe8b17 100644 --- a/notebooks/use_cases/agents/document_agentic_ai.ipynb +++ b/notebooks/use_cases/agents/document_agentic_ai.ipynb @@ -152,7 +152,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb index 53eebc7f4..4b4ae386c 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb @@ -109,7 +109,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb index b43072555..eccfb8fc3 100644 --- a/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb +++ b/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb @@ -148,7 +148,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb index 9b3548521..4f912501f 100644 --- a/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb +++ b/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb @@ -112,7 +112,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb index 61a4643b7..50f2f0202 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb @@ -91,7 +91,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb index d2e25f034..2b857a03b 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb @@ -105,7 +105,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb index 5f04b0a54..6f6d23928 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb @@ -106,7 +106,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb index 4d56ab8a6..a735cbf5b 100644 --- a/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb +++ b/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb @@ -118,7 +118,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb index 01497c866..fa8e86113 100644 --- a/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb +++ b/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb @@ -102,7 +102,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb index e572ad2c8..deeb8293e 100644 --- a/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb +++ b/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb @@ -87,7 +87,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb index 599f59edd..847417cf0 100644 --- a/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb @@ -98,7 +98,7 @@ "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. 
By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", diff --git a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb index 6a6954b9a..9be6aa92f 100644 --- a/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb +++ b/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb @@ -96,7 +96,7 @@ "\n", "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", "\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb index e61e3a3db..300dbfeb0 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb @@ -97,7 +97,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb index 002d43995..1dfae1e06 100644 --- a/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb +++ b/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb @@ -98,7 +98,7 @@ "\n", "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. 
By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", "\n", diff --git a/notebooks/use_cases/validation/validate_application_scorecard.ipynb b/notebooks/use_cases/validation/validate_application_scorecard.ipynb index e22cc392f..563c622a2 100644 --- a/notebooks/use_cases/validation/validate_application_scorecard.ipynb +++ b/notebooks/use_cases/validation/validate_application_scorecard.ipynb @@ -143,7 +143,7 @@ "\n", "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", "\n", - "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suitespecifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", "\n", "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", "\n",