[SC 16234] Two small fixes to built-in ValidMind tests #512

Merged
AnilSorathiya merged 1 commit into main from anilsorathiya/sc-16234/two-small-fixes-to-built-in-validmind-tests on May 19, 2026

Conversation

@AnilSorathiya

Pull Request Description

What and why?

IQROutliersBarPlot: Boolean and binary columns (≤2 unique values) are now excluded from the plots and from the raw outlier summary table. Previously, these columns could appear in the raw data or raise quantile errors on boolean dtypes. Now only numeric features with more than two unique values are analyzed.

import numpy as np
import pandas as pd
import validmind as vm

# One continuous column plus one boolean column that should be skipped
df = pd.DataFrame({
    "numeric": np.random.randn(100),
    "flag": np.random.choice([True, False], 100),
})
dataset = vm.init_dataset(input_id="ds", dataset=df, __log=False)
result = vm.tests.run_test(
    "validmind.data_validation.IQROutliersBarPlot",
    inputs={"dataset": dataset},
)
result.show()  # "flag" should not appear in the plots or the raw outlier table
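
The column filter described above can be sketched in plain pandas. This is a minimal illustration, not the ValidMind implementation: the helper functions `eligible_columns` and `iqr_outlier_counts`, the 1.5 IQR multiplier, and the sample column names are all assumptions made for this example.

```python
import numpy as np
import pandas as pd


def eligible_columns(df: pd.DataFrame) -> list:
    """Hypothetical helper: numeric, non-boolean columns with >2 unique values."""
    cols = []
    for name in df.columns:
        series = df[name]
        # is_numeric_dtype treats bool as numeric, so rule it out explicitly.
        if pd.api.types.is_bool_dtype(series):
            continue
        if not pd.api.types.is_numeric_dtype(series):
            continue
        # Binary columns (e.g. 0/1 flags) have no meaningful IQR.
        if series.nunique(dropna=True) <= 2:
            continue
        cols.append(name)
    return cols


def iqr_outlier_counts(df: pd.DataFrame, threshold: float = 1.5) -> dict:
    """Count values outside [Q1 - t*IQR, Q3 + t*IQR] for each eligible column."""
    counts = {}
    for name in eligible_columns(df):
        q1, q3 = df[name].quantile([0.25, 0.75])
        iqr = q3 - q1
        lower, upper = q1 - threshold * iqr, q3 + threshold * iqr
        counts[name] = int(((df[name] < lower) | (df[name] > upper)).sum())
    return counts


rng = np.random.default_rng(0)
df = pd.DataFrame({
    "numeric": rng.normal(size=100),
    "flag": rng.choice([True, False], size=100),   # boolean dtype
    "binary": rng.integers(0, 2, size=100),        # integer dtype, two values
})
print(eligible_columns(df))
print(iqr_outlier_counts(df))
```

Because the same list of eligible columns feeds both the counts and anything plotted from them, a boolean or binary column cannot appear in one output and not the other.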

WeakspotsDiagnosis: Custom thresholds can now cover only a subset of metrics (e.g. {"accuracy": 0.65}). Previously, missing keys could break the plots or the pass/fail logic. Now the plots fall back to the default threshold for any metric without a custom one (so reference lines still appear for every metric), and pass/fail checks only the thresholds you provide.

# train_ds, test_ds, and model are assumed to be initialized already
result = vm.tests.run_test(
    "validmind.model_validation.sklearn.WeakspotsDiagnosis",
    inputs={"datasets": [train_ds, test_ds], "model": model},
    params={"thresholds": {"accuracy": 0.65}},  # only accuracy used for pass/fail
)
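
The split between plotting and pass/fail thresholds can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `DEFAULT_THRESHOLDS` holds made-up values, `split_thresholds` and `passes` are hypothetical helpers, a score is assumed to pass when it is at or above its threshold, and the defaults are assumed to drive pass/fail when no custom thresholds are given.

```python
# Made-up default values, used only for this example.
DEFAULT_THRESHOLDS = {"accuracy": 0.75, "precision": 0.5, "recall": 0.5, "f1": 0.7}


def split_thresholds(user_thresholds=None):
    """Hypothetical helper returning (plot_thresholds, pass_thresholds)."""
    if not user_thresholds:
        # Assumption: with no custom thresholds, defaults are used for both.
        return dict(DEFAULT_THRESHOLDS), dict(DEFAULT_THRESHOLDS)
    # Plots: defaults, overridden by whatever the user supplied.
    plot_thresholds = {**DEFAULT_THRESHOLDS, **user_thresholds}
    # Pass/fail: only the metrics the user explicitly set.
    pass_thresholds = dict(user_thresholds)
    return plot_thresholds, pass_thresholds


def passes(scores, pass_thresholds):
    """Assumption: a metric passes when its score is >= its threshold."""
    return all(scores[metric] >= limit for metric, limit in pass_thresholds.items())


plot_t, pass_t = split_thresholds({"accuracy": 0.65})
scores = {"accuracy": 0.70, "precision": 0.40, "recall": 0.90, "f1": 0.55}

print(plot_t)                  # every metric has a reference line
print(pass_t)                  # only accuracy is checked
print(passes(scores, pass_t))  # low precision and f1 do not affect the result
```

Keeping two dictionaries avoids the failure mode described above: the plot code can always look up a threshold for every metric, while the pass/fail check never evaluates a metric the user did not ask about.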

How to test

pytest tests/unit_tests/data_validation/test_IQROutliersBarPlot.py -v
pytest tests/unit_tests/model_validation/sklearn/test_WeakspotsDiagnosis.py -v

Manual checks (optional):

  • Run IQROutliersBarPlot on a dataset with boolean columns; confirm they are not in the bar plots or raw outlier table.
  • Run WeakspotsDiagnosis with thresholds={"accuracy": 0.65}; confirm plots show reference lines for all metrics and pass/fail only uses accuracy.

What needs special review?

  • WeakspotsDiagnosis: split between plot_thresholds (defaults + overrides) and pass_thresholds (user-only when custom thresholds are passed).
  • IQROutliersBarPlot: eligible_columns filter applied consistently to plots and raw data.

Dependencies, breaking changes, and deployment notes

Release notes

  • IQROutliersBarPlot: Boolean and binary features are excluded from outlier analysis and summary output.
  • WeakspotsDiagnosis: You can pass partial custom thresholds; plots still show default reference lines for other metrics, and pass/fail only evaluates the thresholds you set.

Checklist

  • What and why
  • Screenshots or videos (Frontend) — N/A
  • How to test
  • What needs special review
  • Dependencies, breaking changes, and deployment notes
  • Labels applied
  • PR linked to Shortcut (SC-16234)
  • Unit tests added (Backend)
  • Tested locally
  • Documentation updated (if required)
  • Environment variable additions/changes documented (if required)

@AnilSorathiya added the bug label (Something isn't working) on May 18, 2026
@github-actions

PR Summary

This PR introduces two main functional improvements:

  1. Outlier Detection Improvement:

    • The outlier bar plot now excludes boolean (and binary) features from both the visualization and the raw data output. The changes include filtering the eligible numeric columns to only those with more than two unique values. This ensures that IQR computations, which are not meaningful on binary data, are not performed on boolean columns.
    • A new unit test (test_boolean_dtype_excluded_from_raw_data) has been added to verify that boolean features are properly excluded from the raw data used for plotting.
  2. Weakspots Diagnosis Threshold Handling:

    • A new helper function (_prepare_metrics_and_thresholds) has been introduced to normalize metric and threshold keys (e.g., converting 'f1' to 'F1') and to merge user-specified thresholds with default thresholds. This ensures consistent behavior both in plotting (using default thresholds when unspecified) and in pass/fail evaluations (using only the user-provided thresholds).
    • Additional unit tests have been added to validate that the thresholds are correctly processed and that partial threshold inputs behave as expected.
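
Key normalization of the kind described for `_prepare_metrics_and_thresholds` might look like the sketch below. The real helper's signature and behavior are not shown in this PR page, so everything here is an assumption: the `CANONICAL_METRICS` names, the hypothetical `normalize_threshold_keys` function, and the choice to raise on unknown keys. Only the 'f1' to 'F1' conversion comes from the summary above.

```python
# Assumed canonical display names; only "F1" is confirmed by the PR summary.
CANONICAL_METRICS = ("Accuracy", "Precision", "Recall", "F1")


def normalize_threshold_keys(thresholds):
    """Hypothetical helper: map user keys such as 'f1' to canonical names."""
    lookup = {name.lower(): name for name in CANONICAL_METRICS}
    normalized = {}
    for key, value in thresholds.items():
        canonical = lookup.get(key.strip().lower())
        if canonical is None:
            # Assumption: unknown metric names are rejected rather than ignored.
            raise ValueError(f"Unknown metric: {key!r}")
        normalized[canonical] = value
    return normalized


print(normalize_threshold_keys({"f1": 0.6, "accuracy": 0.65}))
```

Normalizing once, before any merging with defaults, means the plotting and pass/fail code paths can both rely on a single spelling for each metric.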

Overall, the PR improves the robustness of data validation by avoiding the application of statistical methods to inappropriate data types and enhances the diagnostic reporting by ensuring thresholds are normalized and merged appropriately.

Test Suggestions

  • Run the newly added unit tests for outlier detection and weakspots diagnosis to ensure all scenarios pass.
  • Test with a dataset that contains mixed boolean and numeric features to verify the boolean exclusion logic.
  • Verify that when partial thresholds are provided, the plotting thresholds (defaults plus user overrides) and the pass/fail thresholds (user-provided only) each behave as expected.
  • Integrate with the full diagnostic pipeline to ensure that the changes do not affect downstream reporting or visualization.


@johnwalz97 johnwalz97 left a comment

nice!


@juanmleng juanmleng left a comment

Fantastic, thanks @AnilSorathiya!

@AnilSorathiya AnilSorathiya merged commit 74c3934 into main May 19, 2026
22 checks passed
@AnilSorathiya AnilSorathiya deleted the anilsorathiya/sc-16234/two-small-fixes-to-built-in-validmind-tests branch May 19, 2026 15:51
Labels

bug Something isn't working

3 participants