Allow custom classifiers via ipw model parameter #177

neuralsorcerer · 2025-11-26T14:12:19Z

Related to [FEATURE] balance 0.13.0 - missing steps #175 and [FEATURE] Support more models from sklearn (other than logistic regression) #139

meta-codesync · 2025-11-26T14:43:50Z

@talgalili has imported this pull request. If you are a Meta employee, you can view this in D87926504.

Copilot

Pull request overview

This PR enhances the ipw() function to accept custom sklearn classifiers directly via the model parameter, replacing the previous sklearn_model parameter. Users can now pass any sklearn classifier implementing fit and predict_proba (e.g., RandomForestClassifier) or use the default logistic regression by specifying model="sklearn".

Key Changes:

- The model parameter now accepts sklearn classifiers in addition to string identifiers
- The sklearn_model parameter is deprecated in favor of model
- Added comprehensive test coverage for the new parameter behavior
- Created a new tutorial notebook demonstrating both default and custom classifier usage
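
A minimal sketch of the usage described in this overview. It assumes a balance release that includes this PR, that scikit-learn is installed, and that keyword arguments given to `Sample.adjust` are forwarded to `ipw()`. The `run_example` wrapper and the random-forest hyperparameters are illustrative choices and are not taken from the PR.

```python
def run_example():
    """Fit IPW weights twice: default logistic regression, then a random forest."""
    # Imports are local so the sketch can be loaded without balance installed.
    from balance import Sample, load_data
    from sklearn.ensemble import RandomForestClassifier

    # Simulated data shipped with balance (target first, then sample).
    target_df, sample_df = load_data()
    sample = Sample.from_frame(sample_df, outcome_columns=["happiness"])
    target = Sample.from_frame(target_df, outcome_columns=["happiness"])
    sample_with_target = sample.set_target(target)

    # Default behaviour: model="sklearn" selects logistic regression.
    adjusted_default = sample_with_target.adjust(method="ipw", model="sklearn")

    # Custom classifier: any estimator exposing fit and predict_proba
    # (assumption: adjust() passes the model keyword through to ipw()).
    forest = RandomForestClassifier(
        n_estimators=200, min_samples_leaf=20, random_state=0
    )
    adjusted_forest = sample_with_target.adjust(method="ipw", model=forest)
    return adjusted_default, adjusted_forest
```

Calling `run_example()` should return two adjusted samples, and comparing `summary()` on each is one way to see how the choice of classifier changes the fitted weights.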

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| balance/weighting_methods/ipw.py | Updated `ipw()` signature and logic to accept classifiers via `model` parameter, deprecated `sklearn_model`, improved error messages and documentation |
| tests/test_ipw.py | Added tests for new `model` parameter, conflicting arguments validation, and error handling |
| tutorials/balance_quickstart_ipw.ipynb | New tutorial demonstrating IPW with default logistic regression and custom RandomForestClassifier |
| CHANGELOG.md | Updated documentation to reflect the API change from `sklearn_model` to `model` parameter |

balance/weighting_methods/ipw.py

facebook-github-bot · 2025-11-26T14:47:34Z

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

talgalili

Looking good. Please see my comments

balance/weighting_methods/ipw.py

talgalili · 2025-11-26T14:46:35Z

balance/weighting_methods/ipw.py

+            Defaults to None.
+
+    Examples:
+        >>> from sklearn.ensemble import RandomForestClassifier


best to add an example that uses the simulated data.

Updated in 2a8d6d6

Thanks.
Note that the tutorials show the rendered output of each code cell, whereas the docstring examples are not executed on the website, so it's worth including the expected output here as well.
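
Because docstring examples are displayed as static text, one way to keep the written output from drifting is to run it through the standard-library `doctest` module. The sketch below shows the idea; `clip_weight` and `check_docstring_examples` are hypothetical stand-ins for illustration and are not part of balance.

```python
import doctest


def clip_weight(weight: float, upper: float = 20.0) -> float:
    """Cap a weight at ``upper`` (hypothetical helper for illustration).

    Examples:
        >>> clip_weight(3.5)
        3.5
        >>> clip_weight(42.0, upper=20.0)
        20.0
    """
    return min(weight, upper)


def check_docstring_examples(obj) -> doctest.TestResults:
    """Run the ``>>>`` examples in ``obj``'s docstring and count failures."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # The examples only need the object itself in their namespace.
    for test in finder.find(obj, name=obj.__name__, globs={obj.__name__: obj}):
        runner.run(test)
    return runner.summarize(verbose=False)


results = check_docstring_examples(clip_weight)
```

If the output written in the docstring stops matching what the code returns, `results.failed` becomes non-zero.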

balance/weighting_methods/ipw.py

facebook-github-bot · 2025-11-26T15:36:25Z

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2025-11-26T15:39:30Z

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

CHANGELOG.md

facebook-github-bot · 2025-11-26T15:54:00Z

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Copilot · 2025-11-26T17:29:53Z

balance/weighting_methods/ipw.py

-            logger.warning(
-                "penalty_factor is ignored when using a custom sklearn_model."
-            )
+            logger.warning("penalty_factor is ignored when using a custom model.")


The warning message should clarify that penalty_factor is only supported for the default logistic regression. Consider: 'penalty_factor is only supported with the default logistic regression model and will be ignored when using a custom classifier.'

Suggested change

logger.warning("penalty_factor is ignored when using a custom model.")

logger.warning("penalty_factor is only supported with the default logistic regression model and will be ignored when using a custom classifier.")
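
A small sketch of the behaviour this warning describes, assuming `penalty_factor` is only meaningful when no custom classifier is supplied. The helper name `effective_penalty_factor` and the logger name are hypothetical. The real `ipw()` handles this inline.

```python
import logging

logger = logging.getLogger("ipw_sketch")  # hypothetical logger name


def effective_penalty_factor(penalty_factor, custom_model):
    """Return the penalty_factor to apply, or None when it is ignored."""
    if custom_model is None:  # using_default_logistic
        return penalty_factor
    if penalty_factor is not None:
        # Warn only when the caller actually passed a value that gets dropped.
        logger.warning(
            "penalty_factor is only supported with the default logistic "
            "regression model and will be ignored when using a custom classifier."
        )
    return None
```

Warning only when a value was actually passed keeps the log quiet for callers who never touch `penalty_factor`.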

Copilot · 2025-11-26T17:29:53Z

balance/weighting_methods/ipw.py

+        model_name = model
+    else:
+        raise TypeError(
+            "model must be 'sklearn', an sklearn classifier implementing predict_proba, or None"


The error message mentions None as a valid option, but None is effectively treated as 'sklearn'. Consider clarifying: 'model must be "sklearn" (string), an sklearn classifier implementing predict_proba, or None (defaults to logistic regression)'

Suggested change

"model must be 'sklearn', an sklearn classifier implementing predict_proba, or None"

"model must be 'sklearn' (string), an sklearn classifier implementing predict_proba, or None (defaults to logistic regression)"

Copilot · 2025-11-26T17:29:54Z

balance/weighting_methods/ipw.py

        if not hasattr(custom_model, "predict_proba"):
            raise ValueError(
-                "The provided sklearn_model must implement predict_proba for propensity estimation."
+                "The provided custom model must implement predict_proba for propensity estimation."


This error message could be more actionable. Consider: 'The provided custom model must implement the predict_proba method for propensity estimation. Ensure your classifier inherits from sklearn.base.ClassifierMixin and defines predict_proba.'

Suggested change

"The provided custom model must implement predict_proba for propensity estimation."

"The provided custom model must implement the predict_proba method for propensity estimation. "

"Ensure your classifier inherits from sklearn.base.ClassifierMixin and defines predict_proba."

Copilot · 2025-11-26T17:29:54Z

tests/test_ipw.py

+    def test_ipw_supports_custom_model_parameter(self) -> None:
+        """The ``model`` parameter accepts sklearn classifiers directly."""


This test duplicates the coverage already provided by test_ipw_supports_custom_sklearn_model. Consider removing this test or expanding it to verify distinct behavior not covered by the existing test, such as testing with a different classifier or edge case.
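
One way to keep the tests from overlapping is to give each branch of the `model` handling its own case. The sketch below is an assumption-laden illustration: `resolve_model`, `FitOnly`, and `TinyClassifier` are hypothetical names, and the branch order is a guess based on the diff hunks quoted in this review, not the actual code in `ipw.py`.

```python
import io
import unittest
from typing import Any, Optional, Tuple


def resolve_model(model: Any) -> Tuple[str, Optional[Any]]:
    """Return (model_name, custom_model) for an ipw-style ``model`` argument."""
    if model is None:
        # None is treated like "sklearn": default logistic regression.
        return "sklearn", None
    if isinstance(model, str):
        # String identifiers pass through; later code validates the name.
        return model, None
    if hasattr(model, "fit"):
        # Looks like an estimator; it must also expose class probabilities.
        if not hasattr(model, "predict_proba"):
            raise ValueError(
                "The provided custom model must implement the predict_proba "
                "method for propensity estimation."
            )
        return "sklearn", model
    raise TypeError(
        "model must be 'sklearn' (string), an sklearn classifier implementing "
        "predict_proba, or None (defaults to logistic regression)"
    )


class FitOnly:
    """Stand-in estimator that lacks predict_proba."""

    def fit(self, X, y):
        return self


class TinyClassifier(FitOnly):
    """Stand-in classifier exposing predict_proba."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


class ResolveModelTests(unittest.TestCase):
    def test_none_and_string_use_default(self):
        self.assertEqual(resolve_model(None), ("sklearn", None))
        self.assertEqual(resolve_model("sklearn"), ("sklearn", None))

    def test_classifier_is_returned_as_custom_model(self):
        clf = TinyClassifier()
        self.assertEqual(resolve_model(clf), ("sklearn", clf))

    def test_estimator_without_predict_proba_is_rejected(self):
        with self.assertRaises(ValueError):
            resolve_model(FitOnly())

    def test_unsupported_type_is_rejected(self):
        with self.assertRaises(TypeError):
            resolve_model(42)


# Run quietly so the sketch can be executed as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ResolveModelTests)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Each test targets a different outcome (default, custom classifier, `ValueError`, `TypeError`), so none of them repeats another's coverage.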

talgalili

Great update. I made a bunch of comments - please review.

talgalili · 2025-11-26T17:30:42Z

balance/weighting_methods/ipw.py

+            Defaults to None.
+
+    Examples:
+        >>> from sklearn.ensemble import RandomForestClassifier


Thanks.
Note that the tutorials show the rendered output of each code cell, whereas the docstring examples are not executed on the website, so it's worth including the expected output here as well.

talgalili · 2025-11-26T17:33:08Z

balance/weighting_methods/ipw.py

-    using_default_logistic = sklearn_model is None
+    using_default_logistic = custom_model is None

    if using_default_logistic:


No need to keep this as a variable. Just use `custom_model is None` directly in the condition and add the comment `# using_default_logistic` next to it.

talgalili · 2025-11-26T17:34:43Z

balance/cli.py

        return self.args.weight_trimming_mean_ratio

-    def logistic_regression_kwargs(self) -> Dict[str, Any] | None:
-        raw_kwargs = self.args.ipw_logistic_regression_kwargs


Oh no, please don't remove this. It's a great addition!
Just change it so that it works: use ipw_logistic_regression_kwargs as the input for building a LogisticRegression model inside this function.
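
A rough sketch of what this request could look like, assuming the CLI flag carries a JSON object (the format is not shown in this thread). The helper names `parse_logistic_regression_kwargs` and `build_logistic_regression` are hypothetical, and the real method in `balance/cli.py` may be organised differently.

```python
import argparse
import json
from typing import Any, Dict, Optional


def parse_logistic_regression_kwargs(raw: Optional[str]) -> Optional[Dict[str, Any]]:
    """Decode the JSON string given on the command line into a kwargs dict."""
    if raw is None:
        return None
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as err:
        raise ValueError(
            f"--ipw_logistic_regression_kwargs is not valid JSON: {err}"
        ) from err
    if not isinstance(parsed, dict):
        raise ValueError("--ipw_logistic_regression_kwargs must be a JSON object")
    return parsed


def build_logistic_regression(kwargs: Optional[Dict[str, Any]]):
    """Return a LogisticRegression for ipw(model=...), or None for the default."""
    if kwargs is None:
        return None
    # Imported lazily so parsing can be used without scikit-learn installed.
    from sklearn.linear_model import LogisticRegression

    return LogisticRegression(**kwargs)


parser = argparse.ArgumentParser()
parser.add_argument("--ipw_logistic_regression_kwargs", default=None)
args = parser.parse_args(
    ["--ipw_logistic_regression_kwargs", '{"C": 0.5, "max_iter": 500}']
)
kwargs = parse_logistic_regression_kwargs(args.ipw_logistic_regression_kwargs)
```

With this split, the CLI keeps its flag while `ipw()` only ever sees a ready-made classifier through `model`.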

talgalili · 2025-11-26T17:36:10Z

tutorials/balance_quickstart_ipw.ipynb

@@ -0,0 +1,495 @@
+{


Since tutorials/balance_quickstart_ipw.ipynb is identical to tutorials/balance_quickstart.ipynb, just with a few more examples, I suggest you add those examples to tutorials/balance_quickstart.ipynb instead.

talgalili · 2025-11-26T17:36:58Z

CHANGELOG.md

-  - Added `logistic_regression_kwargs` parameter to `ipw()` for customizing
-    sklearn LogisticRegression settings
-    ([#138](https://github.com/facebookresearch/balance/pull/138)).
-  - CLI now supports `--ipw_logistic_regression_kwargs` for passing custom


As written above - I think keeping this in the CLI is a good idea.

facebook-github-bot · 2025-11-26T20:00:00Z

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2025-11-26T20:11:23Z

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2025-11-26T20:22:57Z

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

balance/weighting_methods/ipw.py

balance/cli.py

tests/test_ipw.py

talgalili

Copilot's suggestions are good - please address them.
There are some linter issues from meta, but these are not things for you to fix (they deal with internal files). So I'll deal with that before landing.

Since there are no github workflows yet, I'll also update if there are any test failures (but I'll know only a bit later once it finishes running internally).

p.s.: thanks for all the commits, very cool work (and I have more ideas for stuff to do moving forward - but let's close 0.13.0 first)

facebook-github-bot · 2025-11-27T06:48:52Z

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

talgalili

FYI: this work seems good, THANKS @neuralsorcerer!

I'll pull this and make some changes (some internal linter issues, and also I'll update the examples to work).

Since this is a 'heavy' PR, it might take my other metamates friends a few days to review/accept/land.

neuralsorcerer · 2025-11-27T09:39:52Z

Thank you for all the help @talgalili :)

talgalili · 2025-11-27T22:14:03Z

FYI:
@neuralsorcerer due to Thanksgiving, my other friends/colleagues are away.
This diff will be reviewed by them next week.
Have a great weekend.

neuralsorcerer · 2025-11-28T04:05:33Z

Happy weekend bro :)

meta-codesync · 2025-12-01T19:10:14Z

@talgalili merged this pull request in 4e22220.

neuralsorcerer added 2 commits November 26, 2025 19:20

Allow custom classifiers via ipw model parameter

8444951

Add examples

946a710

meta-cla bot added the cla signed label Nov 26, 2025

Add to changelog about tutorial

919c2ba

talgalili requested a review from Copilot November 26, 2025 14:43

Copilot AI reviewed Nov 26, 2025

balance/weighting_methods/ipw.py (outdated)

balance/weighting_methods/ipw.py (outdated)

Update per suggestions

bc391c8

talgalili requested changes Nov 26, 2025

Update as per suggestions

2a8d6d6

neuralsorcerer requested a review from talgalili November 26, 2025 15:37

Update as per suggestions

8791583

neuralsorcerer marked this pull request as draft November 26, 2025 15:44

talgalili requested changes Nov 26, 2025

CHANGELOG.md (outdated)

CHANGELOG.md (outdated)

Remove IPW logreg CLI and update changelog

2dd774f

neuralsorcerer marked this pull request as ready for review November 26, 2025 15:54

neuralsorcerer requested a review from talgalili November 26, 2025 15:54

talgalili requested a review from Copilot November 26, 2025 17:29

Copilot AI reviewed Nov 26, 2025

talgalili requested changes Nov 26, 2025

Restore IPW CLI logistic customization

c5e59cb

Resolve conflicts

4125094

Merge branch 'main' into custipw

c1b7f52

neuralsorcerer requested a review from talgalili November 26, 2025 20:25

talgalili requested a review from Copilot November 26, 2025 20:30

Copilot AI reviewed Nov 26, 2025

balance/weighting_methods/ipw.py

balance/cli.py (outdated)

tests/test_ipw.py

talgalili requested changes Nov 26, 2025

Update as per suggestions

48723f3

neuralsorcerer requested a review from talgalili November 27, 2025 06:49

talgalili approved these changes Nov 27, 2025

talgalili mentioned this pull request Dec 1, 2025

[FEATURE] balance 0.13.0 - missing steps #175

Closed

meta-codesync bot closed this in 4e22220 Dec 1, 2025

facebook-github-bot added the Merged label Dec 1, 2025

neuralsorcerer deleted the custipw branch December 2, 2025 07:46

talgalili mentioned this pull request Dec 2, 2025

[FEATURE] Support more models from sklearn (other than logistic regression) #139

Closed
