Conversation

@neuralsorcerer
Collaborator

@meta-cla meta-cla bot added the cla signed label Nov 26, 2025
@talgalili talgalili requested a review from Copilot November 26, 2025 14:43
@meta-codesync

meta-codesync bot commented Nov 26, 2025

@talgalili has imported this pull request. If you are a Meta employee, you can view this in D87926504.

Copilot AI left a comment

Pull request overview

This PR enhances the ipw() function to accept custom sklearn classifiers directly via the model parameter, replacing the previous sklearn_model parameter. Users can now pass any sklearn classifier implementing fit and predict_proba (e.g., RandomForestClassifier) or use the default logistic regression by specifying model="sklearn".

Key Changes:

  • The model parameter now accepts sklearn classifiers in addition to string identifiers
  • The sklearn_model parameter is deprecated in favor of model
  • Added comprehensive test coverage for the new parameter behavior
  • Created a new tutorial notebook demonstrating both default and custom classifier usage
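
The parameter handling described above (a classifier passed through model, with sklearn_model kept only as a deprecated alias) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the helper name `pick_custom_model`, the exception types, and the use of `DeprecationWarning` are all hypothetical.

```python
import warnings
from typing import Any, Optional


def pick_custom_model(model: Any = "sklearn", sklearn_model: Optional[Any] = None) -> Optional[Any]:
    """Return the custom classifier to fit, or None for the default logistic regression.

    Hypothetical helper for illustration only; it is not part of balance.
    """
    if isinstance(model, str) and model != "sklearn":
        # Assumption: unknown string identifiers are rejected outright.
        raise ValueError(f"Unsupported model identifier: {model!r}")

    # Anything that is neither None nor a string is treated as a classifier object.
    model_is_classifier = model is not None and not isinstance(model, str)

    if sklearn_model is not None:
        if model_is_classifier:
            # Conflicting arguments: the same thing was supplied twice.
            raise ValueError("Pass the classifier through `model` only, not also through `sklearn_model`.")
        # Assumption: the deprecated alias still works but emits a warning.
        warnings.warn(
            "`sklearn_model` is deprecated; pass the classifier through `model`.",
            DeprecationWarning,
            stacklevel=2,
        )
        return sklearn_model

    return model if model_is_classifier else None
```

With scikit-learn installed, a call such as `pick_custom_model(RandomForestClassifier())` would return the forest unchanged, while `pick_custom_model("sklearn")` and `pick_custom_model(None)` would both return None to signal the default logistic regression.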

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| balance/weighting_methods/ipw.py | Updated ipw() signature and logic to accept classifiers via model parameter, deprecated sklearn_model, improved error messages and documentation |
| tests/test_ipw.py | Added tests for new model parameter, conflicting arguments validation, and error handling |
| tutorials/balance_quickstart_ipw.ipynb | New tutorial demonstrating IPW with default logistic regression and custom RandomForestClassifier |
| CHANGELOG.md | Updated documentation to reflect the API change from sklearn_model to model parameter |

@facebook-github-bot
Contributor

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

Contributor

@talgalili talgalili left a comment

Looking good. Please see my comments

Defaults to None.
Examples:
>>> from sklearn.ensemble import RandomForestClassifier
Contributor

best to add an example that uses the simulated data.

Collaborator Author

Updated in 2a8d6d6

Contributor

Thanks.
Note that in the tutorials the rendered output of each code cell is shown, but docstring examples are not executed on the website, so it's worth including the expected output here as well.
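
One way to follow this suggestion is to write each docstring example as a doctest, with the expected output directly under the call. The sketch below is a hypothetical illustration: `odds_weight` is not a function in balance, and the odds transform it applies is only an assumed stand-in for an IPW-style weight.

```python
def odds_weight(p: float) -> float:
    """Convert a propensity score into an odds-style weight, (1 - p) / p.

    Hypothetical helper used only to show a docstring example that includes output.

    Examples:
        >>> odds_weight(0.5)
        1.0
        >>> odds_weight(0.25)
        3.0
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return (1.0 - p) / p


if __name__ == "__main__":
    import doctest

    # Re-runs the docstring examples and reports any mismatch with the shown output.
    doctest.testmod()
```

Because the output is part of the docstring, readers of the rendered documentation see the result even though nothing is executed on the website, and running the file checks that the shown output is still accurate.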

@facebook-github-bot
Contributor

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

@neuralsorcerer neuralsorcerer marked this pull request as draft November 26, 2025 15:44
@facebook-github-bot
Contributor

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

@neuralsorcerer neuralsorcerer marked this pull request as ready for review November 26, 2025 15:54
@talgalili talgalili requested a review from Copilot November 26, 2025 17:29

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.


-     logger.warning(
-         "penalty_factor is ignored when using a custom sklearn_model."
-     )
+     logger.warning("penalty_factor is ignored when using a custom model.")
Copilot AI Nov 26, 2025

The warning message should clarify that penalty_factor is only supported for the default logistic regression. Consider: 'penalty_factor is only supported with the default logistic regression model and will be ignored when using a custom classifier.'

Suggested change
- logger.warning("penalty_factor is ignored when using a custom model.")
+ logger.warning("penalty_factor is only supported with the default logistic regression model and will be ignored when using a custom classifier.")
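
For context, the behavior this warning describes could look roughly like the sketch below. It is a minimal illustration, assuming a hypothetical helper `effective_penalty_factor` and logger name; the actual control flow inside ipw() is not shown in this thread.

```python
import logging
from typing import Any, List, Optional

logger = logging.getLogger("ipw_sketch")  # hypothetical logger name


def effective_penalty_factor(
    custom_model: Optional[Any], penalty_factor: Optional[List[float]]
) -> Optional[List[float]]:
    """Return the penalty factor that will actually be applied.

    Hypothetical helper: with a custom classifier the penalty factor is
    dropped and a warning is logged; otherwise it is passed through.
    """
    if custom_model is not None and penalty_factor is not None:
        logger.warning(
            "penalty_factor is only supported with the default logistic regression "
            "model and will be ignored when using a custom classifier."
        )
        return None
    return penalty_factor
```

The warning only fires when both a custom classifier and a penalty factor are supplied, so users of the default model never see it.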

    model_name = model
else:
    raise TypeError(
        "model must be 'sklearn', an sklearn classifier implementing predict_proba, or None"
Copilot AI Nov 26, 2025

The error message mentions None as a valid option, but None is effectively treated as 'sklearn'. Consider clarifying: 'model must be "sklearn" (string), an sklearn classifier implementing predict_proba, or None (defaults to logistic regression)'

Suggested change
- "model must be 'sklearn', an sklearn classifier implementing predict_proba, or None"
+ "model must be 'sklearn' (string), an sklearn classifier implementing predict_proba, or None (defaults to logistic regression)"

  if not hasattr(custom_model, "predict_proba"):
      raise ValueError(
-         "The provided sklearn_model must implement predict_proba for propensity estimation."
+         "The provided custom model must implement predict_proba for propensity estimation."
Copilot AI Nov 26, 2025

This error message could be more actionable. Consider: 'The provided custom model must implement the predict_proba method for propensity estimation. Ensure your classifier inherits from sklearn.base.ClassifierMixin and defines predict_proba.'

Suggested change
- "The provided custom model must implement predict_proba for propensity estimation."
+ "The provided custom model must implement the predict_proba method for propensity estimation. "
+ "Ensure your classifier inherits from sklearn.base.ClassifierMixin and defines predict_proba."

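
To make the predict_proba requirement concrete, here is a small self-contained sketch of a duck-typed check like the one in the snippet above. The two classifier classes and the helper name `require_predict_proba` are hypothetical stand-ins so the example runs without scikit-learn. Note that an attribute check of this kind only looks for the method itself; inheritance from a particular base class does not affect it.

```python
from typing import Any, List, Sequence


def require_predict_proba(custom_model: Any) -> Any:
    """Raise ValueError unless the object exposes a callable predict_proba."""
    if not callable(getattr(custom_model, "predict_proba", None)):
        raise ValueError(
            "The provided custom model must implement the predict_proba method "
            "for propensity estimation."
        )
    return custom_model


class ConstantClassifier:
    """Hypothetical classifier that passes the check."""

    def fit(self, X: Sequence[Any], y: Sequence[int]) -> "ConstantClassifier":
        return self

    def predict_proba(self, X: Sequence[Any]) -> List[List[float]]:
        # Always returns equal class probabilities, one row per input.
        return [[0.5, 0.5] for _ in X]


class MarginOnlyClassifier:
    """Hypothetical classifier that exposes scores but no probabilities."""

    def fit(self, X: Sequence[Any], y: Sequence[int]) -> "MarginOnlyClassifier":
        return self

    def decision_function(self, X: Sequence[Any]) -> List[float]:
        return [0.0 for _ in X]
```

Here `require_predict_proba(ConstantClassifier())` returns the model, while `require_predict_proba(MarginOnlyClassifier())` raises ValueError, which is the situation the error message is meant to explain.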
Comment on lines +262 to +263
def test_ipw_supports_custom_model_parameter(self) -> None:
"""The ``model`` parameter accepts sklearn classifiers directly."""
Copilot AI Nov 26, 2025

This test duplicates the coverage already provided by test_ipw_supports_custom_sklearn_model. Consider removing this test or expanding it to verify distinct behavior not covered by the existing test, such as testing with a different classifier or edge case.

Contributor

@talgalili talgalili left a comment

Great update. I made a bunch of comments - please review.

- using_default_logistic = sklearn_model is None
+ using_default_logistic = custom_model is None

  if using_default_logistic:
Contributor

No need to keep this as a variable. Just use `custom_model is None` in the condition and add the comment `# using_default_logistic`.

        return self.args.weight_trimming_mean_ratio

    def logistic_regression_kwargs(self) -> Dict[str, Any] | None:
        raw_kwargs = self.args.ipw_logistic_regression_kwargs
Contributor

Oh no, please don't remove this. It's a great addition!
Just change it so that it works: use ipw_logistic_regression_kwargs as the input for training a LogisticRegression model inside this function.
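
A rough sketch of what this request could involve is shown below. It assumes, without confirmation from this thread, that the CLI value arrives as a JSON string; the helper name `parse_logistic_regression_kwargs` is hypothetical, and the final step of building the model is only indicated in a comment so the example runs without scikit-learn.

```python
import json
from typing import Any, Dict, Optional


def parse_logistic_regression_kwargs(raw: Optional[str]) -> Optional[Dict[str, Any]]:
    """Decode a CLI string into keyword arguments for a logistic regression.

    Hypothetical helper. Assumption: the flag carries a JSON object such as
    '{"C": 0.5, "max_iter": 200}'. Returns None when nothing was supplied.
    """
    if raw is None or raw.strip() == "":
        return None
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("ipw_logistic_regression_kwargs must decode to a JSON object")
    return parsed


# With scikit-learn available, the decoded dict could then be used as:
#     LogisticRegression(**(kwargs or {}))
```

Keeping the parsing separate from model construction means the same validated dictionary can be used whether the model is built inside the CLI method or passed further down.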

@@ -0,0 +1,495 @@
{
Contributor

Since tutorials/balance_quickstart_ipw.ipynb is identical to tutorials/balance_quickstart.ipynb, just with a few more examples, I suggest you just add them to tutorials/balance_quickstart.ipynb.

- Added `logistic_regression_kwargs` parameter to `ipw()` for customizing
sklearn LogisticRegression settings
([#138](https://github.com/facebookresearch/balance/pull/138)).
- CLI now supports `--ipw_logistic_regression_kwargs` for passing custom
Contributor

As written above - I think keeping this in the CLI is a good idea.

@facebook-github-bot
Contributor

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.


Contributor

@talgalili talgalili left a comment

The Copilot suggestions are good - please address them.
There are some linter issues from meta, but these are not things for you to fix (they deal with internal files). So I'll deal with that before landing.

Since there are no github workflows yet, I'll also update if there are any test failures (but I'll know only a bit later once it finishes running internally).

p.s.: thanks for all the commits, very cool work (and I have more ideas for stuff to do moving forward - but let's close 0.13.0 first)

@facebook-github-bot
Contributor

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

Contributor

@talgalili talgalili left a comment

FYI: this work looks good, THANKS @neuralsorcerer!

I'll pull this and make some changes (some internal linter issues, and also I'll update the examples to work).

Since this is a 'heavy' PR, it might take my fellow metamates a few days to review/accept/land.

@neuralsorcerer
Collaborator Author

Thank you for all the help @talgalili :)

@talgalili
Contributor

FYI:
@neuralsorcerer due to Thanksgiving, my other friends/colleagues are away.
This diff will be reviewed by them next week.
Have a great weekend.

@neuralsorcerer
Collaborator Author

Happy weekend bro :)

@meta-codesync

meta-codesync bot commented Dec 1, 2025

@talgalili merged this pull request in 4e22220.

3 participants