🔍 Issue Description
Fix numerical instability in LogitLink.mu causing NaN predictions and model training failure for large linear predictors
📌 Issue Type
- Bug
📝 Description
In pyGAM, when using LogisticGAM, or any GAM configured with a BinomialDist and a LogitLink, the inverse link function LogitLink.mu is highly susceptible to numeric overflow: it directly evaluates elp = np.exp(lp) and computes mu = dist.levels * elp / (elp + 1).
What is happening?
If the linear predictor lp becomes large (e.g., > 709, which easily occurs during transient PIRLS steps or with highly separable data), np.exp(lp) overflows to inf. Subsequently, inf / (inf + 1) evaluates to NaN (raising an invalid value RuntimeWarning). These NaNs propagate into the expected-values vector mu, corrupting the working response and weights in the optimization loop, which frequently ends in a pygam.utils.OptimizationError or silently returns degraded NaN predictions.
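A minimal standalone reproduction of the failure mode (not pyGAM's actual code path, just the same arithmetic):

```python
import numpy as np

lp = np.array([710.0])  # beyond the float64 exp overflow threshold (~709.78)
with np.errstate(over="ignore", invalid="ignore"):
    elp = np.exp(lp)            # overflows to inf
    mu = 1.0 * elp / (elp + 1)  # inf / inf -> nan

print(elp)  # [inf]
print(mu)   # [nan]
```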
What should happen instead?
The inverse link function (the standard logistic sigmoid) should be computed in a numerically stable way, safely evaluating to dist.levels (1.0 for standard probabilities) for large inputs without blowing up to NaN.
Why is this needed?
Without this numerical stability, predicting probabilities on extreme feature values or fitting models on well-separated datasets crashes the application or yields an un-fittable model. This is a critical stability blocker for classification tasks in pyGAM.
🎯 Proposed Solution (Optional but Encouraged)
In pygam/links.py, modify the LogitLink.mu method to use scipy.special.expit which is implemented for numerical stability:
```python
from scipy.special import expit

class LogitLink(Link):
    # ...
    def mu(self, lp, dist):
        # expit is a numerically stable logistic sigmoid
        return dist.levels * expit(lp)
```
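For reference, a quick standalone check that expit saturates cleanly at extreme inputs instead of overflowing:

```python
import numpy as np
from scipy.special import expit

lp = np.array([-800.0, 0.0, 800.0])
# endpoints saturate to exactly 0.0 and 1.0 with no RuntimeWarning
print(expit(lp))
```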
Alternatively, using a stable numpy equivalent:
```python
def mu(self, lp, dist):
    # note: np.where evaluates both branches for every element, so
    # overflow warnings can still fire even though the selected values
    # are correct; each branch is safe on the side it is selected for
    return dist.levels * np.where(
        lp >= 0,
        1.0 / (1.0 + np.exp(-lp)),
        np.exp(lp) / (1.0 + np.exp(lp)),
    )
```
Relevant modules/files: pygam/links.py (specifically LogitLink.mu)
Potential edge cases:
Inputs that push mu to exactly 0 or dist.levels (within machine epsilon) can break subsequent .gradient() evaluations, which may require soft-clipping mu away from the boundaries.
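A sketch of that soft-clipping (clip_mu and EPS are hypothetical names for illustration, not existing pyGAM API):

```python
import numpy as np

EPS = np.finfo(float).eps  # machine epsilon for float64

def clip_mu(mu, levels=1.0):
    # keep mu strictly inside (0, levels) so downstream gradient terms
    # such as 1 / (mu * (levels - mu)) stay finite
    return np.clip(mu, EPS, levels - EPS)
```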
📎 Additional Context
This bug significantly impacts LogisticGAM robustness compared to the Scikit-Learn and Statsmodels logistic implementations, where developers expect standard numerical safeguards for the sigmoid function.
🙋 Claiming This Issue
To avoid duplicated work:
- I'm willing to solve this issue by myself