
[BUG] Numerical instability and NaN predictions in LogitLink.mu #534

@hritikkumarpradhan

Description


🔍 Issue Description

Fix numerical instability in LogitLink.mu causing NaN predictions and model training failure for large linear predictors

📌 Issue Type

  • Bug

📝 Description

In pyGAM, when using LogisticGAM, or any GAM configured with a BinomialDist and a LogitLink, the inverse link function LogitLink.mu is highly susceptible to numeric overflow: it directly evaluates elp = np.exp(lp) and computes mu = dist.levels * elp / (elp + 1).

What is happening?

If the linear predictor lp grows large (e.g., > ~709, the float64 limit of np.exp, which is easily exceeded during transient PIRLS steps or with highly separable data), np.exp(lp) overflows to inf. The subsequent division inf / (inf + 1) evaluates to NaN (raising an invalid value RuntimeWarning). These NaNs propagate into the expected values vector mu, corrupting the working response and weights of the optimization loop, and the fit typically ends in a pygam.utils.OptimizationError or silently returns NaN predictions.
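A minimal reproduction of the overflow (pure NumPy, independent of pyGAM) shows the NaN:

```python
import numpy as np

# Linear predictors, including one past the float64 exp limit (~709.78)
lp = np.array([-800.0, 0.0, 800.0])

with np.errstate(over="ignore", invalid="ignore"):
    elp = np.exp(lp)        # exp(800) overflows to inf
    mu = elp / (elp + 1.0)  # inf / (inf + 1) -> nan

print(mu)  # -> [0.  0.5 nan]
```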

What should happen instead?

The inverse link function (the standard logistic sigmoid) should be computed in a numerically stable way, safely evaluating to dist.levels (1.0 for standard probabilities) for large inputs without blowing up to NaN.

Why is this needed?

Without this numerical stability, predicting probabilities on extreme feature values or fitting models on well-separated datasets crashes the application or yields an un-fittable model. This is a critical stability blocker for classification tasks in pyGAM.

🎯 Proposed Solution (Optional but Encouraged)

In pygam/links.py, modify the LogitLink.mu method to use scipy.special.expit, which is implemented with numerical stability in mind:

```python
from scipy.special import expit

class LogitLink(Link):
    # ...
    def mu(self, lp, dist):
        return dist.levels * expit(lp)
```

Alternatively, using a stable NumPy equivalent (note that np.where evaluates both branches, so an overflow RuntimeWarning may still fire even though the selected value is always finite):

```python
def mu(self, lp, dist):
    # Evaluate the sigmoid on the numerically safe side of zero
    return dist.levels * np.where(
        lp >= 0,
        1.0 / (1.0 + np.exp(-lp)),
        np.exp(lp) / (1.0 + np.exp(lp)),
    )
```
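A quick check (assuming dist.levels = 1.0 for standard probabilities) that both stable forms agree and stay finite on extreme inputs:

```python
import numpy as np
from scipy.special import expit

lp = np.array([-800.0, -30.0, 0.0, 30.0, 800.0])

# scipy version
mu_expit = expit(lp)

# numpy np.where version; errstate suppresses the harmless overflow
# warning from the branch that np.where discards
with np.errstate(over="ignore"):
    mu_where = np.where(lp >= 0,
                        1.0 / (1.0 + np.exp(-lp)),
                        np.exp(lp) / (1.0 + np.exp(lp)))

assert np.all(np.isfinite(mu_expit)) and np.all(np.isfinite(mu_where))
assert np.allclose(mu_expit, mu_where)
print(mu_expit)  # saturates to 0.0 and 1.0 at the extremes, no NaN
```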

Relevant modules/files: pygam/links.py (specifically LogitLink.mu)

Potential edge cases:
Inputs mapped to exactly mu = 0 or mu = dist.levels within machine epsilon can still break subsequent .gradient() evaluations (division by zero), so mu may additionally need soft-clipping away from the boundaries.
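A possible soft-clipping guard, as a sketch only: clipped_mu and the eps value are hypothetical, and whether to clip inside mu or in the callers is a design choice, not pyGAM's current API.

```python
import numpy as np
from scipy.special import expit

def clipped_mu(lp, levels=1.0, eps=1e-10):
    """Stable inverse logit, kept strictly inside (0, levels).

    Hypothetical helper: eps bounds mu away from the boundaries so
    later gradient/deviance terms like 1 / (mu * (levels - mu))
    stay finite even for extreme linear predictors.
    """
    mu = levels * expit(np.asarray(lp, dtype=float))
    return np.clip(mu, eps, levels - eps)

m = clipped_mu([-800.0, 0.0, 800.0])
print(m)  # strictly inside (0, 1), never exactly 0 or 1
```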

📎 Additional Context
This bug makes LogisticGAM noticeably less robust than the logistic implementations in scikit-learn or statsmodels, where users can rely on standard numerical safeguards for the sigmoid function.

🙋 Claiming This Issue

To avoid duplicated work:

  • I'm willing to solve this issue by myself
