[BUG] Passing 'groups' parameter to rapids_singlecell.tl.rank_genes_groups gives all 0 scores/NA LFCs

**Describe the bug**

When calling `rsc.tl.rank_genes_groups` with the `groups=[...]` parameter and `method='t-test'` or `method='wilcoxon'`, the resulting scores, p-values, and log fold changes are all zero or NaN (scores are okay for wilcoxon). Calling the same function on the same AnnData without the `groups=` parameter (computing all groups) returns correct, non-degenerate values for every group including the one that was previously broken. The `method='wilcoxon'` path is partially affected: scores compute correctly with `groups=`, but `pvals`, `pvals_adj`, and `logfoldchanges` are still all zero / NaN.

scanpy's `sc.tl.rank_genes_groups` on the identical AnnData with the same `groupby` and `groups=` arguments returns correct values, so the data itself is valid.

**Steps/Code to reproduce bug**

```python
import cupy as cp
import cupyx.scipy.sparse as cusp
import anndata as ad
import numpy as np
import pandas as pd
import rapids_singlecell as rsc
import scanpy as sc

n_cells, n_genes = 5000, 1000
X = cusp.random(n_cells, n_genes, density=0.1, dtype=cp.float32, format='csr')
obs = pd.DataFrame({
    'group': pd.Categorical(np.random.choice(list('ABCDE'), n_cells))
})
adata = ad.AnnData(X=X, obs=obs)
rsc.pp.normalize_total(adata)
rsc.pp.log1p(adata)

# Works: scores, pvals, lfc all populated correctly for every group
rsc.tl.rank_genes_groups(adata, groupby='group', method='t-test', use_raw=False)
print("All groups (no `groups=`):")
print(pd.DataFrame(adata.uns['rank_genes_groups']['scores']).head())
print(pd.DataFrame(adata.uns['rank_genes_groups']['pvals']).head())
print(pd.DataFrame(adata.uns['rank_genes_groups']['logfoldchanges']).head())

# Broken: scores all 0, pvals all 0, logfoldchanges all NaN
rsc.tl.rank_genes_groups(adata, groupby='group', groups=['A'],
                         method='t-test', use_raw=False)
print("\nWith `groups=['A']`:")
print(pd.DataFrame(adata.uns['rank_genes_groups']['scores']).head())
print(pd.DataFrame(adata.uns['rank_genes_groups']['pvals']).head())
print(pd.DataFrame(adata.uns['rank_genes_groups']['logfoldchanges']).head())

# scanpy's equivalent on the same data works fine for comparison
adata_cpu = adata.copy()
rsc.get.anndata_to_CPU(adata_cpu)
sc.tl.rank_genes_groups(adata_cpu, groupby='group', groups=['A'],
                        method='t-test', use_raw=False)
print("\nscanpy with `groups=['A']`:")
print(pd.DataFrame(adata_cpu.uns['rank_genes_groups']['scores']).head())
```

**Expected behavior**

Calling `rsc.tl.rank_genes_groups` with `groups=[X]` should produce the same scores, p-values, adjusted p-values, and log fold changes for group `X` as calling it without the `groups=` argument and indexing into the result for group `X`. Currently the two code paths disagree:

- Without `groups=`: correct values for all groups.
- With `groups=[X]`: scores 0, pvals 0, logfoldchanges NaN (t-test); scores correct but pvals/lfc still 0/NaN (wilcoxon).

**Environment details (please complete the following information):**

 - Environment location: Cloud (Lambda Labs / Lambda Cloud GPU VM)
 - Linux Distro/Architecture: Ubuntu 22.04 amd64
 - GPU Model/Driver: NVIDIA H100 PCIe/Driver Version: 580.105.08
 - CUDA: cu13
 - Method of Rapids install: pip
   - `pip install 'rapids-singlecell[rapids-cu12]' --extra-index-url=https://pypi.nvidia.com`
   - Also tried `pip install 'rapids-singlecell[rapids-cu13]' --extra-index-url=https://pypi.nvidia.com` which gives a similar output

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Passing 'groups' parameter to rapids_singlecell.tl.rank_genes_groups gives all 0 scores/NA LFCs #650

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[BUG] Passing 'groups' parameter to rapids_singlecell.tl.rank_genes_groups gives all 0 scores/NA LFCs #650

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions