
Bug: Cannot use boolean GCXS array for indexing another GCXS array #860

Open
@ilia-kats


sparse version checks

  • I checked that this issue has not been reported before in the list of issues.

  • I have confirmed this bug exists on the latest version of sparse.

  • I have confirmed this bug exists on the main branch of sparse.

Describe the bug

Indexing a GCXS array with a boolean GCXS array of the same shape fails. Getting values at the index raises IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices. Setting values at the index raises ValueError: <class 'int'> can be computed for one-element arrays only. The setting operation is required by the Dask nanvar implementation, so this currently prevents me from using sparse with Dask.

Steps or code to reproduce the bug

minimal reprex:

import sparse
import numpy as np
test = sparse.GCXS(np.random.rand(5,5), compressed_axes=(0,))

test[test < 0] # IndexError

test[test < 0] = np.nan # ValueError

Expected results

No error is thrown and the operation succeeds.
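
For reference, the expected semantics can be sketched with plain NumPy arrays, which already support boolean-mask getting and setting. This is only an illustration of the behaviour I would expect from GCXS, not a claim about how sparse should implement it. The array values and the names dense, mask, selected and updated are made up for the example. Note that the reprex above uses np.random.rand, whose values lie in [0, 1), so its mask is all False; the sketch uses hand-picked values so that the mask selects something.

```python
import numpy as np

# Dense stand-in for the GCXS array from the reprex (hand-picked values).
dense = np.array([[0.5, -1.0, 0.0],
                  [2.0, 0.0, -3.0]])

# Boolean mask with the same shape as `dense`.
mask = dense < 0

# Getting: returns a 1-D array of the values where the mask is True.
selected = dense[mask]

# Setting: only the masked positions change, everything else is untouched.
updated = dense.copy()
updated[mask] = np.nan

print(selected)
print(np.isnan(updated).sum())
```

With a GCXS array and a GCXS mask in place of dense and mask, I would expect the same two operations to succeed.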

Actual results

For getting values:

File /data/ilia/envs/famo/lib/python3.11/site-packages/sparse/numba_backend/_compressed/indexing.py:29, in getitem(x, key)
     26         return result
     27     return GCXS.from_coo(result)
---> 29 key = list(normalize_index(key, x.shape))
     31 # zip_longest so things like x[..., None] are picked up.
     32 if len(key) != 0 and all(isinstance(k, slice) and k == slice(0, dim, 1) for k, dim in zip_longest(key, x.shape)):

File /data/ilia/envs/famo/lib/python3.11/site-packages/sparse/numba_backend/_slicing.py:58, in normalize_index(idx, shape)
     56 for i, d in zip(idx, none_shape, strict=True):
     57     if d is not None:
---> 58         check_index(i, d)
     59 idx = tuple(map(sanitize_index, idx))
     60 idx = tuple(map(replace_none, idx, none_shape))

File /data/ilia/envs/famo/lib/python3.11/site-packages/sparse/numba_backend/_slicing.py:122, in check_index(ind, dimension)
    120     return
    121 elif not isinstance(ind, Integral):
--> 122     raise IndexError(
    123         "only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and "
    124         "integer or boolean arrays are valid indices"
    125     )
    127 elif ind >= dimension:
    128     raise IndexError(f"Index is not smaller than dimension {ind:d} >= {dimension:d}")

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

For setting values:

File /data/ilia/envs/famo/lib/python3.11/site-packages/sparse/numba_backend/_sparse_array.py:978, in SparseArray.__index__(self)
    976 def __index__(self):
    977     """ """
--> 978     return self._to_scalar(int)

File /data/ilia/envs/famo/lib/python3.11/site-packages/sparse/numba_backend/_sparse_array.py:986, in SparseArray._to_scalar(self, builtin)
    984 def _to_scalar(self, builtin):
    985     if self.size != 1 or self.shape != ():
--> 986         raise ValueError(f"{builtin} can be computed for one-element arrays only.")
    987     return builtin(self.todense().flatten()[0])

ValueError: <class 'int'> can be computed for one-element arrays only.

Please describe your system.

  1. OS and version: Debian 12
  2. sparse version 0.16.0
  3. NumPy version 2.0.2
  4. Numba version 0.60.0

Relevant log output
