
How to handle sparse matrices when f(0) returns not zero? #2597

@SteveBronder (Collaborator)

Description

I'm working on some of the testing stuff for sparse matrices and had a quick Q. For a sparse matrix, how should we handle functions where an input of 0 returns a nonzero value, so the result would be dense? i.e. for a sparse matrix X, calling acos(X), where acos(0) ≈ 1.57. Should the return be dense, or should we treat the non-filled cells as empty and ignore them, iterating only over the cells with stored values before the operation?
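To make the two candidate semantics concrete, here is a small sketch in Python with scipy.sparse (the actual library under discussion is Stan Math's C++; this is purely illustrative) of what acos on a sparse matrix could mean:

```python
import numpy as np
import scipy.sparse as sp

X = sp.csr_matrix(np.array([[0.5, 0.0, 0.0],
                            [0.0, 0.0, -0.5]]))

# Option A: apply acos only to the stored (structurally non-zero) values,
# ignoring the empty cells entirely.
A = X.copy()
A.data = np.arccos(A.data)
assert A.nnz == X.nnz  # sparsity pattern preserved

# Option B: treat unstored cells as true zeros; acos(0) = pi/2 != 0,
# so every cell of the result is non-zero and the result is dense.
B = np.arccos(X.toarray())
assert np.all(B != 0.0)
```

Option A preserves the sparsity pattern but silently computes a different function; Option B is mathematically the elementwise acos but destroys sparsity.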

Current Version:

v4.1.0

Activity

bob-carpenter (Member) commented on Oct 5, 2021

Yes, you need to treat the matrix as having 0 entries everywhere there isn't an explicit non-zero value. That means if we apply exp(), for example, the result is non-zero everywhere.

I think the bigger question is whether the result should be a sparse matrix or a dense matrix. It feels like sparse matrix makes more sense in that the vectorized return type is the same as the input type. But dense would be more efficient. My inclination is to have the result be sparse.

The right thing to do for efficiency here is compute acos(0) once and copy it rather than creating a gazillion new autodiff variables. That at least could be cooked into the definition.

I don't think we should try to get fancy and allow a non-zero value as the default for unspecified values.
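Bob's efficiency point above can be sketched as follows (a Python/scipy stand-in for the C++/autodiff case; `elementwise_dense` is a hypothetical helper, not Stan Math API): evaluate f once at 0 and use it as the shared fill value, rather than evaluating f(0) per cell.

```python
import numpy as np
import scipy.sparse as sp

def elementwise_dense(f, X):
    fill = f(0.0)                 # f(0) computed once, shared by all empty cells
    out = np.full(X.shape, fill)
    rows, cols = X.nonzero()      # canonical CSR: same order as X.data
    out[rows, cols] = f(X.data)   # f applied only to the stored values
    return out

X = sp.csr_matrix(np.array([[1.0, 0.0], [0.0, 2.0]]))
E = elementwise_dense(np.exp, X)  # exp(0) = 1, so no cell of E is zero
```

In the autodiff setting the payoff is larger than it looks here: the single f(0) evaluation replaces a gazillion identical autodiff variables.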

SteveBronder (Collaborator, Author) commented on Oct 5, 2021

imo I'd like a dense matrix, not only because it will be more efficient but also because it makes the API very explicit that zeroed values will become nonzero.

nhuurre (Collaborator) commented on Oct 5, 2021

bob-carpenter (Member) commented on Oct 5, 2021

@nhuurre: The problem is that when you apply exp() to a sparse matrix, the result is dense. The question is really whether we want to auto-convert to a dense result for efficiency. @SteveBronder is suggesting a signature like this:

matrix cos(sparse_matrix);

because cos(0) != 0. But because sin(0) = 0, presumably we'd have

sparse_matrix sin(sparse_matrix);

This inconsistency is what worries me about returning a dense matrix from density-inducing functions like exp or cos.
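The signature split under discussion can be seen concretely in a scipy.sparse sketch (again a Python stand-in, not Stan Math itself): sin(0) = 0, so sin can map sparse to sparse with the pattern unchanged, while cos(0) = 1 makes a cos result inherently dense.

```python
import numpy as np
import scipy.sparse as sp

X = sp.csr_matrix(np.array([[0.5, 0.0], [0.0, 0.0]]))

S = X.sin()              # sparse in, sparse out: sparsity pattern unchanged
assert sp.issparse(S) and S.nnz == X.nnz

C = np.cos(X.toarray())  # cos(0) = 1 fills every structurally-zero cell
assert np.count_nonzero(C) == C.size
```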

nhuurre (Collaborator) commented on Oct 5, 2021

bob-carpenter (Member) commented on Oct 5, 2021

> why would you want to apply exp() to a sparse matrix?

I have no idea where this would come up in practice, but if it did, it would be more efficient to have the argument be sparse even if the result is dense.

My main objection is that it's confusing to have varying return types for elementwise operations applied to sparse matrices.

SteveBronder (Collaborator, Author) commented on Oct 5, 2021

> why would you want to apply exp() to a sparse matrix?
>
> I have no idea where this would come up in practice, but if it did, it would be more efficient to have the argument be sparse even if the result is dense.
>
> My main objection is that it's confusing to have varying return types for elementwise operations applied to sparse matrices.

Maybe this would be a good thing to bring up in the Stan meeting. I agree it's weird, but I think it's weird only because the scheme itself is kind of weird. The alternative here is that we return a sparse_matrix[N, M] with every cell holding a value. My concern with that is that users are going to think, "Oh, I'm going to use a sparse matrix; it will be very fast since I have many zeros." Then they apply one of these non-zeroing functions and suddenly they're using ~3 times the amount of memory and not getting any speedup. imo returning a dense matrix is a red flag for the user: "This function will blow up your sparsity structure."
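The memory concern is easy to quantify in a scipy.sparse sketch (illustrative only; exact ratios in Stan Math's autodiff types will differ). A CSR matrix stores a value plus a column index per entry, plus row pointers, so once every cell is stored the "sparse" representation costs more than a plain dense array and dwarfs the original sparse footprint:

```python
import numpy as np
import scipy.sparse as sp

X = sp.random(100, 100, density=0.01, format="csr")  # ~100 stored entries
before = X.data.nbytes + X.indices.nbytes + X.indptr.nbytes

# A non-zero-preserving function fills in every cell but keeps CSR format.
F = sp.csr_matrix(np.arccos(X.toarray()))
after = F.data.nbytes + F.indices.nbytes + F.indptr.nbytes

dense = X.shape[0] * X.shape[1] * 8  # a plain float64 array of the same shape
assert after > dense                 # fully-stored CSR beats dense in size
assert after > 50 * before           # and vastly exceeds the original footprint
```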

bob-carpenter (Member) commented on Oct 6, 2021

I understand the reasons to return a dense matrix. Another alternative we should consider is to simply not allow functions where f(0) != 0 to apply elementwise to sparse matrices.
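This "just disallow f(0) != 0" option has precedent: scipy.sparse takes exactly that route, exposing elementwise methods only for zero-preserving functions (sin, tan, expm1, log1p, sqrt, ...) and providing no exp() or cos() method on its sparse types at all.

```python
import scipy.sparse as sp

X = sp.csr_matrix([[0.5, 0.0], [0.0, 0.0]])

# Zero-preserving functions are provided as methods...
assert hasattr(X, "sin") and hasattr(X, "expm1")
# ...while functions with f(0) != 0 are simply absent from the API.
assert not hasattr(X, "exp") and not hasattr(X, "cos")
```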

nhuurre (Collaborator) commented on Oct 6, 2021

SteveBronder (Collaborator, Author) commented on Oct 6, 2021

I actually posted this Q to the Eigen Discord server (a chat server where Eigen devs and users talk). They also said no to my dense-matrix idea, as it sounds too weird, so I'm fine with tossing that out.

> Another alternative we should consider is to simply not allow functions where f(0) != 0 to apply elementwise to sparse matrices.
>
> Yes, come to think of it, that would make most sense.

I'm okay with this, though it feels a little weird to have, say, atan() but not acos(). Also, what do we want to do for sparse_matrix<lower=0>[...], which requires using log()/exp()?

nhuurre (Collaborator) commented on Oct 7, 2021

> what do we want to do for sparse_matrix<lower=0>[...]

A dense return type is definitely wrong here! The constraining transform must apply only to nonzero elements.
lb_constrain() is not exp(); any relation between them is an implementation detail and can change.
Notice that the unconstrained type corresponding to a Cholesky factor matrix is a vector. I expect the sparse matrix type to likewise turn into a small vector when unconstrained, in which case the lower-bound constrain can be applied to its vector form.

Another question is whether we allow sparse_matrix<lower=1>[...] even though it seems to say that the "sparse" matrix cannot contain any zeros. Or maybe the bounds check should ignore structural zeros?
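The unconstrained-vector idea can be sketched like this (Python stand-in with a hypothetical `lb_constrain` helper; the real transform lives in Stan Math's C++, where the lower-bound transform is x = exp(y) + lb): if the unconstrained form of a sparse matrix is just the vector of its stored values, the constrain touches only that vector and the sparsity pattern is never disturbed.

```python
import numpy as np
import scipy.sparse as sp

def lb_constrain(y, lb):
    # Stan's lower-bound transform: x = exp(y) + lb
    return np.exp(y) + lb

pattern = sp.csr_matrix(np.array([[1.0, 0.0], [0.0, 1.0]]))  # fixed pattern
y = np.array([-0.5, 2.0])        # unconstrained values, one per stored cell

X = pattern.copy()
X.data = lb_constrain(y, lb=0.0) # constrain the stored values only
assert X.nnz == pattern.nnz      # structural zeros untouched
assert np.all(X.data > 0.0)      # constraint holds on all stored values
```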

bob-carpenter (Member) commented on Oct 7, 2021

Good points, @nhuurre.

I agree about exp() being an implementation detail for <lower = 0>. We were just talking about this today in the Stan meeting and it's come up in other issues.

All of our unconstrained types are dense vectors, and they'll presumably stay that way; they eventually have to be concatenated into the argument to the log density function.

I don't think sparse_matrix<lower = 1> makes sense, as the result isn't sparse. But it's a run-time issue to figure this out, because it could be sparse_matrix<lower = L> where L is declared as a data variable. Validating the constraint requires checking all non-zero elements and then making sure the bound holds for 0 if there are any structural zeros.
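That run-time validation rule can be sketched as follows (Python stand-in with a hypothetical `validate_lower` helper, not Stan Math API): check the bound on every stored value and, if the matrix has any structural zeros, also require that 0 itself satisfies the bound.

```python
import numpy as np
import scipy.sparse as sp

def validate_lower(X, L):
    if not np.all(X.data >= L):
        return False
    # If any cell is unstored, it is an implicit 0 and must also pass.
    has_structural_zeros = X.nnz < X.shape[0] * X.shape[1]
    return (0.0 >= L) if has_structural_zeros else True

X = sp.csr_matrix(np.array([[2.0, 0.0], [0.0, 3.0]]))
assert validate_lower(X, L=0.0)      # zeros are fine when L <= 0
assert not validate_lower(X, L=1.0)  # structural zeros violate lower = 1
```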

SteveBronder (Collaborator, Author) commented on Oct 7, 2021

I think for now in #2599 I'm going to go with Bob's route and return a sparse matrix of full N x M size.

> I don't think sparse_matrix<lower = 1> makes sense as the result isn't sparse. But it's a run-time issue to figure this out as it could be sparse_matrix<lower = L> where L is declared as a data variable. Validating the constraint requires validating for all non-zero elements and then making sure it holds for 0 if there is sparsity.

Maybe just not having constraints? The only one that makes sense to me is sparse_matrix<offset=other_sparse_mat, multiplier=other_sparse_mat2>

bob-carpenter (Member) commented on Oct 7, 2021

With no constraints, there's no problem with accidentally removing sparsity. Other constraints are consistent with sparsity, like <lower = -1, upper = 1>.

We'll eventually want something like a sparse Cholesky factor or a sparse covariance matrix, and then implement log determinants and multi-normal on the precision scale (where there's some hope of preserving sparsity).
