SteveBronder opened the issue:
I'm working on some of the testing work for sparse matrices and had a quick question. For a sparse matrix, how should we handle functions where an input of 0 returns a non-zero value, so the result is effectively dense? E.g., for a sparse matrix X, calling acos(X), where acos(0) returns ~1.57. Should the return be dense, or should we treat the non-filled cells as empty and ignore them, iterating only over the cells that hold values?
bob-carpenter commented on Oct 5, 2021:
Yes, you need to treat the matrix as having 0 entries everywhere there isn't an explicit non-zero value. That means if we apply exp(), for example, the result is non-zero everywhere.
I think the bigger question is whether the result should be a sparse matrix or a dense matrix. It feels like a sparse matrix makes more sense, in that the vectorized return type is the same as the input type. But dense would be more efficient. My inclination is to have the result be sparse.
The right thing to do for efficiency here is compute acos(0) once and copy it rather than creating a gazillion new autodiff variables. That at least could be cooked into the definition.
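A minimal sketch of that idea in Eigen with plain doubles (no autodiff variables); `apply_dense` and the lambda are hypothetical names for illustration, not existing Stan Math or Eigen API:

```cpp
#include <Eigen/Dense>
#include <Eigen/SparseCore>
#include <cmath>

// Evaluate f at zero exactly once, fill the dense result with that shared
// value, then overwrite only the stored (structurally non-zero) entries.
template <typename F>
Eigen::MatrixXd apply_dense(const Eigen::SparseMatrix<double>& x, F&& f) {
  const double f0 = f(0.0);  // computed once, not once per empty cell
  Eigen::MatrixXd out = Eigen::MatrixXd::Constant(x.rows(), x.cols(), f0);
  for (int k = 0; k < x.outerSize(); ++k)
    for (Eigen::SparseMatrix<double>::InnerIterator it(x, k); it; ++it)
      out(it.row(), it.col()) = f(it.value());
  return out;
}

// e.g. Eigen::MatrixXd y = apply_dense(x, [](double v) { return std::acos(v); });
```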
I don't think we should try to get fancy and allow a non-zero value as the default for unspecified values.
SteveBronder commented on Oct 5, 2021:
IMO I'd like a dense matrix, not only because it will be more efficient but because it makes the API very explicit that zeroed values will become non-zero.

nhuurre commented on Oct 5, 2021:
AFAIK sparse matrices are most useful when a dense representation would be inefficient. Automatically converting to a dense matrix seems like a footgun to me. What's the purpose of supporting such "sparsity-destroying" operations in the first place?
bob-carpenter commented on Oct 5, 2021:
@nhuurre: The problem is that when you apply exp() to a sparse matrix, the result is dense. The question is really whether we want to auto-convert to a dense result for efficiency. @SteveBronder is suggesting a signature like this:
matrix cos(sparse_matrix);
because cos(0) != 0. But because sin(0) = 0, presumably we'd have
sparse_matrix sin(sparse_matrix);
This inconsistency is what worries me about returning a dense matrix from density-inducing functions like exp() or cos().
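To make the inconsistency concrete, here is a sketch of why sin() can keep the sparse signature while cos() cannot (`sparse_sin` is a hypothetical helper, not actual Stan Math code):

```cpp
#include <Eigen/SparseCore>
#include <cmath>

// Because sin(0) == 0, it is safe to visit only the stored entries: the
// implicit zeros stay zero and the input's sparsity pattern is reused.
Eigen::SparseMatrix<double> sparse_sin(Eigen::SparseMatrix<double> x) {
  for (int k = 0; k < x.outerSize(); ++k)
    for (Eigen::SparseMatrix<double>::InnerIterator it(x, k); it; ++it)
      it.valueRef() = std::sin(it.value());
  return x;
}

// The same loop would be wrong for cos() or exp(): every implicit zero
// should become cos(0) = 1 (resp. exp(0) = 1), so the honest result type
// is a dense matrix -- hence the mixed signatures above.
```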
nhuurre commented on Oct 5, 2021:
> The problem is that when you apply exp() to a sparse matrix, the result is dense.
And my question is: if applying exp() to a sparse matrix defeats the point of using sparse matrices, why would you want to apply exp() to a sparse matrix?
bob-carpenter commented on Oct 5, 2021:
> why would you want to apply exp() to a sparse matrix?
I have no idea where this would come up in practice, but if it did, it would be more efficient to have the argument be sparse even if the result is dense.
My main objection is that it's confusing to have varying return types for elementwise operations applied to sparse matrices.
SteveBronder commented on Oct 5, 2021:
Maybe this would be a good thing to bring up in the Stan meeting. I agree it's weird, but I think it's weird only because the scheme itself is kind of weird. The alternative here is to return a sparse_matrix[N, M] with every cell holding a value. My concern with that is that users will think, "Oh, I'm going to use a sparse matrix; it will be very fast because I have many zeros," then apply one of these non-zero-preserving functions and suddenly be using ~3 times the memory without getting any speedup. IMO returning a dense matrix is a red flag for the user: "this function will blow up your sparsity structure."
bob-carpenter commented on Oct 6, 2021:
I understand the reasons to return a dense matrix. Another alternative we should consider is to simply not allow functions where f(0) != 0 to apply elementwise to sparse matrices.
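One way that restriction could look inside the math library, sketched under the assumption that all elementwise sparse functions funnel through a single hypothetical helper (not the actual implementation):

```cpp
#include <Eigen/SparseCore>
#include <cassert>

// Only zero-preserving functions (f(0) == 0) are allowed elementwise on a
// sparse matrix, so visiting just the stored entries is always correct.
template <typename F>
Eigen::SparseMatrix<double> elementwise(Eigen::SparseMatrix<double> x, F&& f) {
  assert(f(0.0) == 0.0 && "f(0) != 0 is not allowed on sparse matrices");
  for (int k = 0; k < x.outerSize(); ++k)
    for (Eigen::SparseMatrix<double>::InnerIterator it(x, k); it; ++it)
      it.valueRef() = f(it.value());
  return x;
}
```

In the Stan language itself the rule would be simpler still: signatures like cos(sparse_matrix) just would not exist, so the compiler rejects them outright.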
nhuurre commented on Oct 6, 2021:
> Another alternative we should consider is to simply not allow functions where f(0) != 0 to apply elementwise to sparse matrices.
Yes, come to think of it, that would make the most sense.

SteveBronder commented on Oct 6, 2021:
I actually posted this question to the Eigen Discord server (a chat server where Eigen devs and users talk). They also say no to my dense-matrix idea, as it sounds too weird, so I'm fine with tossing that out.
I'm okay with this, though it feels a little weird to have, say, atan() but not acos(). Also, what do we want to do for sparse_matrix<lower=0>[...], which requires using log()/exp()?
nhuurre commented on Oct 7, 2021:
> what do we want to do for sparse_matrix<lower=0>[...]
A dense return type is definitely wrong here! The constraining transform must apply only to non-zero elements. lb_constrain() is not exp(); any relation between them is an implementation detail and can change.
Notice that the unconstrained type corresponding to a Cholesky-factor matrix is a vector. I expect the sparse matrix type will also turn into a (smaller) vector when unconstrained, in which case the lower-bound constraint can be applied to its vector form.
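A sketch of that idea, assuming (hypothetically) that the sparsity pattern is fixed data and only the stored values travel through the unconstrained vector; the function name is made up for illustration:

```cpp
#include <Eigen/Dense>
#include <Eigen/SparseCore>
#include <cmath>

// Lower-bound-constrain only the stored values; structural zeros are part
// of the fixed pattern and are never touched by the transform.
Eigen::SparseMatrix<double> lb_constrain_sparse(
    const Eigen::VectorXd& y,             // unconstrained values, size nnz
    Eigen::SparseMatrix<double> pattern,  // fixed sparsity structure
    double lb) {
  pattern.makeCompressed();
  for (Eigen::Index n = 0; n < pattern.nonZeros(); ++n)
    pattern.valuePtr()[n] = lb + std::exp(y[n]);  // scalar lb_constrain
  return pattern;
}
```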
Another question: do we allow sparse_matrix<lower=1>[...], even though that seems to say the "sparse" matrix cannot contain any zeros? Or maybe the bounds check ignores structural zeros?
bob-carpenter commented on Oct 7, 2021:
Good points, @nhuurre.
I agree about exp() being an implementation detail for <lower = 0>. We were just talking about this today in the Stan meeting, and it's come up in other issues.
All of our unconstrained types are dense vectors, and they'll presumably stay that way: they eventually have to concatenate into the argument to the log density function.
I don't think sparse_matrix<lower = 1> makes sense, as the result isn't sparse. But it's a run-time issue to figure this out, because the declaration could be sparse_matrix<lower = L> where L is declared as a data variable. Validating the constraint requires checking every non-zero element and then making sure the bound also holds for 0 if there is any sparsity.
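That run-time validation could look roughly like this (a sketch only; Stan Math's real check_* utilities report errors differently):

```cpp
#include <Eigen/SparseCore>
#include <stdexcept>

// Validate <lower = lb> for a sparse matrix: every stored value must
// satisfy the bound, and if any structural zeros exist, the implicit
// value 0 must satisfy it too.
void check_lower_bound(const Eigen::SparseMatrix<double>& x, double lb) {
  for (int k = 0; k < x.outerSize(); ++k)
    for (Eigen::SparseMatrix<double>::InnerIterator it(x, k); it; ++it)
      if (it.value() < lb)
        throw std::domain_error("sparse matrix entry is below lower bound");
  const bool has_structural_zeros = x.nonZeros() < x.rows() * x.cols();
  if (has_structural_zeros && lb > 0.0)
    throw std::domain_error("structural zeros violate lower bound");
}
```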
SteveBronder commented on Oct 7, 2021:
I think for now in #2599 I'm going to go with Bob's route and return a sparse matrix of the full N x M size.
> I don't think sparse_matrix<lower = 1> makes sense, as the result isn't sparse. But it's a run-time issue to figure this out, because the declaration could be sparse_matrix<lower = L> where L is declared as a data variable.
Maybe just not having constraints? The only one that makes sense to me is sparse_matrix<offset=other_sparse_mat, multiplier=other_sparse_mat2>.
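A sketch of that constraint, under the assumption that all three matrices are compressed and share exactly the same sparsity pattern (the function name is hypothetical):

```cpp
#include <Eigen/SparseCore>

// Apply x = offset + multiplier .* y over a shared non-zero pattern, so
// the transform never creates or destroys sparsity.
Eigen::SparseMatrix<double> offset_multiplier_constrain(
    Eigen::SparseMatrix<double> y,  // unconstrained values, same pattern
    const Eigen::SparseMatrix<double>& offset,
    const Eigen::SparseMatrix<double>& multiplier) {
  y.makeCompressed();
  for (Eigen::Index n = 0; n < y.nonZeros(); ++n)
    y.valuePtr()[n] = offset.valuePtr()[n]
                      + multiplier.valuePtr()[n] * y.valuePtr()[n];
  return y;
}
```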
bob-carpenter commented on Oct 7, 2021:
With no constraints, there's no problem with accidentally removing sparsity. Other constraints are consistent with sparsity, like <lower = -1, upper = 1>.
We'll eventually want something like a sparse Cholesky factor or a sparse covariance matrix, and then implement log determinants and the multi-normal on the precision scale (where there's some hope of preserving sparsity).