SteveBronder opened the issue:
I'm working on some of the testing work for sparse matrices and had a quick question. For a sparse matrix, how should we handle functions where an input of 0 returns a non-zero value, so the result is effectively dense? E.g., for a sparse matrix X, calling acos(X), where acos(0) returns ~1.57. Should the return be dense, or should we treat the non-filled cells as empty and ignore them, iterating only over the cells that hold values?
bob-carpenter commented on Oct 5, 2021:
Yes, you need to treat the matrix as having 0 entries everywhere there isn't an explicit non-zero value. That means if we apply exp(), for example, the result is non-zero everywhere.
I think the bigger question is whether the result should be a sparse matrix or a dense matrix. It feels like a sparse matrix makes more sense, in that the vectorized return type is the same as the input type. But dense would be more efficient. My inclination is to have the result be sparse.
The right thing to do for efficiency here is compute acos(0) once and copy it rather than creating a gazillion new autodiff variables. That at least could be cooked into the definition.
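A minimal sketch of that idea in Eigen with plain doubles (no autodiff variables); `apply_dense` and the lambda are hypothetical names for illustration, not existing Stan Math or Eigen API:

```cpp
#include <Eigen/Dense>
#include <Eigen/SparseCore>
#include <cmath>

// Evaluate f at zero exactly once, fill the dense result with that shared
// value, then overwrite only the stored (structurally non-zero) entries.
template <typename F>
Eigen::MatrixXd apply_dense(const Eigen::SparseMatrix<double>& x, F&& f) {
  const double f0 = f(0.0);  // computed once, not once per empty cell
  Eigen::MatrixXd out = Eigen::MatrixXd::Constant(x.rows(), x.cols(), f0);
  for (int k = 0; k < x.outerSize(); ++k)
    for (Eigen::SparseMatrix<double>::InnerIterator it(x, k); it; ++it)
      out(it.row(), it.col()) = f(it.value());
  return out;
}

// e.g. Eigen::MatrixXd y = apply_dense(x, [](double v) { return std::acos(v); });
```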
I don't think we should try to get fancy and allow a non-zero value as the default for unspecified values.
SteveBronder commented on Oct 5, 2021:
IMO I'd like a dense matrix, not only because it will be more efficient but because it makes the API very explicit that zeroed values will become non-zero.

nhuurre commented on Oct 5, 2021:
AFAIK sparse matrices are most useful when a dense representation would be inefficient. Automatically converting to a dense matrix seems like a footgun to me. What's the purpose of supporting such "sparsity-destroying" operations in the first place?
bob-carpenter commented on Oct 5, 2021:
@nhuurre: The problem is that when you apply exp() to a sparse matrix, the result is dense. The question is really whether we want to auto-convert to a dense result for efficiency. @SteveBronder is suggesting a signature like this:
matrix cos(sparse_matrix);
because cos(0) != 0. But because sin(0) = 0, presumably we'd have
sparse_matrix sin(sparse_matrix);
This inconsistency is what worries me about returning a dense matrix from density-inducing functions like exp() or cos().
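To make the inconsistency concrete, here is a sketch of why sin() can keep the sparse signature while cos() cannot (`sparse_sin` is a hypothetical helper, not actual Stan Math code):

```cpp
#include <Eigen/SparseCore>
#include <cmath>

// Because sin(0) == 0, it is safe to visit only the stored entries: the
// implicit zeros stay zero and the input's sparsity pattern is reused.
Eigen::SparseMatrix<double> sparse_sin(Eigen::SparseMatrix<double> x) {
  for (int k = 0; k < x.outerSize(); ++k)
    for (Eigen::SparseMatrix<double>::InnerIterator it(x, k); it; ++it)
      it.valueRef() = std::sin(it.value());
  return x;
}

// The same loop would be wrong for cos() or exp(): every implicit zero
// should become cos(0) = 1 (resp. exp(0) = 1), so the honest result type
// is a dense matrix -- hence the mixed signatures above.
```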
nhuurre commented on Oct 5, 2021:
> The problem is that when you apply exp() to a sparse matrix, the result is dense.
And my question is: if applying exp() to a sparse matrix defeats the point of using sparse matrices, why would you want to apply exp() to a sparse matrix?
bob-carpenter commented on Oct 5, 2021:
> why would you want to apply exp() to a sparse matrix?
I have no idea where this would come up in practice, but if it did, it would be more efficient to have the argument be sparse even if the result is dense.
My main objection is that it's confusing to have varying return types for elementwise operations applied to sparse matrices.
SteveBronder commented on Oct 5, 2021:
Maybe this would be a good thing to bring up in the Stan meeting. I agree it's weird, but I think it's weird only because the scheme itself is kind of weird. The alternative here is to return a sparse_matrix[N, M] with every cell holding a value. My concern with that is that users will think, "Oh, I'm going to use a sparse matrix; it will be very fast because I have many zeros," then apply one of these non-zero-preserving functions and suddenly be using ~3 times the memory without getting any speedup. IMO returning a dense matrix is a red flag for the user: "this function will blow up your sparsity structure."
bob-carpenter commented on Oct 6, 2021:
I understand the reasons to return a dense matrix. Another alternative we should consider is to simply not allow functions where f(0) != 0 to apply elementwise to sparse matrices.
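One way that restriction could look inside the math library, sketched under the assumption that all elementwise sparse functions funnel through a single hypothetical helper (not the actual implementation):

```cpp
#include <Eigen/SparseCore>
#include <cassert>

// Only zero-preserving functions (f(0) == 0) are allowed elementwise on a
// sparse matrix, so visiting just the stored entries is always correct.
template <typename F>
Eigen::SparseMatrix<double> elementwise(Eigen::SparseMatrix<double> x, F&& f) {
  assert(f(0.0) == 0.0 && "f(0) != 0 is not allowed on sparse matrices");
  for (int k = 0; k < x.outerSize(); ++k)
    for (Eigen::SparseMatrix<double>::InnerIterator it(x, k); it; ++it)
      it.valueRef() = f(it.value());
  return x;
}
```

In the Stan language itself the rule would be simpler still: signatures like cos(sparse_matrix) just would not exist, so the compiler rejects them outright.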
nhuurre commented on Oct 6, 2021:
> Another alternative we should consider is to simply not allow functions where f(0) != 0 to apply elementwise to sparse matrices.
Yes, come to think of it, that would make the most sense.

SteveBronder commented on Oct 6, 2021:
I actually posted this question to the Eigen Discord server (a chat server where Eigen devs and users talk). They also say no to my dense-matrix idea, as it sounds too weird, so I'm fine with tossing that out.
I'm okay with this, though it feels a little weird to have, say, atan() but not acos(). Also, what do we want to do for sparse_matrix<lower=0>[...], which requires using log()/exp()?
nhuurre commented on Oct 7, 2021:
> what do we want to do for sparse_matrix<lower=0>[...]
A dense return type is definitely wrong here! The constraining transform must apply only to non-zero elements. lb_constrain() is not exp(); any relation between them is an implementation detail and can change.
Notice that the unconstrained type corresponding to a Cholesky-factor matrix is a vector. I expect the sparse matrix type will also turn into a (smaller) vector when unconstrained, in which case the lower-bound constraint can be applied to its vector form.
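A sketch of that idea, assuming (hypothetically) that the sparsity pattern is fixed data and only the stored values travel through the unconstrained vector; the function name is made up for illustration:

```cpp
#include <Eigen/Dense>
#include <Eigen/SparseCore>
#include <cmath>

// Lower-bound-constrain only the stored values; structural zeros are part
// of the fixed pattern and are never touched by the transform.
Eigen::SparseMatrix<double> lb_constrain_sparse(
    const Eigen::VectorXd& y,             // unconstrained values, size nnz
    Eigen::SparseMatrix<double> pattern,  // fixed sparsity structure
    double lb) {
  pattern.makeCompressed();
  for (Eigen::Index n = 0; n < pattern.nonZeros(); ++n)
    pattern.valuePtr()[n] = lb + std::exp(y[n]);  // scalar lb_constrain
  return pattern;
}
```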
Another question: do we allow sparse_matrix<lower=1>[...], even though that seems to say the "sparse" matrix cannot contain any zeros? Or maybe the bounds check ignores structural zeros?
bob-carpenter commented on Oct 7, 2021:
Good points, @nhuurre.
I agree about exp() being an implementation detail for <lower = 0>. We were just talking about this today in the Stan meeting, and it's come up in other issues.
All of our unconstrained types are dense vectors, and they'll presumably stay that way: they eventually have to concatenate into the argument to the log density function.
I don't think sparse_matrix<lower = 1> makes sense, as the result isn't sparse. But it's a run-time issue to figure this out, because the declaration could be sparse_matrix<lower = L> where L is declared as a data variable. Validating the constraint requires checking every non-zero element and then making sure the bound also holds for 0 if there is any sparsity.
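That run-time validation could look roughly like this (a sketch only; Stan Math's real check_* utilities report errors differently):

```cpp
#include <Eigen/SparseCore>
#include <stdexcept>

// Validate <lower = lb> for a sparse matrix: every stored value must
// satisfy the bound, and if any structural zeros exist, the implicit
// value 0 must satisfy it too.
void check_lower_bound(const Eigen::SparseMatrix<double>& x, double lb) {
  for (int k = 0; k < x.outerSize(); ++k)
    for (Eigen::SparseMatrix<double>::InnerIterator it(x, k); it; ++it)
      if (it.value() < lb)
        throw std::domain_error("sparse matrix entry is below lower bound");
  const bool has_structural_zeros = x.nonZeros() < x.rows() * x.cols();
  if (has_structural_zeros && lb > 0.0)
    throw std::domain_error("structural zeros violate lower bound");
}
```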
SteveBronder commented on Oct 7, 2021:
I think for now in #2599 I'm going to go with Bob's route and return a sparse matrix of the full N x M size.
> I don't think sparse_matrix<lower = 1> makes sense, as the result isn't sparse. But it's a run-time issue to figure this out, because the declaration could be sparse_matrix<lower = L> where L is declared as a data variable.
Maybe just not having constraints? The only one that makes sense to me is sparse_matrix<offset=other_sparse_mat, multiplier=other_sparse_mat2>.
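A sketch of that constraint, under the assumption that all three matrices are compressed and share exactly the same sparsity pattern (the function name is hypothetical):

```cpp
#include <Eigen/SparseCore>

// Apply x = offset + multiplier .* y over a shared non-zero pattern, so
// the transform never creates or destroys sparsity.
Eigen::SparseMatrix<double> offset_multiplier_constrain(
    Eigen::SparseMatrix<double> y,  // unconstrained values, same pattern
    const Eigen::SparseMatrix<double>& offset,
    const Eigen::SparseMatrix<double>& multiplier) {
  y.makeCompressed();
  for (Eigen::Index n = 0; n < y.nonZeros(); ++n)
    y.valuePtr()[n] = offset.valuePtr()[n]
                      + multiplier.valuePtr()[n] * y.valuePtr()[n];
  return y;
}
```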
bob-carpenter commented on Oct 7, 2021:
With no constraints, there's no problem with accidentally removing sparsity. Other constraints are consistent with sparsity, like <lower = -1, upper = 1>.
We'll eventually want something like a sparse Cholesky factor or a sparse covariance matrix, and then implement log determinants and the multi-normal on the precision scale (where there's some hope of preserving sparsity).