Scale transform changes aggregate results #1028

@MSHelm

Hey everyone,
As mentioned on Zulip, I am using macsima data, and the macsima reader adds a Scale transformation to the coordinate system by default. I noticed that my aggregation results are slightly different when I remove it. The difference is most prominent for small signals and less pronounced for larger ones. My original use case was the mean intensity. For the sum, it's obvious that the absolute values change due to the transform, but the ratios stay similar.
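A minimal NumPy sketch of one possible explanation, assuming the aggregation resamples the image and labels onto the pixel grid of the target coordinate system (this is an assumption, not something confirmed from the spatialdata source). It does not use spatialdata at all: the helper `label_stats`, the synthetic image, and the nearest-neighbour downsampling by `factor` are all hypothetical stand-ins. Under that assumption, the sum would drop by roughly the area factor, while the mean would shift slightly because it is estimated from far fewer pixels.

```python
import numpy as np


def label_stats(image, labels, label_id):
    """Return (sum, mean) of the image pixels covered by one label."""
    pixels = image[labels == label_id]
    return pixels.sum(), pixels.mean()


rng = np.random.default_rng(0)
size, factor = 500, 10  # factor 10 mimics a Scale of 0.1 along x and y

# Synthetic image: a horizontal gradient plus uniform noise.
yy, xx = np.mgrid[0:size, 0:size]
image = 0.2 * xx / size + 0.1 * rng.random((size, size))

# Synthetic labels: a single disc with label id 1 on background 0.
labels = ((yy - 250) ** 2 + (xx - 250) ** 2 < 100**2).astype(int)

full_sum, full_mean = label_stats(image, labels, 1)

# Nearest-neighbour downsampling: keep every `factor`-th pixel of both arrays.
small_image = image[::factor, ::factor]
small_labels = labels[::factor, ::factor]
small_sum, small_mean = label_stats(small_image, small_labels, 1)

print("sum ratio (full / downsampled):", full_sum / small_sum)
print("mean at full resolution:       ", full_mean)
print("mean after downsampling:       ", small_mean)
```

If spatialdata interpolates or rasterizes differently, the exact numbers would differ, but the qualitative pattern (large change in sum, small change in mean) would be expected to look similar.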

Of course, I understand that it makes sense to apply the transformations before aggregating, since they are intended, for example, to align my labels with the image. Still, as a new user I found this behavior a bit unintuitive. Maybe it could be highlighted a bit more in the tutorials. As requested, a small example is below.

import spatialdata as sd
from spatialdata.datasets import blobs
from spatialdata.transformations import set_transformation, Scale

sdata = blobs()
# Register the same Scale transformation on both elements under a new "scaled" coordinate system.
# The "global" coordinate system keeps its Identity transformation.
scale_transform = Scale([0.1, 0.1], ("x", "y"))
set_transformation(sdata["blobs_image"], transformation=scale_transform, to_coordinate_system="scaled")
set_transformation(sdata["blobs_labels"], transformation=scale_transform, to_coordinate_system="scaled")

global_sum = sdata.aggregate(values="blobs_image", by="blobs_labels", target_coordinate_system="global")

print(global_sum["table"].X[:1])
#<Compressed Sparse Row sparse matrix of dtype 'float64'
#	with 3 stored elements and shape (1, 3)>
#  Coords	Values
#  (0, 0)	1309.3692551660652
#  (0, 1)	1587.8641823936478
#  (0, 2)	3125.1190857645483

scaled_sum = sdata.aggregate(values="blobs_image", by="blobs_labels", target_coordinate_system="scaled")

print(scaled_sum["table"].X[:1])
#<Compressed Sparse Row sparse matrix of dtype 'float64'
#	with 3 stored elements and shape (1, 3)>
#  Coords	Values
#  (0, 0)	12.758250581405642
#  (0, 1)	15.427204091295
#  (0, 2)	31.23271691509622


global_mean = sdata.aggregate(values="blobs_image", by="blobs_labels", agg_func="mean", target_coordinate_system="global")

print(global_mean["table"].X[:1])
#<Compressed Sparse Row sparse matrix of dtype 'float64'
#	with 3 stored elements and shape (1, 3)>
#  Coords	Values
#  (0, 0)	0.08696083251418378
#  (0, 1)	0.10545687603066001
#  (0, 2)	0.20755257260839133

scaled_mean = sdata.aggregate(values="blobs_image", by="blobs_labels", agg_func="mean", target_coordinate_system="scaled")

print(scaled_mean["table"].X[:1])
#<Compressed Sparse Row sparse matrix of dtype 'float64'
#	with 3 stored elements and shape (1, 3)>
#  Coords	Values
#  (0, 0)	0.08449172570467313
#  (0, 1)	0.10216691451188742
#  (0, 2)	0.2068391848681869
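
To put numbers on the differences, the printed values above can be compared directly. This is a small sketch that only does arithmetic on the figures copied from the output; the array names (`global_sum_row`, `scaled_sum_row`, and so on) are made up for illustration. Since Scale([0.1, 0.1]) shrinks areas by a factor of 0.1 * 0.1 = 0.01, a sum ratio on the order of 100 would be consistent with the aggregation running on a grid with about 100 times fewer pixels, though that interpretation is an assumption.

```python
import numpy as np

# Values copied from the printed output above (first label, three channels).
global_sum_row = np.array([1309.3692551660652, 1587.8641823936478, 3125.1190857645483])
scaled_sum_row = np.array([12.758250581405642, 15.427204091295, 31.23271691509622])
global_mean_row = np.array([0.08696083251418378, 0.10545687603066001, 0.20755257260839133])
scaled_mean_row = np.array([0.08449172570467313, 0.10216691451188742, 0.2068391848681869])

# Ratio of the sums in the two coordinate systems, per channel.
sum_ratio = global_sum_row / scaled_sum_row

# Relative change of the mean when aggregating in the "scaled" system, per channel.
mean_rel_diff = (global_mean_row - scaled_mean_row) / global_mean_row

print("sum ratio per channel:", sum_ratio)
print("relative mean difference per channel:", mean_rel_diff)
```

The relative mean difference is largest for the channels with the smallest values, which matches the observation that small signals are affected most.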
