@GregoryComer GregoryComer commented Jul 24, 2025

Report various error statistics for the test outputs, including SQNR, mean absolute error (MAE), and L2 norm. These are saved in the detail report per test case.

As an example, here is the output from Core ML running MobileNet V2 (roughly formatted from csv -> sheets -> markdown):

| Output 0 Error Max | Output 0 Error MAE | Output 0 Error MSD | Output 0 Error L2 | Output 0 SQNR |
| --- | --- | --- | --- | --- |
| 0.0005887411535 | 0.0001199183663 | 2.32E-06 | 0.004750485188 | 41.28595734 |
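
For readers unfamiliar with these columns, here is a rough sketch of how such statistics could be computed. It is not the code from this PR: it uses plain Python lists instead of torch tensors, the function name `error_statistics` is hypothetical, "MSD" is read as mean signed difference (an assumption based on the column values), and SQNR is assumed to follow the common definition of 10·log10(signal power / noise power) in dB.

```python
import math


def error_statistics(actual, reference):
    """Hypothetical helper: error statistics for two flat sequences of floats.

    Assumptions (not confirmed by the PR): MSD means "mean signed
    difference" and SQNR is 10 * log10(signal power / noise power) in dB.
    """
    if len(actual) != len(reference):
        raise ValueError("actual and reference must have the same length")
    if len(reference) == 0:
        raise ValueError("cannot compute statistics for empty outputs")

    n = len(reference)
    error = [a - r for a, r in zip(actual, reference)]

    sum_sq_error = sum(e * e for e in error)
    noise_power = sum_sq_error / n
    signal_power = sum(r * r for r in reference) / n

    if noise_power == 0.0:
        sqnr_db = math.inf  # outputs match exactly
    elif signal_power == 0.0:
        sqnr_db = -math.inf  # all-zero reference with non-zero error
    else:
        sqnr_db = 10.0 * math.log10(signal_power / noise_power)

    return {
        "error_max": max(abs(e) for e in error),     # largest absolute error
        "error_mae": sum(abs(e) for e in error) / n,  # mean absolute error
        "error_msd": sum(error) / n,                  # mean signed difference (assumed)
        "error_l2": math.sqrt(sum_sq_error),          # L2 norm of the error vector
        "sqnr_db": sqnr_db,
    }


stats = error_statistics(actual=[1.5, 1.5, 3.0, 4.0], reference=[1.0, 2.0, 3.0, 4.0])
```

Under this reading, a mean signed difference much smaller than the MAE, as in the row above, would suggest the errors are not systematically biased in one direction.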

GregoryComer added a commit that referenced this pull request Jul 24, 2025
ghstack-source-id: dd1b232
ghstack-comment-id: 3112003831
Pull-Request: #12809
GregoryComer added a commit that referenced this pull request Jul 24, 2025
ghstack-source-id: 63819cb
ghstack-comment-id: 3112003831
Pull-Request: #12809
GregoryComer added a commit to GregoryComer/executorch that referenced this pull request Jul 31, 2025
ghstack-source-id: 63819cb
ghstack-comment-id: 3112003831
Pull-Request: pytorch#12809
actual = actual.to(torch.float64)
reference = reference.to(torch.float64)

if actual.shape != reference.shape:
Contributor

Why is this not an error?

Member Author

In some cases, in-place ops get functionalized and alter the graph outputs. This is a problem with the defunctionalization logic in ET (I filed a separate issue for it). It's not technically the backend's fault, but it is a real issue. I should probably just disable the affected tests for now and treat this as an error. I'll do that.

Member Author

Filed #13336 as a follow-up. Will stack this change on top.
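
As an illustration of the behavior discussed in this thread, a shape mismatch treated as a hard error might look like the sketch below. The function name `check_output_shapes` and its parameters are hypothetical and are not taken from this PR or from #13336; shapes are passed as plain tuples so the example runs without torch.

```python
def check_output_shapes(actual_shape, reference_shape, output_index=0):
    """Hypothetical check: fail loudly instead of skipping mismatched outputs."""
    actual_shape = tuple(actual_shape)
    reference_shape = tuple(reference_shape)
    if actual_shape != reference_shape:
        raise ValueError(
            f"Output {output_index} shape mismatch: "
            f"got {actual_shape}, expected {reference_shape}"
        )


# Matching shapes pass silently; a mismatch raises ValueError.
check_output_shapes((1, 1000), (1, 1000))
```

Raising here means a mismatch surfaces as a test failure rather than silently producing no statistics for that output.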

Base automatically changed from gh/GregoryComer/88/head to main August 12, 2025 16:42
@GregoryComer GregoryComer added the release notes: none Do not include this in the release notes label Aug 12, 2025
@GregoryComer GregoryComer merged commit 010b800 into main Aug 12, 2025
102 of 103 checks passed
@GregoryComer GregoryComer deleted the gh/GregoryComer/91/head branch August 12, 2025 19:02
agrima1304 pushed a commit to agrima1304/executorch that referenced this pull request Aug 26, 2025