Why predict sample, not epsilon?

This is a quick question. Perhaps because of my limited knowledge, what I have found in representative papers and code (like the original [diffusion policy code](https://github.com/real-stanford/diffusion_policy)) is that the authors always have their models predict the added noise (epsilon), not the original sample. May I ask why you chose to predict the sample rather than epsilon? Is there any advantage to predicting the sample over epsilon, and does any prior academic work show that predicting the sample actually improves performance? Thanks for your reply. I really appreciate your work ☺️.
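To make the question concrete, here is a minimal sketch of how I understand the two targets to be related. It assumes the standard DDPM forward process `x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps` with a linear beta schedule. The helper names (`make_alpha_bar`, `add_noise`, `eps_to_sample`, `sample_to_eps`) and the schedule values are my own illustration and are not taken from this repository.

```python
import math
import random

def make_alpha_bar(num_steps=100, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed values)."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def add_noise(x0, eps, alpha_bar_t):
    """Forward process: mix the clean sample x0 with Gaussian noise eps."""
    return math.sqrt(alpha_bar_t) * x0 + math.sqrt(1.0 - alpha_bar_t) * eps

def eps_to_sample(x_t, eps_pred, alpha_bar_t):
    """Recover the clean-sample estimate implied by an epsilon prediction."""
    return (x_t - math.sqrt(1.0 - alpha_bar_t) * eps_pred) / math.sqrt(alpha_bar_t)

def sample_to_eps(x_t, x0_pred, alpha_bar_t):
    """Recover the epsilon estimate implied by a clean-sample prediction."""
    return (x_t - math.sqrt(alpha_bar_t) * x0_pred) / math.sqrt(1.0 - alpha_bar_t)

# Round trip at one timestep with a scalar "sample" (hypothetical values).
random.seed(0)
alpha_bar = make_alpha_bar()
t = 50
x0 = random.gauss(0.0, 1.0)
eps = random.gauss(0.0, 1.0)
x_t = add_noise(x0, eps, alpha_bar[t])

x0_from_eps = eps_to_sample(x_t, eps, alpha_bar[t])
eps_from_x0 = sample_to_eps(x_t, x0, alpha_bar[t])
```

So on paper either prediction can be converted into the other given `x_t` and the schedule. The conversions scale prediction error differently, though: `eps_to_sample` divides by `sqrt(alpha_bar_t)`, which gets small at high noise levels. That is why I am wondering whether the choice of target matters in practice.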


Why predict sample, not epsilon? #109
