
The forward_features method produces exploding values (for depths=6, in the later layers). #180


Description

@XDUcy

I tried printing the output of each RSTB block and found that it is not scaled properly, leading to severe numerical explosion. At float16 precision it even causes an overflow error; it only works correctly after switching to float32. I am using the officially released weights 001_classicalSR_DF2K_s64w8_SwinIR-M_x2.pth, and the test metrics are still fine. Is this behaviour inherent to the model? I cannot add extra scaling operations inside the loop, because that would disrupt the learned distribution.
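One way to confirm this numerically is to watch the activation statistics grow block by block. Below is a minimal debugging sketch, not from the original post, which assumes `model` is a SwinIR instance whose residual groups live in `model.layers` as in the official implementation; adapt the attribute name if your copy differs.

    # Hypothetical helper: register a forward hook on every RSTB block and
    # print the std / abs-max of its output, so the growth per layer is visible.
    def add_std_hooks(model):
        handles = []
        for idx, layer in enumerate(model.layers):
            def make_hook(i):
                def hook(module, inputs, output):
                    out = output[0] if isinstance(output, (tuple, list)) else output
                    print(f"RSTB {i}: std={out.float().std().item():.1f}, "
                          f"absmax={out.float().abs().max().item():.1f}")
                return hook
            handles.append(layer.register_forward_hook(make_hook(idx)))
        return handles  # call h.remove() on each handle when finished

With the hooks in place, a single forward pass on a small crop is enough to see the std climb well past the fp16 maximum of 65504 in the deeper blocks.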


This is an interesting issue. I just found that the reason behind the fp16 instability is that x (in the code below) goes out of the fp16 range during inference. Try debugging this piece of code in the main SwinIR class: if you look at x.std() while iterating over self.layers, you will see that the std grows exponentially. After 6 iterations it exceeds the fp16 range and turns into NaNs, which are then clipped to 1.0 at the end, hence the black screen. @JingyunLiang, should this be the case here? I see that even with fp32, x.std() grows exponentially during the loop.

    def forward_features(self, x):
        x_size = (x.shape[2], x.shape[3])
        x = self.patch_embed(x)
        if self.ape:
            x = x + self.absolute_pos_embed
        x = self.pos_drop(x)

        for layer in self.layers:
            x = layer(x, x_size)  # Here! It explodes on fp16 after reaching `x.std() > 52000`

        x = self.norm(x)  # B L C
        x = self.patch_unembed(x, x_size)

        return x

Originally posted by @tornikeo in #103
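If the rest of your pipeline should stay in half precision, one pragmatic workaround is to keep SwinIR itself in fp32, without touching the learned weights or adding operations inside the loop. A hedged sketch, not part of the SwinIR repo, assuming `model` is a SwinIR instance and `lq` is the low-quality input tensor:

    import torch

    # Run the SwinIR trunk in fp32 so the RSTB activations cannot overflow
    # fp16, then cast the result back for the rest of the pipeline.
    model = model.float().eval()
    with torch.no_grad():
        # Explicitly disable autocast in case the surrounding code enables it.
        with torch.autocast(device_type="cuda", enabled=False):
            sr = model(lq.float())
    sr = sr.half()  # only if downstream code expects fp16

This only changes the numeric type the intermediate activations are stored in, so the outputs match the fp32 test-metric setup that the original post reports as working.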
