
About NoamLR #7

@ShaneTian

Description

def get_lr(self):
last_epoch = max(1, self.last_epoch)
scale = self.warmup_steps ** 0.5 * min(last_epoch ** (-0.5), last_epoch * self.warmup_steps ** (-1.5))
return [base_lr * scale for base_lr in self.base_lrs]

The custom NoamLR yields the same LR for the first two steps, e.g. with:

warmup_steps = 10
lr = 0.01
  • step -- before (current step LR) -- after (next step LR)
  • 0 -- 0.001 -- 0.001
  • 1 -- 0.001 -- 0.002
  • 2 -- 0.002 -- 0.003

There are two ways to fix this:

  • use last_epoch = self.last_epoch + 1, as ESPnet does
    def get_lr(self): 
        last_epoch = self.last_epoch + 1
        scale = self.warmup_steps ** 0.5 * min(last_epoch ** (-0.5), last_epoch * self.warmup_steps ** (-1.5)) 
        return [base_lr * scale for base_lr in self.base_lrs] 
  • use LambdaLR directly
    noam_scale = lambda epoch: (warmup_steps ** 0.5) * min((epoch + 1) ** -0.5, (epoch + 1) * (warmup_steps ** -1.5))
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam_scale)

Of course, the above two approaches are equivalent.
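A quick numeric check (pure Python, no torch needed; the helper names are illustrative) confirms the equivalence, and that the fixed schedule no longer repeats an LR during warmup:

```python
def espnet_scale(step, warmup_steps):
    # Fix 1: shift the step by one instead of clamping with max(1, ...).
    last_epoch = step + 1
    return warmup_steps ** 0.5 * min(last_epoch ** -0.5,
                                     last_epoch * warmup_steps ** -1.5)

def lambda_scale(step, warmup_steps):
    # Fix 2: the lambda that would be passed to LambdaLR, written the same way.
    return (warmup_steps ** 0.5) * min((step + 1) ** -0.5,
                                       (step + 1) * (warmup_steps ** -1.5))

warmup_steps, lr = 10, 0.01
a = [lr * espnet_scale(s, warmup_steps) for s in range(20)]
b = [lr * lambda_scale(s, warmup_steps) for s in range(20)]
print(a == b)  # the two fixes produce identical schedules
```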
