
Timm doesn't work well with AutoModelForImageClassification; num_labels doesn't work as expected. #42312

@lucian-student

Description


System Info

transformers: '4.57.1'

RuntimeError: Error(s) in loading state_dict for Linear:
	size mismatch for bias: copying a param with shape torch.Size([1000]) from checkpoint, the shape in current model is torch.Size([6]).

Who can help?

Practically anybody, probably an easy fix.

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers import AutoModelForImageClassification

model = AutoModelForImageClassification.from_pretrained("timm/efficientvit_b3.r256_in1k", num_labels=6)
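
From reading the docs I would expect ignore_mismatched_sizes=True to drop the 1000-class head and re-initialize it, but I have not verified that this path is taken for timm-backed checkpoints (the traceback above suggests the head is loaded through the timm model's own load_state_dict), so treat this as an untested sketch:

from transformers import AutoModelForImageClassification

# Untested: ask transformers to re-initialize weights whose shapes don't match
# (the 1000-vs-6 classifier head) instead of raising a size-mismatch error.
model = AutoModelForImageClassification.from_pretrained(
    "timm/efficientvit_b3.r256_in1k",
    num_labels=6,
    ignore_mismatched_sizes=True,
)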

Expected behavior

Whenever I pass num_labels I get this error, since the efficientvit checkpoint was trained with 1000 labels.

timm.create_model handles this fine, so I assume the timm integration isn't fully refined yet.

Expected behaviour: the checkpoint should be loaded through timm, not by other means.
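
For comparison, the direct timm call referred to above: timm.create_model re-initializes the classifier head for the requested number of classes and loads the remaining pretrained weights without a shape mismatch.

import timm

# timm replaces the pretrained 1000-class head with a fresh 6-class head and
# loads the rest of the checkpoint, so no size-mismatch error is raised.
model = timm.create_model("efficientvit_b3.r256_in1k", pretrained=True, num_classes=6)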
