Code differs from the paper #9

@bao21987

The code in train/SeleKT/selekt.py computes the top-k separately within each parameter, not globally across the whole model.

The paper (https://arxiv.org/pdf/2503.03656), Section 4 "Robust Model Adaptation", under "Proposed Solution", says:
"we first compute dense gradients by doing full finetuning of the model θ, and the compute the top-k non-zero entries (by magnitude) on the (accumulated) gradient vector or the “task vector” θ − θbase. This also ensures that the parameter selection is global and not confined to specific layers or other heuristics employed in earlier robust finetuning strategies (Lee et al., 2023)."
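
To make the difference concrete, here is a minimal, framework-free sketch of the two selection strategies. It is not the code from train/SeleKT/selekt.py. The function names `per_param_topk_mask` and `global_topk_mask`, the `frac` argument, and the toy task vector are all hypothetical, and each parameter is assumed to be a flat list of floats holding θ − θbase.

```python
# Illustrative sketch only; names and data are hypothetical, not from the repository.

def per_param_topk_mask(task_vector, frac):
    """Keep the top `frac` fraction of entries (by magnitude) inside each parameter."""
    masks = {}
    for name, delta in task_vector.items():
        k = max(1, int(len(delta) * frac))
        # Rank indices by |delta| within this parameter only.
        ranked = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
        keep = set(ranked[:k])
        masks[name] = [1 if i in keep else 0 for i in range(len(delta))]
    return masks


def global_topk_mask(task_vector, frac):
    """Keep the top `frac` fraction of entries (by magnitude) across all parameters."""
    entries = [
        (abs(v), name, i)
        for name, delta in task_vector.items()
        for i, v in enumerate(delta)
    ]
    k = max(1, int(len(entries) * frac))
    # Rank every entry of the model together, then mark the k largest.
    entries.sort(key=lambda e: e[0], reverse=True)
    masks = {name: [0] * len(delta) for name, delta in task_vector.items()}
    for _, name, i in entries[:k]:
        masks[name][i] = 1
    return masks


# Toy task vector: layer_a changed a lot during finetuning, layer_b barely moved.
task_vector = {
    "layer_a": [0.9, -0.8, 0.7, 0.6],
    "layer_b": [0.01, -0.02, 0.03, 0.04],
}

per_param = per_param_topk_mask(task_vector, 0.5)
global_ = global_topk_mask(task_vector, 0.5)

print(per_param)  # every parameter keeps half of its own entries
print(global_)    # all kept entries come from layer_a
```

With the per-parameter rule every tensor keeps the same fraction of its entries, even one that barely changed. With the global rule the budget goes to wherever the largest changes are, which is what the quoted passage seems to describe.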
