Shape incompatibility when training SAE #11

@alw399

Description

Hello!

I was trying to use the scripts to train my own SAE model (examples/train_basic_sae.py or examples/train_multiple_sae_architectures.py --architecture fidelity), but ran into a few errors, possibly due to package inconsistencies. I have transformers==5.3.0 and nnsight==0.5.8, which seems fine according to the requirements file. The larger issue, however, is in get_esm_output_with_intervention, where submodule.input seems to have shape (batch, seq_len, hidden) while hidden_state_override is only (hidden,). I've made the change below, but I'm not entirely sure it's correct. Please let me know!

embd_to_patch = (
    # submodule.input[0][0]
    submodule.input
    if input_or_output == "input"
    else submodule.output
)
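For context, here is a minimal numpy sketch of the shape mismatch I mean. The names activations and hidden_state_override are stand-ins for submodule.input and the override vector; broadcasting the (hidden,)-shaped vector over the leading dimensions is my assumption about the intended behavior, not necessarily what the code should do:

```python
import numpy as np

# Stand-in shapes: submodule.input looks like (batch, seq_len, hidden),
# while hidden_state_override is just (hidden,).
batch, seq_len, hidden = 2, 4, 8
activations = np.zeros((batch, seq_len, hidden))  # stand-in for submodule.input
hidden_state_override = np.ones(hidden)           # stand-in for the override vector

# Broadcasting the (hidden,) vector over batch and seq_len makes the shapes line up.
patched = np.broadcast_to(hidden_state_override, activations.shape)
assert patched.shape == (batch, seq_len, hidden)
```

If the override is instead meant to replace only a single token position, the assignment would be per-position rather than a full broadcast, so please correct me if that's the case.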
