v0.1.0 (2025-02-12)

Feature

- feat: pypi packaging and auto-release with semantic release (0ff8888)

Unknown

- Merge pull request #37 from chanind/pypi-package
  feat: pypi packaging and auto-release with semantic release (a711efe)
- simplify matryoshka loss (43421f5)
- Use torch.split() instead of direct indexing for 25% speedup (505a445)
- Fix matryoshka spelling (aa45bf6)
- Fix incorrect auxk logging name (784a62a)
- Add citation (77f2690)
- Make sure to detach reconstruction before calculating aux loss (db2b564)
- Merge pull request #36 from saprmarks/aux_loss_fixes
  Aux loss fixes, standardize decoder normalization (34eefda)
- Standardize and fix topk auxk loss implementation (0af1971)
- Normalize decoder after optimizer step (200ed3b)
- Remove experimental matryoshka temperature (6c2fcfc)
- Make sure x is on the correct dtype for jumprelu when logging (c697d0f)
- Import trainers from correct relative location for submodule use (8363ff7)
- By default, don't normalize Gated activations during inference (52b0c54)
- Also update context manager for matryoshka threshold (65e7af8)
- Disable autocast for threshold tracking (17aa5d5)
- Add torch autocast to training loop (832f4a3)
- Save state dicts to cpu (3c5a5cd)
- Add an option to pass LR to TopK trainers (8316a44)
- Add April Update Standard Trainer (cfb36ff)
- Merge pull request #35 from saprmarks/code_cleanup
  Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (f19db98)
- Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (9751c57)
- Merge pull request #34 from adamkarvonen/matroyshka
  Add Matryoshka, Fix Jump ReLU training, modify initialization (92648d4)
- Add a verbose option during training (0ff687b)
- Prevent wandb cuda multiprocessing errors (370272a)
- Log dead features for batch top k SAEs (936a69c)
- Log number of dead features to wandb (77da794)
- Add trainer number to wandb name (3b03b92)
- Add notes (810dbb8)
- Add option to ignore bos tokens (c2fe5b8)
- Fix jumprelu training (ec961ac)
- Use kaiming initialization if specified in paper, fix batch_top_k aux_k_alpha (8eaa8b2)
- Format with ruff (3e31571)
- Add temperature scaling to matryoshka (ceabbc5)
- norm the correct decoder dimension (5383603)
- Fix loading matryoshkas from_pretrained() (764d4ac)
- Initial matryoshka implementation (8ade55b)
- Make sure we step the learning rate scheduler (1df47d8)
- Merge pull request #33 from saprmarks/lr_scheduling
  Lr scheduling (316dbbe)
- Properly set new parameters in end to end test (e00fd64)
- Standardize learning rate and sparsity schedules (a2d6c43)
- Merge pull request #32 from saprmarks/add_sparsity_warmup
  Add sparsity warmup (a11670f)
- Add sparsity warmup for trainers with a sparsity penalty (911b958)
- Clean up lr decay (e0db40b)
- Track lr decay implementation (f0bb66d)
- Remove leftover variable, update expected results with standard SAE improvements (9687bb9)
- Merge pull request #31 from saprmarks/add_demo
  Add option to normalize dataset, track thresholds for TopK SAEs, Fix Standard SAE (67a7857)
- Also scale topk thresholds when scaling biases (efd76b1)
- Use the correct standard SAE reconstruction loss, initialize W_dec to W_enc.T (8b95ec9)
- Add bias scaling to topk saes (484ca01)
- Fix topk bfloat16 dtype error (488a154)
- Add option to normalize dataset activations (81968f2)
- Remove demo script and graphing notebook (57f451b)
- Track thresholds for topk and batchtopk during training (b5821fd)
- Track threshold for batchtopk, rename for consistency (32d198f)
- Modularize demo script (dcc02f0)
- Begin creation of demo script (712eb98)
- Fix JumpReLU training and loading (552a8c2)
- Ensure activation buffer has the correct dtype (d416eab)
- Merge pull request #30 from adamkarvonen/add_tests
  Add end to end test, upgrade nnsight to support 0.3.0, fix bugs (c4eed3c)
- Merge pull request #26 from mntss/batchtokp_aux_fix
  Fix BatchTopKSAE training (2ec1890)
- Check for is_tuple to support mlp / attn submodules (d350415)
- Change save_steps to a list of ints (f1b9b80)
- Add early stopping in forward pass (05fe179)
- Obtain better test results using multiple batches (067bf7b)
- Fix frac_alive calculation, perform evaluation over multiple batches (dc30720)
- Complete nnsight 0.2 to 0.3 changes (807f6ef)
- Rename input to inputs per nnsight 0.3.0 (9ed4af2)
- Add a simple end to end test (fe54b00)
- Create LICENSE (32fec9c)
- Fix BatchTopKSAE training (4aea538)
- dtype for loading SAEs (932e10a)
- Merge pull request #22 from pleask/jumprelu
  Implement jumprelu training (713f638)
- Use separate wandb runs for each SAE being trained (df60f52)
- Merge branch 'main' into jumprelu (3dfc069)
- implement jumprelu training (16bdfd9)
- handle no wandb (8164d32)
- Merge pull request #20 from pleask/batchtopk
  Implement BatchTopK (b001fb0)
- separate runs for each sae being trained (7d3b127)
- add batchtopk (f08e00b)
- Move f_gate to encoder's dtype (43bdb3b)
- Ensure that x_hat is in correct dtype (3376f1b)
- Preallocate buffer memory to lower peak VRAM usage when replenishing buffer (90aff63)
- Perform logging outside of training loop to lower peak memory usage (57f8812)
- Remove triton usage (475fece)
- Revert to triton TopK implementation (d94697d)
- Add relative reconstruction bias from GDM Gated SAE paper to evaluate() (8984b01)
- git push origin main:Merge branch 'ElanaPearl-small_bug_fixes' into main (2d586e4)
- simplifying readme (9c46e06)
- simplify readme (5c96003)
- add missing imports (7f689d9)
- fix arg name in trainer_config (9577d26)
- update sae training example code (9374546)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (7d405f7)
- GatedSAE: moved feature re-normalization into encode (f628c0e)
- documenting JumpReLU SAE support (322b6c0)
- support for JumpReluAutoEncoders (57df4e7)
- Add submodule_name to PAnnealTrainer (ecdac03)
- host SAEs on huggingface (0ae37fe)
- fixed batch loading in examine_dimension (82485d7)
- Merge pull request #17 from saprmarks/collab
  Merge Collab Branch (cdf8222)
- moved experimental trainers to collab-dev (8d1d581)
- Merge branch 'main' into collab (dda38b9)
- Update README.md (4d6c6a6)
- remove a sentence (2d40ed5)
- add a list of trainers to the README (746927a)
- add architecture details to README (60422a8)
- make wandb integration optional (a26c4e5)
- make wandb integration optional (0bdc871)
- Fix tutorial 404 (deb3df7)
- Add missing values to config (9e44ea9)
- changed TrainerTopK class name (c52ff00)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (c04ee3b)
- fixed loss_recovered to incorporate top_k (6be5635)
- fixed TopK loss (spotted by Anish) (a3b71f7)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (40bcdf6)
- naming conventions (5ff7fa1)
- small fix to triton kernel (5d21265)
- small updates for eval (585e820)
- added some housekeeping stuff to top_k (5559c2c)
- add support for Top-k SAEs (2d549d0)
- add transcoder eval (8446f4f)
- add transcoder support (c590a25)
- added wandb finish to trainer (113c042)
- fixed anneal end bug (fbd9ee4)
- added layer and lm_name (d173235)
- adding layer and lm_name to trainer config (6168ee0)
- make tracer_args optional (31b2828)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (87d2b58)
- bug fix evaluating CE loss with NNsight models (f8d81a1)
- Combining P Annealing and Anthropic Update (44318e9)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (43e9ca6)
- removing normalization (7a98d77)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (5f2b598)
- added buffer for NNsight models (not LanguageModel classes) as an extra class. We'll want to combine the three buffers we currently have at some point (f19d284)
- fixed nnsight model tracing issues for chess-gpt (7e8c9f9)
- added W_O projection to HeadBuffer (47bd4cd)
- added support for training SAEs on individual heads (a0e3119)
- added support for training SAEs on individual heads (47351b4)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (7de0bd3)
- default hyperparameter adjustments (a09346b)
- normalization in gated_new (104aba2)
- fixing bug where inputs can get overwritten (93fd46e)
- fixing tuple bug (b05dcaf)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (73b5663)
- multiple steps debugging (de3eef1)
- adding gradient pursuit function (72941f1)
- bugfix (53aabc0)
- bugfix (91691b5)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (9ce7d80)
- logging more things (8498a75)
- changing initialization for AutoEncoderNew (c7ee7ec)
- fixing gated SAE encoder scheme (4084bc3)
- changes to gatedSAE API (9e001d1)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (05b397b)
- changing initialization (ebe0d57)
- finished combining gated and p-annealing (4c08614)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (8e0a6f9)
- gated_anneal first steps (ba8b8fa)
- jump SAE (873b764)
- adapted loss logging in p_anneal (33997c0)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (1eecbda)
- merging gated and Anthropic SAEs (b6a24d0)
- revert trainer naming (c0af6d9)
- restored trainer naming (2ec3c67)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (fe7e93b)
- various changes (32027ae)
- debug panneal (463907d)
- debug panneal (8c00100)
- debug panneal (dc632cd)
- debug panneal (166f6a9)
- debug panneal (bcebaa6)
- debug pannealing (446c568)
- p_annealing loss buffer (e4d4a35)
- implement Ben's p-annealing strategy (06a27f0)
- panneal changes (fe4ff6f)
- logging trainer names to wandb (f9c5e45)
- bugfixes for StandardTrainerNew (70acd85)
- trainer for new anthropic infrastructure (531c285)
- adding r_mag parameter to GSAE (198ddf4)
- gatedSAE trainer (3567d6d)
- cosmetic change (0200976)
- GatedAutoEncoder class (2cfc47b)
- p annealing not affected by resampling (ad8d837)
- integrated trainer update (c7613d3)
- Merge branch 'collab' into p_annealing (933b80c)
- fixed p calculation (9837a6f)
- getting rid of useless seed argument (377c762)
- trainer initializes SAE (7dffb66)
- trainer initialized SAE (6e80590)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (c58d23d)
- changes to lista p_anneal trainers (3cc6642)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (9dfd3db)
- decoupled lr warmup and p warmup in p_anneal trainer (c3c1645)
- Merge pull request #14 from saprmarks/p_annealing
  added annealing and trainer_param_callback (61927bc)
- cosmetic changes to interp (4a7966f)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (c76818e)
- Merge pull request #13 from jannik-brinkmann/collab
  add ListaTrainer (d4d2fd9)
- additional evaluation metrics (fa2ec08)
- add GroupSAETrainer (60e6068)
- added annealing and trainer_param_callback (18e3fca)
- Merge remote-tracking branch 'upstream/collab' into collab (4650c2a)
- fixing neuron resampling (a346be9)
- improvements to saving and logging (4a1d7ae)
- can export buffer config (d19d8d9)
- fixing evaluation.py (c91a581)
- fixing bug in neuron resampling (67a03c7)
- add ListaTrainer (880f570)
- fixing neuron resampling in standard trainer (3406262)
- improvements to training and evaluating (b111d40)
- Factoring out SAETrainer class (fabd001)
- updating syntax for buffer (035a0f9)
- updating readme for from_pretrained (70e8c2a)
- from_pretrained (db96abc)
- Change syntax for specifying activation dimensions and batch sizes (bdf1f19)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (86c7475)
- activation_dim for IdentityDict is optional (be1b68c)
- update umap requirement (776b53e)
- Merge pull request #10 from adamkarvonen/shell_script_change
  Add sae_set_name to local_path for dictionary downloader (33b5a6b)
- Add sae_set_name to local_path for dictionary downloader (d6163be)
- dispatch no longer needed when loading models (69c32ca)
- removed in_and_out option for activation buffer (cf6ad1d)
- updating readme with 10_32768 dictionaries (614883f)
- upgrade to nnsight 0.2 (cbc5f79)
- downloader script (7a305c5)
- fixing device issue in buffer (b1b44f1)
- added pretrained_dictionary_downloader.sh (0028ebe)
- added pretrained_dictionary_downloader.sh (8b63d8d)
- added pretrained_dictionary_downloader.sh (6771aff)
- efficiency improvements (94844d4)
- adding identity dict (76bd32f)
- debugging interp (2f75db3)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (86812f5)
- warns user when evaluating without enough data (246c472)
- cleaning up interp (95d7310)
- examine_dimension returns mbottom_tokens and logit stats (40137ff)
- continuing merge (db693a6)
- progress on merge (949b3a7)
- changes to buffer.py (792546b)
- fixing some things in buffer.py (f58688e)
- updating requirements (a54b496)
- updating requirements (a1db591)
- identity dictionary (5e1f35e)
- bug fix for neuron resampling (b281b53)
- UMAP visualizations (81f8e1f)
- better normalization for ghost_loss (fc74af7)
- neuron resampling without replacement (4565e9a)
- simplifications to interp functions (2318666)
- Second nnsight 0.2 pass through (3bcebed)
- Conversion to nnsight 0.2 first pass (cac410a)
- detaching another thing in ghost grads (2f212d6)
- Neuron resampling no longer errors when resampling zero neurons (376dd3b)
- NNsight v0.2 Updates (90bbc76)
- cosmetic improvements to buffer.py (b2bd5f0)
- fix to ghost grads (9531fe5)
- fixing table formatting (0e69c8c)
- Fixing some table formatting (75f927f)
- gpt2-small support (f82146c)
- fixing bug relevant to UnifiedTransformer support (9ec9ce4)
- Getting rid of histograms (31d09d7)
- Fixing tables in readme (5934011)
- Updates to the readme (a5ca51e)
- Fixing ghost grad bugs (633d583)
- Handling ghost grad case with no dead neurons (4f19425)
- adding support for buffer on other devices (f3cf296)
- support for ghost grads (25d2a62)
- add an implementation of ghost gradients (2e09210)
- fixing a bug with warmup, adding utils (47bbde1)
- remove HF arg from buffer. rename search_utils to interp (7276f17)
- typo fix (3f6b922)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (278084b)
- added utils for converting hf dataset to generator (82fff19)
- add ablated token effects to ; restore support for HF datasets (799e2ca)
- merge in function for examining features (986bf96)
- easier submodule/dictionary feature examination (2c8b985)
- Adding lr warmup after every time neurons are resampled (429c582)
- fixing issues with EmptyStream exception (39ff6e1)
- Minor changes due to updates in nnsight (49bbbac)
- Revert "restore support for streaming HF datasets"
  This reverts commit b43527b. (23ada98)
- restore support for streaming HF datasets (b43527b)
- first version of automatic feature labeling (c6753f6)
- Add feature_effect function to search_utils.py (0ada2c6)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (fab70b1)
- adding sqrt to MSE (63b2174)
- Merge pull request #1 from cadentj/main
  Update README.md (fd79bb3)
- Update README.md (cf5ec24)
- Update README.md (55f33f2)
- evaluation.py (2edf59e)
- evaluating dictionaries (71e28fb)
- Removing experimental use of sqrt on MSELoss (865bbb5)
- Adding readme, evaluation, cleaning up (ddac948)
- some stuff for saving dicts (d1f0e21)
- removing device from buffer (398f15c)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (7f013c2)
- lr schedule + enabling stretched mlp (4eaf7e3)
- add random feature search (e58cc67)
- restore HF support and progress bar (7e2b6c6)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (d33ef05)
- more support for saving checkpoints (0ca258a)
- fix unit column bug + add scheduler (5a05c8c)
- fix merge bugs: checkpointing support (9c5bbd8)
- Merge: add HF datasets and checkpointing (ccf6ed1)
- checkpointing, progress bar, HF dataset support (fd8a3ee)
- progress bar for training autoencoders (0a8064d)
- implementing neuron resampling (f9b9d02)
- lotsa stuff (bc09ba4)
- adding init.py file for imports (3d9fd43)
- modifying buffer (ba9441b)
- first commit (ea89e90)
- Initial commit (741f4d6)