Commit e9b891f

committed
address refactoring feedback
Signed-off-by: savitha-eng <[email protected]>
1 parent 0abf0b2 commit e9b891f

File tree

16 files changed (+1100, -8 lines)

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
# Example Small Llama3 Checkpoint

This directory contains the model and tokenizer configuration for a small Llama3 model (~10M parameters) optimized for genomic sequences. This checkpoint is designed for testing and development purposes, allowing unit tests to run without requiring external paths or complex configuration.

## Contents

- **config.json**: Model configuration for a small Llama3 model (4 layers, 2048 hidden size)
- **tokenizer.json**: Fast tokenizer for nucleotide sequences (256 vocab size)
- **tokenizer_config.json**: Tokenizer configuration
- **special_tokens_map.json**: Special tokens mapping (EOS=0, PAD=1, BOS=2, UNK=3)

## Usage

Use this directory as the `model_tag` in your training configurations:

```yaml
# In your hydra config
model_tag: ./example_small_llama_checkpoint

dataset:
  tokenizer_path: ./example_small_llama_checkpoint  # Same directory for tokenizer
```

This eliminates the need for absolute paths and makes configurations portable across different environments.
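As a quick sanity check outside of training, the checkpoint directory can also be loaded directly through the standard Hugging Face `transformers` API. The sketch below is illustrative only: since this directory ships configuration and tokenizer files rather than trained weights, the model is built from the config with randomly initialized parameters, and the sample input is just an assumed nucleotide string.

```python
# Minimal sanity-check sketch (assumes the standard Hugging Face transformers API).
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

ckpt_dir = "./example_small_llama_checkpoint"

tokenizer = AutoTokenizer.from_pretrained(ckpt_dir)   # nucleotide tokenizer (256 vocab)
config = AutoConfig.from_pretrained(ckpt_dir)         # small Llama3 config
model = AutoModelForCausalLM.from_config(config)      # random weights; no trained checkpoint shipped

# Tokenize a short nucleotide sequence and run a single forward pass.
inputs = tokenizer("ACGTACGTACGT", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch=1, sequence_length, vocab_size=256)
```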
## Model Parameters

- Layers: 4
- Hidden size: 2048
- Attention heads: 16
- Intermediate size: 8192
- Vocabulary size: 256 (nucleotide tokenizer)
- Max position embeddings: 8192
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
{
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 2,
  "eos_token_id": 0,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 2048,
  "initializer_range": 0.02,
  "intermediate_size": 8192,
  "max_position_embeddings": 8192,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 16,
  "num_hidden_layers": 4,
  "num_key_value_heads": 16,
  "pad_token_id": 1,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 500000.0,
  "tie_word_embeddings": false,
  "transformers_version": "4.57.1",
  "use_cache": true,
  "vocab_size": 256
}
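For reference, an equivalent configuration can be reconstructed programmatically and used to instantiate the model. The following is a sketch whose field values simply mirror the JSON above; it is not necessarily how the checked-in file was generated.

```python
# Sketch: rebuild an equivalent config in code and instantiate the model.
# Values mirror config.json above; illustrative only.
from transformers import LlamaConfig, LlamaForCausalLM

config = LlamaConfig(
    vocab_size=256,
    hidden_size=2048,
    intermediate_size=8192,
    num_hidden_layers=4,
    num_attention_heads=16,
    num_key_value_heads=16,
    head_dim=128,
    max_position_embeddings=8192,
    rope_theta=500000.0,
    rms_norm_eps=1e-05,
    bos_token_id=2,
    eos_token_id=0,
    pad_token_id=1,
    tie_word_embeddings=False,
)

model = LlamaForCausalLM(config)  # randomly initialized small Llama3 model

# Writing the config back out produces a config.json like the one above.
config.save_pretrained("./example_small_llama_checkpoint")
```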
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
{
  "bos_token": "<BOS>",
  "eos_token": "<EOS>",
  "pad_token": "<PAD>",
  "unk_token": "<UNK>"
}
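The mapping between these special-token strings and the IDs quoted in the README (EOS=0, PAD=1, BOS=2, UNK=3) can be checked with a short sketch like the one below, assuming the tokenizer files in this directory load through `AutoTokenizer`.

```python
# Sketch: verify the special-token IDs stated in the README.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("./example_small_llama_checkpoint")

assert tok.eos_token == "<EOS>" and tok.eos_token_id == 0
assert tok.pad_token == "<PAD>" and tok.pad_token_id == 1
assert tok.bos_token == "<BOS>" and tok.bos_token_id == 2
assert tok.unk_token == "<UNK>" and tok.unk_token_id == 3
```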
