Implementation of a 56M-parameter GPT-1 from scratch. Since the BookCorpus dataset (~1B tokens) is no longer publicly available, I use the WikiText-103 dataset (103M tokens) to pre-train GPT-1 instead.
- Model dimension (`d_model`) = 512
- Number of attention heads (`n_heads`) = 8
- Number of decoder layers (`num_decoder_layers`) = 8
- Maximum sequence length (`max_len`) = 128
- Feed-forward hidden size (`dim_feedforward`) = 2048
- Vocabulary size (`vocab_size`) = 30000 for the WikiText-103 dataset
- Batch size (`batch_size`) = 64
- Total parameter count ≈ 56M
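As a rough sanity check on the 56M figure, the configuration above can be tallied by hand. The snippet below assumes learned positional embeddings, biases on every linear layer, two LayerNorms per block, and an untied output projection; these are assumptions for the estimate, not details taken from the code.

```python
# Back-of-the-envelope parameter count for the configuration above.
# Assumptions (not taken from the repo): learned positional embeddings,
# biases on all linear layers, two LayerNorms per block, untied output head.
d_model, n_heads, n_layers = 512, 8, 8
d_ff, vocab_size, max_len = 2048, 30000, 128

tok_emb = vocab_size * d_model                       # token embedding table
pos_emb = max_len * d_model                          # learned positional embeddings
attn    = 4 * (d_model * d_model + d_model)          # Q, K, V, and output projections
ffn     = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
norms   = 2 * 2 * d_model                            # gain + bias for two LayerNorms
block   = attn + ffn + norms
lm_head = vocab_size * d_model                       # untied output projection

total = tok_emb + pos_emb + n_layers * block + lm_head
print(f"~{total / 1e6:.1f}M parameters")             # ~56.0M
```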
- Run `input_processing.py` to generate tokenized WikiText data and save it in `.pt` torch tensor format (sketch below)
- Run `main_pretrain.py` to pre-train GPT-1. Training settings can be changed in this file and model parameters in `GPT_Decoder.py` (sketch below)
- Run the last cell in `test.ipynb` to generate random text from the pre-trained GPT-1 (sketch below)
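The actual preprocessing lives in `input_processing.py`; as an illustration only, here is a minimal sketch of that kind of step, assuming the Hugging Face `datasets` and `tokenizers` packages and a 30k byte-level BPE vocabulary (the repo's tokenizer choice may differ):

```python
# Hypothetical sketch of the preprocessing step (not the repo's actual script):
# tokenize WikiText-103 with a 30k byte-level BPE vocabulary and save the
# token ids of each split as a single torch tensor in .pt format.
import torch
from datasets import load_dataset
from tokenizers import ByteLevelBPETokenizer

raw = load_dataset("wikitext", "wikitext-103-raw-v1")

# Train a 30k-token BPE vocabulary on the training split.
tokenizer = ByteLevelBPETokenizer()
tokenizer.train_from_iterator(
    (line for line in raw["train"]["text"] if line.strip()),
    vocab_size=30000,
)

# Concatenate all token ids into one long tensor per split and save it.
for split in ("train", "validation", "test"):
    ids = []
    for line in raw[split]["text"]:
        if line.strip():
            ids.extend(tokenizer.encode(line).ids)
    torch.save(torch.tensor(ids, dtype=torch.long), f"wikitext103_{split}.pt")
```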
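Similarly, the pre-training step boils down to next-token prediction with cross-entropy over random 128-token windows at batch size 64. The `TinyGPT` stand-in below is built from stock `torch.nn` modules so the snippet is self-contained; the real architecture lives in `GPT_Decoder.py`, and the learning rate here is just a typical value, not the repo's setting.

```python
# Hypothetical sketch of the pre-training loop (not the repo's actual code).
import torch
import torch.nn as nn
import torch.nn.functional as F

d_model, n_heads, n_layers = 512, 8, 8
d_ff, vocab_size, max_len, batch_size = 2048, 30000, 128, 64
device = "cuda" if torch.cuda.is_available() else "cpu"

class TinyGPT(nn.Module):
    """Stand-in decoder-only Transformer (a causally masked encoder stack)."""
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, d_ff, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        pos = torch.arange(idx.size(1), device=idx.device)
        h = self.tok_emb(idx) + self.pos_emb(pos)
        mask = nn.Transformer.generate_square_subsequent_mask(idx.size(1)).to(idx.device)
        h = self.blocks(h, mask=mask)                # causal self-attention
        return self.lm_head(h)                       # (batch, seq, vocab)

data = torch.load("wikitext103_train.pt")            # 1-D tensor of token ids
model = TinyGPT().to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=2.5e-4)

def get_batch():
    # Random 128-token windows; targets are the inputs shifted by one token.
    starts = torch.randint(0, data.numel() - max_len - 1, (batch_size,))
    x = torch.stack([data[s:s + max_len] for s in starts]).to(device)
    y = torch.stack([data[s + 1:s + max_len + 1] for s in starts]).to(device)
    return x, y

model.train()
for step in range(10_000):
    x, y = get_batch()
    logits = model(x)
    loss = F.cross_entropy(logits.reshape(-1, vocab_size), y.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 100 == 0:
        print(f"step {step}: loss {loss.item():.3f}")
```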
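Finally, the generation cell in `test.ipynb` amounts to an autoregressive sampling loop along these lines (temperature sampling and the reuse of the hypothetical `model`, `device`, and `tokenizer` names from the sketches above are assumptions; the notebook may differ):

```python
# Hypothetical sketch of sampling from a pre-trained model.
# `model` is assumed to map (batch, seq) token ids to (batch, seq, vocab) logits.
import torch

@torch.no_grad()
def generate(model, prompt_ids, max_new_tokens=100, temperature=1.0, max_len=128):
    model.eval()
    ids = prompt_ids.clone()
    for _ in range(max_new_tokens):
        context = ids[:, -max_len:]                   # crop to the context window
        logits = model(context)[:, -1, :]             # logits for the next token
        probs = torch.softmax(logits / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)
    return ids

# Example usage (token id 0 as the starting token is arbitrary):
# out = generate(model, torch.zeros(1, 1, dtype=torch.long, device=device))
# print(tokenizer.decode(out[0].tolist()))
```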
Sample generation output:

> The earliest known mention of this date was that of 544 , when King Olaf II of Norway was discovered in the reign of King Olaf II of Norway . The earliest recorded mention of this date was from 544 , when King Olaf was assassinated . The date of the birth is unknown , but it is unclear whether Olaf was killed . Olaf 's birth date is unknown , but it is likely that Olaf was killed by the Vikings in 842 , but Olaf 's reign is uncertain . .
This can be improved by increasing the model size, but it is reasonable output for a 56M-parameter model.