
There was no interest from the maintainers, there is no interest in correcting technical requirements. #329


Open
wants to merge 1 commit into
base: main
9 changes: 9 additions & 0 deletions README.md
@@ -18,6 +18,15 @@ The script loads the checkpoint and samples from the model on a test input.
Due to the large size of the model (314B parameters), a machine with enough GPU memory is required to test the model with the example code.
The implementation of the MoE layer in this repository is not efficient. The implementation was chosen to avoid the need for custom kernels to validate the correctness of the model.


Observation: the NVIDIA CUDA dependencies are available only on Linux:

```
jax[cuda12-pip]==0.4.25 -f https://.../jax-releases/jax_cuda_releases.html
```

Please check which hardware is compatible with CUDA: [NVIDIA CUDA compatibility documentation](https://docs.nvidia.com/deploy/cuda-compatibility/index.html)
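
The platform restriction noted above can be checked before attempting an install. The following is a minimal sketch, not part of the repository: the helper names `cuda_wheels_supported` and `describe_backend` are hypothetical, and the check assumes, as the observation states, that the CUDA wheels are usable only when the host OS reports itself as Linux. It does not verify GPU or driver compatibility.

```python
import platform
from typing import Optional


def cuda_wheels_supported(system_name: Optional[str] = None) -> bool:
    """Return True when the OS name is "Linux".

    Hypothetical helper: it encodes only the observation above
    (CUDA dependencies are Linux-only), not any GPU/driver check.
    """
    name = system_name if system_name is not None else platform.system()
    return name == "Linux"


def describe_backend() -> str:
    """Report what JAX would run on, without assuming JAX is installed."""
    if not cuda_wheels_supported():
        return "non-Linux host: the CUDA wheels are not expected to install"
    try:
        import jax  # imported lazily so the OS check works without JAX
    except ImportError:
        return "Linux host, but jax is not installed"
    # jax.default_backend() returns a string such as "cpu" or "gpu".
    return f"jax default backend: {jax.default_backend()}"


if __name__ == "__main__":
    print(describe_backend())
```

Running the script on a Linux machine with a working CUDA install of JAX should report a GPU backend; a `cpu` result suggests the CUDA wheels or drivers were not picked up.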

# Model Specifications

Grok-1 is currently designed with the following specifications: