Replies: 1 comment
100 trillion parameters?
With the recent discussions of real-time frankenmerges, where passes are run over some layers multiple times, I was wondering what the limit would be in terms of size. Could this theoretically scale to 100T effective parameters and beyond? If speed scales linearly with the number of layer passes, I think it would only take around 50 s per token on a 3090. Is this how we achieve ASI?
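To make the idea concrete, here is a minimal toy sketch of what such a runtime frankenmerge amounts to: the same layer weights are visited multiple times according to a repetition schedule, so effective depth grows without any extra parameter memory. The `make_layer`/`forward` names and the schedule below are illustrative inventions, not from any actual merge recipe, and the "layers" are simple affine updates standing in for transformer blocks.

```python
def make_layer(weight):
    # Toy "layer": a scalar affine update standing in for a transformer block.
    def layer(x):
        return x + weight * x
    return layer

# Three real layers' worth of parameters...
layers = [make_layer(w) for w in (0.1, 0.2, 0.3)]

# ...run with a repetition schedule that visits layers 1 and 2 twice,
# giving an effective depth of 5 from only 3 layers of stored weights.
schedule = [0, 1, 1, 2, 2]

def forward(x, layers, schedule):
    # Compute time scales with len(schedule); memory scales with len(layers).
    for idx in schedule:
        x = layers[idx](x)
    return x

out = forward(1.0, layers, schedule)
```

This is also why latency, not VRAM, would be the binding constraint in the 100T scenario above: the weights stay the same size, but every repeated pass costs another full layer's worth of compute per token.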