Replies: 3 comments 1 reply
- Yes, both of them. You can train a smaller model by distilling a trained larger model; the smaller model can be streaming or non-streaming even if the larger model is a non-streaming one.
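  For reference, a minimal generic sketch of teacher-student distillation in PyTorch, assuming frame-level logits from both models. This is not the icefall recipe itself; `teacher`, `student`, `features`, `asr_loss_fn`, and `targets` are hypothetical placeholders for illustration only.

  ```python
  # Minimal teacher-student distillation sketch (not the icefall recipe).
  # The frozen teacher can be non-streaming while the student is streaming;
  # only the student receives gradients.
  import torch
  import torch.nn.functional as F

  def distillation_loss(student_logits: torch.Tensor,
                        teacher_logits: torch.Tensor,
                        temperature: float = 2.0) -> torch.Tensor:
      """KL divergence between temperature-softened teacher and student outputs."""
      log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
      p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
      # kl_div expects log-probabilities as input and probabilities as target.
      return F.kl_div(log_p_student, p_teacher,
                      reduction="batchmean") * temperature ** 2

  def train_step(student, teacher, features, optimizer,
                 alpha=0.5, asr_loss_fn=None, targets=None):
      with torch.no_grad():
          teacher_logits = teacher(features)   # teacher is frozen / eval mode
      student_logits = student(features)
      loss = distillation_loss(student_logits, teacher_logits)
      if asr_loss_fn is not None:
          # Optionally mix in the usual supervised ASR loss on the labels.
          loss = alpha * loss + (1.0 - alpha) * asr_loss_fn(student_logits, targets)
      optimizer.zero_grad()
      loss.backward()
      optimizer.step()
      return loss.item()
  ```

  The actual icefall distillation recipes differ in detail (for example, they may distill intermediate representations rather than output logits), so treat this only as a sketch of the general idea.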
- Thank you! I will give it a try!
- Hello, what knowledge distillation code should I use to train a smaller model based on an already trained larger zipformer?
- Hello,
Can I use knowledge distillation (as in distillation_with_hubert.sh) to train a smaller model based on an already trained larger zipformer b2 model, or can this only be used for a fairseq model?
And if so, can I use a large non-streaming model to distill to a smaller streaming zipformer v2?
Best regards,
Joachim