[AutoDeploy][Feature] Eagle Speculative Decoding #9241

@govind-ramnarayan

🚀 The feature, motivation and pitch

Support Eagle (specifically, Eagle-1) speculative decoding in AutoDeploy.

This continues the work started in #9147 to extend the range of speculative decoding setups that AutoDeploy supports. As in #9147, we are working in the two-model regime, with separate "target" and "draft" models. This time, however, the draft model is an Eagle-style module that reads hidden states from the target model as part of its input.
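To make the two-model regime concrete, here is a minimal toy sketch of an Eagle-style draft/verify loop. It is not AutoDeploy or TRT-LLM code: every name (`target_forward`, `draft_step`, `lm_head`, `eagle_decode`) is hypothetical, hidden states are single integers rather than tensors, and greedy verification is assumed. It only illustrates the data flow: the draft consumes the target's hidden state together with the latest token, and the target checks all proposals in one pass.

```python
from typing import List, Tuple

VOCAB = 16  # toy vocabulary size (assumption for illustration)


def lm_head(hidden: int) -> int:
    """Toy LM head, shared by target and draft: maps a hidden state to a token."""
    return (hidden * 5 + 3) % VOCAB


def target_forward(tokens: List[int]) -> Tuple[List[int], List[int]]:
    """Toy causal target model: per position, a hidden state and the greedy next token."""
    hiddens, next_tokens = [], []
    acc = 0
    for t in tokens:
        acc = (acc * 31 + t) % 997  # stand-in for a transformer hidden state
        hiddens.append(acc)
        next_tokens.append(lm_head(acc))
    return hiddens, next_tokens


def draft_step(hidden: int, token: int) -> Tuple[int, int]:
    """Toy Eagle-style draft: from (hidden state, next token) predict the next
    hidden state, then reuse the LM head. Dropping the low bit makes it lossy."""
    pred = (hidden * 31 + token) % 997
    pred -= pred % 2  # deliberate approximation error
    return pred, lm_head(pred)


def greedy_decode(prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one target pass per generated token."""
    tokens = list(prompt)
    for _ in range(n_new):
        _, nxt = target_forward(tokens)
        tokens.append(nxt[-1])
    return tokens


def eagle_decode(prompt: List[int], n_new: int, k: int = 3) -> Tuple[List[int], int]:
    """Speculative decoding with greedy verification; returns (tokens, target passes)."""
    tokens = list(prompt)
    hiddens, nxt = target_forward(tokens)  # prefill
    calls = 1
    hidden, token = hiddens[-1], nxt[-1]
    tokens.append(token)
    while len(tokens) < len(prompt) + n_new:
        # Draft phase: chain k proposals, seeded with the target's hidden state.
        draft, h, t = [], hidden, token
        for _ in range(k):
            h, t = draft_step(h, t)
            draft.append(t)
        # Verify phase: a single target pass over accepted tokens plus proposals.
        hiddens, nxt = target_forward(tokens + draft)
        calls += 1
        base = len(tokens) - 1  # index of the last accepted token
        accepted = 0
        while accepted < k and draft[accepted] == nxt[base + accepted]:
            accepted += 1
        tokens.extend(draft[:accepted])
        # Resume from the target's own hidden state and token at the last accepted position.
        last = base + accepted
        hidden, token = hiddens[last], nxt[last]
        tokens.append(token)
    return tokens[: len(prompt) + n_new], calls


prompt = [1, 2, 3]
out, calls = eagle_decode(prompt, n_new=20, k=3)
assert out == greedy_decode(prompt, n_new=20)  # same text as plain greedy decoding
print(f"{calls} target passes for 20 new tokens")
```

Under greedy verification the speculative output matches plain greedy decoding with the target, and each verification pass yields at least one token, so the draft can only reduce the number of target passes. A real integration would additionally have to expose the target's hidden states to the draft model and manage both models' KV caches, which this sketch leaves out.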

Goal: The infrastructure changes made for this feature should carry over to supporting MTPEagle speculative decoding in AutoDeploy. This needs some additional verification of the MTPEagle-related code in TRT-LLM.

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.

Metadata

Labels

  • AutoDeploy: <NV> AutoDeploy Backend
  • Speculative Decoding: <NV> MTP/Eagle/Medusa/Lookahead/Prompt-Lookup-Decoding/Draft-Target-Model/ReDrafter
  • feature request: New feature or request. This includes new model, dtype, functionality support

Type

No type

Projects

Status

Ready

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
