@jerrychenhf jerrychenhf commented Nov 18, 2025

Implement warmup for P/D chunked prefill when the prefill chunk size is specified. This is the simpler warmup case for chunked prefill. When the chunk size is not specified, the number of warmup cases grows by a factor of (max_seq_len / block_size), which is not realistic to implement.

This implementation uses max_model_len and the chunk size to calculate the possible numbers of context chunks, and warms up with those context chunk counts.
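
To illustrate the calculation described above, here is a minimal sketch of how the warmup cases could be enumerated. It is not the code from this PR: the function names `context_chunk_counts` and `context_token_lengths` are hypothetical, and it assumes the prefill of a prompt of up to max_model_len tokens is split into fixed-size chunks, so each chunk sees a whole number of earlier chunks as context.

```python
def context_chunk_counts(max_model_len: int, prefill_chunk_size: int) -> list[int]:
    """Return every possible number of context chunks a prefill chunk can see.

    Hypothetical helper, assuming fixed-size chunks: a prompt of at most
    max_model_len tokens splits into at most ceil(max_model_len / chunk_size)
    chunks, and the k-th chunk (0-based) has k earlier chunks as context.
    """
    if max_model_len <= 0 or prefill_chunk_size <= 0:
        raise ValueError("max_model_len and prefill_chunk_size must be positive")
    # Integer ceiling division, avoids floating point.
    total_chunks = (max_model_len + prefill_chunk_size - 1) // prefill_chunk_size
    return list(range(total_chunks))


def context_token_lengths(max_model_len: int, prefill_chunk_size: int) -> list[int]:
    """Context lengths in tokens for each warmup case (assumed layout)."""
    return [n * prefill_chunk_size
            for n in context_chunk_counts(max_model_len, prefill_chunk_size)]


if __name__ == "__main__":
    # Example values chosen for illustration only.
    for ctx_tokens in context_token_lengths(8192, 2048):
        print(f"warm up chunked prefill with {ctx_tokens} context tokens")
```

With a fixed chunk size the number of cases is only ceil(max_model_len / chunk_size). Without it, the context length could be any multiple of block_size, which is why the number of warmup cases would scale with max_seq_len / block_size.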

@jerrychenhf jerrychenhf merged commit 46ad52b into HabanaAI:deepseek_r1 Nov 20, 2025
1 check failed