Labels
Investigating · Triton backend&lt;NV&gt; (Related to NVIDIA Triton Inference Server backend) · bug (Something isn't working) · triaged (Issue has been triaged by maintainers)
Description
System Info
NVIDIA RTX 3090 Ti
nvcr.io/nvidia/tritonserver:25.05-trtllm-python-py3
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Steps to reproduce the behavior:
- Take any compiled TensorRT-LLM engine (plan).
- Delete any of the parameters tokenizer_dir, xgrammar_tokenizer_info_path, or guided_decoding_backend from config.pbtxt (see the sketch below).
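For reference, the corresponding entries in the tensorrt_llm model's config.pbtxt look roughly like the sketch below (a sketch only; the parameter names come from this issue, the surrounding template may differ between versions). Removing any one of these blocks entirely is enough to trigger the crash:

parameters: {
  key: "guided_decoding_backend"
  value: {
    string_value: "${guided_decoding_backend}"
  }
}
parameters: {
  key: "xgrammar_tokenizer_info_path"
  value: {
    string_value: "${xgrammar_tokenizer_info_path}"
  }
}
parameters: {
  key: "tokenizer_dir"
  value: {
    string_value: "${tokenizer_dir}"
  }
}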
Expected behavior
tensorrtllm_backend should start normally.
Actual behavior
tensorrtllm_backend crashes with a message saying the parameter is missing, even though the parameter is not actually used (guided decoding is simply left disabled).
Additional notes
TensorRT-LLM/triton_backend/inflight_batcher_llm/src/model_instance_state.cc
Lines 410 to 453 in 7b210ae
std::optional<executor::GuidedDecodingConfig> ModelInstanceState::getGuidedDecodingConfigFromParams()
{
    std::optional<executor::GuidedDecodingConfig> guidedDecodingConfig = std::nullopt;
    std::string tokenizerDir = model_state_->GetParameter<std::string>("tokenizer_dir");
    std::string tokenizerInfoPath = model_state_->GetParameter<std::string>("xgrammar_tokenizer_info_path");
    std::string guidedDecodingBackendStr = model_state_->GetParameter<std::string>("guided_decoding_backend");
    if (!tokenizerDir.empty() && tokenizerDir != "${tokenizer_dir}")
    {
        TLLM_LOG_INFO(
            "Guided decoding C++ workflow does not use tokenizer_dir, this parameter will "
            "be ignored.");
    }
    if (guidedDecodingBackendStr.empty() || guidedDecodingBackendStr == "${guided_decoding_backend}"
        || tokenizerInfoPath.empty() || tokenizerInfoPath == "${xgrammar_tokenizer_info_path}")
    {
        return guidedDecodingConfig;
    }
    TLLM_CHECK_WITH_INFO(std::filesystem::exists(tokenizerInfoPath),
        "Xgrammar's tokenizer info path at %s does not exist.", tokenizerInfoPath.c_str());
    auto const tokenizerInfo = nlohmann::json::parse(std::ifstream{std::filesystem::path(tokenizerInfoPath)});
    auto const encodedVocab = tokenizerInfo["encoded_vocab"].template get<std::vector<std::string>>();
    auto const tokenizerStr = tokenizerInfo["tokenizer_str"].template get<std::string>();
    auto const stopTokenIds
        = tokenizerInfo["stop_token_ids"].template get<std::vector<tensorrt_llm::runtime::TokenIdType>>();
    executor::GuidedDecodingConfig::GuidedDecodingBackend guidedDecodingBackend;
    if (guidedDecodingBackendStr == "xgrammar")
    {
        guidedDecodingBackend = executor::GuidedDecodingConfig::GuidedDecodingBackend::kXGRAMMAR;
    }
    else
    {
        TLLM_THROW(
            "Guided decoding is currently supported with 'xgrammar' backend. Invalid guided_decoding_backend parameter "
            "provided.");
    }
    guidedDecodingConfig
        = executor::GuidedDecodingConfig(guidedDecodingBackend, encodedVocab, tokenizerStr, stopTokenIds);
    return guidedDecodingConfig;
}
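The function already treats an unset ${...} placeholder as "guided decoding disabled", but the three GetParameter calls happen before that check, so a parameter that is removed from config.pbtxt altogether appears to abort model loading. A minimal sketch of a possible mitigation, assuming ModelState::GetParameter throws a std::exception-derived error when the key is absent (not verified here), and using a hypothetical helper name getOptionalParam: fall back to an empty string so a missing parameter behaves exactly like an unset placeholder.

// Sketch only: tolerate a parameter that was deleted from config.pbtxt.
// Assumes GetParameter throws (std::exception or derived) for a missing key.
std::string getOptionalParam(ModelState* modelState, std::string const& name)
{
    try
    {
        return modelState->GetParameter<std::string>(name);
    }
    catch (std::exception const& e)
    {
        // Missing key: treat it like an unset ${...} placeholder so guided
        // decoding is simply disabled instead of the backend crashing.
        TLLM_LOG_WARNING(
            "Parameter '%s' not found in config.pbtxt; guided decoding will be disabled.", name.c_str());
        return std::string{};
    }
}

// In getGuidedDecodingConfigFromParams(), the three lookups would then become:
// std::string tokenizerDir = getOptionalParam(model_state_, "tokenizer_dir");
// std::string tokenizerInfoPath = getOptionalParam(model_state_, "xgrammar_tokenizer_info_path");
// std::string guidedDecodingBackendStr = getOptionalParam(model_state_, "guided_decoding_backend");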