Is it on the roadmap for this tool to support inference optimizations, such as speculative decoding, for GPUs and Neuron cores?