I'd like to benchmark the optimized performance of the LLAMA2 model with BigDL acceleration on an SPR machine. I followed the README in [python/llm/example/CPU/Native-Models](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models), which ran normally and printed the timing message. However, in that message the prompt eval time (which is also the first-token latency) is abnormal, as shown below:

```
bigdl-llm timings: prompt eval time = 0.00 ms / 1 tokens ( 0.00 ms per token)
```

The prompt eval time is zero, and the token count does not include the prompt tokens; this differs from the output message of ggml llama.cpp.
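
For reference, the benchmark was invoked roughly like the sketch below, following the Native-Models README. This is a minimal repro sketch: the model path, thread count, and prompt are placeholders rather than the exact values I used, and the module/class names reflect my reading of the README, so the API surface may differ slightly by BigDL version.

```python
# Minimal repro sketch (assumptions noted above): the model is assumed to
# have already been converted to the native INT4 format per the README.
from bigdl.llm.models import Llama

# Hypothetical path and prompt, for illustration only.
model_path = "./bigdl_llm_llama2_q4_0.bin"
prompt = "Once upon a time, there existed a little girl"

# Load the native INT4 model; n_threads is a placeholder value.
llm = Llama(model_path=model_path, n_threads=28)

# Run generation; the "bigdl-llm timings" message quoted above is printed
# by the native backend after this call returns.
output = llm(prompt, max_tokens=32)
print(output)
```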