Environment
- OS: Ubuntu 24.04
- OpenVINO GenAI: 2025.3
- CPU: Intel Core Ultra 5 125H
Background
In the 2025.2 release notes, the following known issue was documented:
Component: OpenVINO GenAI
ID: 167065, 168564, 168360, 168339, 168361
Description:
Models such as Qwen-7B-Chat, Phi4-Reasoning, Llama-3.2-1B-Instruct, Qwen3-8B, and DeepSeek-R1-Distill* show reduced accuracy in chat scenarios compared to regular generation requests. Currently no workaround is available; a fix is planned for future releases.
Current Observation
With OpenVINO GenAI 2025.3, I tested Llama-3.2-1B-Instruct in chat scenarios and still observed noticeable accuracy degradation compared to:
- The same model used in regular (non-chat) generation
- Running the model via optimum.intel.openvino (OVModelForCausalLM)
This suggests the issue may still persist in 2025.3; a minimal repro sketch follows.
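For reference, this is roughly how I compared the two OpenVINO GenAI paths. The model directory, prompt, and generation settings are placeholders, not the exact values from my runs:

```python
import openvino_genai as ov_genai

# Placeholder path to the exported OpenVINO IR of Llama-3.2-1B-Instruct
model_dir = "Llama-3.2-1B-Instruct-ov"

pipe = ov_genai.LLMPipeline(model_dir, "CPU")

prompt = "Explain the difference between a process and a thread."

# Regular generation: a standalone request with no chat state
regular_output = pipe.generate(prompt, max_new_tokens=256)

# Chat scenario: the pipeline manages conversation state between calls
pipe.start_chat()
chat_output = pipe.generate(prompt, max_new_tokens=256)
pipe.finish_chat()

print("regular:", regular_output)
print("chat:   ", chat_output)
```

Even for a single-turn conversation like this, the chat-scenario output is noticeably worse than the regular-generation output.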
Questions
- Has this issue already been fixed in 2025.3?
- If not yet fixed, is there an ETA for when the fix will be released?
- Until then, is using optimum.intel.openvino's OVModelForCausalLM the recommended workaround to avoid the accuracy degradation?
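For context, this is roughly how I ran the optimum.intel path that did not show the degradation. The model ID and generation settings are illustrative, not the exact configuration I used:

```python
from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer

# Illustrative model ID; the same Llama-3.2-1B-Instruct weights as above
model_id = "meta-llama/Llama-3.2-1B-Instruct"

model = OVModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build the chat prompt explicitly via the model's chat template,
# so the conversation formatting is fully under the caller's control
messages = [
    {"role": "user", "content": "Explain the difference between a process and a thread."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```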