Labels: bug (Something isn't working), llama (For any PR / issue related to Llama herd support)
Description
⚙️ Your current environment
The output of `python collect_env.py`:
### Environment Information ###
Operating System: `Linux-6.8.0-85-generic-x86_64-with-glibc2.39`
Python Version: `3.12.3 (main, Aug 14 2025, 17:47:21) [GCC 13.3.0]`
llm-compressor Version: `0.8.2.dev54+g6fea8880.d20251119`
compressed-tensors Version: `0.12.3a20251114`
transformers Version: `4.57.1`
torch Version: `2.9.0`
CUDA Devices: `['NVIDIA B200', 'NVIDIA B200', 'NVIDIA B200', 'NVIDIA B200', 'NVIDIA B200', 'NVIDIA B200', 'NVIDIA B200', 'NVIDIA B200']`
AMD Devices: `None`
🐛 Describe the bug
When running `examples/quantization_w4a4_fp4/llama4_example.py` to quantize the Llama-4-Maverick-17B-128E-Instruct model, the generated `config.json` places all routed MoE experts in the ignore list, so none of them are quantized. These experts should be quantized.
The shared expert is quantized correctly, but all 128 routed experts remain in full precision, and no quantized expert tensors such as `w1_weight`, `w2_weight`, or `w3_weight` are produced. A quick way to confirm this from the exported `config.json` is sketched below.
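A minimal inspection sketch, assuming the checkpoint was saved with a compressed-tensors style `quantization_config` block in `config.json`; the output path below is hypothetical:

```python
import json
import re

# Hypothetical path to the quantized output directory
config_path = "Llama-4-Maverick-17B-128E-Instruct-NVFP4/config.json"

with open(config_path) as f:
    config = json.load(f)

quant_config = config.get("quantization_config", {})
ignore = quant_config.get("ignore", [])

# Count how many ignore entries point at routed experts vs. everything else
expert_entries = [name for name in ignore if re.search(r"\.experts\.", name)]
print(f"total ignored modules: {len(ignore)}")
print(f"ignored routed-expert modules: {len(expert_entries)}")
print("sample:", expert_entries[:5])
```

In the failing case, essentially every routed-expert module shows up in the ignore list, while the shared expert does not.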
This causes vLLM to fail when loading the model with:
`KeyError: 'layers.17.feed_forward.experts.126.w1_weight'`
because the expected quantized expert parameters were never created. A sketch for checking which expert tensors actually exist in the exported checkpoint follows.
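A minimal sketch, assuming the exported checkpoint is sharded safetensors and that quantized tensors carry compressed-tensors style suffixes (e.g. `weight_packed` / `weight_scale`); the path and those suffixes are assumptions and may differ for this export:

```python
import glob
from safetensors import safe_open

# Hypothetical path to the quantized output directory
ckpt_dir = "Llama-4-Maverick-17B-128E-Instruct-NVFP4"

expert_keys = []
for shard in glob.glob(f"{ckpt_dir}/*.safetensors"):
    with safe_open(shard, framework="pt") as f:
        expert_keys.extend(k for k in f.keys() if ".experts." in k)

# If the routed experts were quantized, quantized-weight suffixes should appear;
# plain .weight keys indicate the experts were serialized unquantized.
quantized = [k for k in expert_keys if "weight_packed" in k or "weight_scale" in k]
print(f"expert tensors found: {len(expert_keys)}")
print(f"expert tensors with quantized suffixes: {len(quantized)}")
print("sample:", sorted(expert_keys)[:5])
```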
Impact:
MoE expert weights are silently omitted during quantization, producing incomplete checkpoints that cannot be served with vLLM.
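For reference, a hypothetical recipe tweak I considered while debugging, which explicitly targets the routed-expert projections instead of relying only on the `Linear` target; the module-name regexes are assumptions about the Llama 4 naming scheme and this sketch has not been verified to resolve the issue:

```python
from llmcompressor.modifiers.quantization import QuantizationModifier

# Sketch only: add an explicit regex target for the routed experts so they
# cannot silently fall into the ignore list. Module names are assumptions and
# should be checked against model.named_modules() before use.
recipe = QuantizationModifier(
    targets=["Linear", r"re:.*feed_forward\.experts.*"],
    scheme="NVFP4",
    ignore=["re:.*lm_head", "re:.*router"],
)
```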
🛠️ Steps to reproduce
No response