Problem
The preflight LLM check in .github/run-eval/resolve_model_config.py fails for thinking models like NVIDIA Nemotron-3 Super 120B.
When enable_thinking: true is set in the model config, the model emits its output as reasoning_content instead of content; with the preflight's small token budget, the entire budget is consumed by reasoning (note finish_reason=length below), so content comes back empty. The current preflight check only validates content, so it always sees an empty response and aborts the evaluation.
✗ NVIDIA Nemotron-3 Super 120B: Empty response (finish_reason=length, usage=Usage(completion_tokens=100, ...))
Fix
Also accept reasoning_content alongside content. Since reasoning_content is a provider-specific extension and not a standard field on the SDK message type, guard the access with getattr:
message = response.choices[0].message if response.choices else None
response_content = message.content if message else None
# reasoning_content is a provider extension; it may be absent on non-thinking models
reasoning_content = getattr(message, "reasoning_content", None) if message else None
if response_content or reasoning_content: