Help wanted: add before-vs-after proof entries for kernel skills

## What this is

The `proof/` directory holds empirical evidence that skill files produce measurably better kernel code. Each entry compares kernels generated by the same model from the same prompt, with and without the skill file injected into context.

There is currently one entry:
- [`proof/cuda/softmax/`](proof/cuda/softmax/softmax-correctness.md) — validates `write-cuda-softmax-kernel` (RTX 4070, Claude Sonnet 4.6, 2 bug classes caught)

**Every other skill in this repo has no proof entry yet.**

---

## What we are looking for

Run a before-vs-after comparison for any skill in `skills/`. The bar is intentionally low: a screenshot, a correctness table, or a chart is enough, as long as it includes at least one correctness check.

### Skills that most need proof entries

| Skill | Category | Priority |
|---|---|---|
| `write-cuda-reduction-kernel` | cuda | high |
| `write-cuda-gemm-kernel` | cuda | high |
| `write-cuda-layernorm-kernel` | cuda | high |
| `write-triton-softmax-kernel` | triton | high |
| `write-triton-attention-kernel` | triton | high |
| `write-int8-quantized-kernel` | quantization | high |
| `write-fp8-kernel` | quantization | high |
| `avoid-warp-divergence` | cuda | medium |
| `write-numerically-stable-kernel` | patterns | medium |
| `handle-boundary-conditions` | patterns | medium |
| `port-cuda-kernel-to-triton` | portability | medium |

---

## How to contribute a proof

1. Pick a skill from the table above (comment below to claim it so no one duplicates effort).
2. Generate a kernel **without** the skill file using any capable coding model.
3. Generate the same kernel **with** the skill file injected into context. Same model, same base prompt.
4. Run both. Compare correctness and, if you can, performance.
5. Create `proof/<category>/<kernel-name>/` and drop in your artifacts.
6. Open a PR.

Full instructions: [proof/README.md](proof/README.md)
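
The comparison in steps 2 to 4 can be sketched as a small harness. This is a minimal illustration, not the format `proof/README.md` prescribes: the function names, the tolerance, and the test cases are all assumptions, and the two `softmax_*` functions are toy pure-Python stand-ins, not output from any model. In a real entry they would be wrappers that launch the two generated kernels.

```python
import math


def reference_softmax(row):
    """Numerically stable softmax used as ground truth."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def check_kernel(kernel, rows, tol=1e-6):
    """Return True if `kernel` matches the reference on every row within `tol`."""
    for row in rows:
        try:
            got = kernel(row)
        except (OverflowError, ZeroDivisionError):
            return False  # a crash counts as a correctness failure
        want = reference_softmax(row)
        if len(got) != len(want):
            return False
        for g, w in zip(got, want):
            if math.isnan(g) or abs(g - w) > tol:
                return False
    return True


def pass_fail_table(kernels, cases):
    """Render a markdown pass/fail matrix: one row per case, one column per kernel."""
    names = list(kernels)
    lines = [
        "| Case | " + " | ".join(names) + " |",
        "|---|" + "---|" * len(names),
    ]
    for case_name, rows in cases.items():
        cells = ["pass" if check_kernel(kernels[n], rows) else "FAIL" for n in names]
        lines.append("| " + case_name + " | " + " | ".join(cells) + " |")
    return "\n".join(lines)


# Hypothetical stand-ins for the two generated kernels (assumption: toy code only).
def softmax_without_skill(row):
    exps = [math.exp(x) for x in row]  # no max subtraction: overflows on large inputs
    total = sum(exps)
    return [e / total for e in exps]


def softmax_with_skill(row):
    m = max(row)  # subtract the row max before exponentiating
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


cases = {
    "small values": [[0.0, 1.0, 2.0], [-1.0, 0.5]],
    "large values": [[1000.0, 1001.0, 1002.0]],
}
kernels = {"without skill": softmax_without_skill, "with skill": softmax_with_skill}
print(pass_fail_table(kernels, cases))
```

The printed table can be pasted straight into a proof entry. Including at least one adversarial case (here, large inputs) is what makes a correctness difference visible; a harness that only tests small, well-behaved values would pass both versions.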

---

## Minimum bar

- Same model, same base prompt — only the skill file differs between the two runs.
- At least one correctness check (not just speed numbers).
- The hardware model and the input shapes tested are noted somewhere in the entry.

A chart is nice but not required. Raw numbers in a table are fine. A screenshot works.
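
One way to keep the two runs comparable is to build both prompts from a single base string and record the run details next to the results. The sketch below assumes the skill file is simply prepended to the prompt; the field names, the placeholder values, and the example prompt are hypothetical, and `proof/README.md` may ask for a different layout.

```python
import hashlib
import json


def build_prompt_pair(base_prompt, skill_text):
    """Return (without_skill, with_skill) prompts that differ only by the skill file."""
    without_skill = base_prompt
    with_skill = skill_text + "\n\n" + base_prompt
    return without_skill, with_skill


def sha256_hex(text):
    """Hash the prompt so readers can confirm both runs used the same base text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def run_metadata(model, hardware, shapes, base_prompt, skill_name):
    """Collect the facts a reader needs to reproduce the comparison."""
    return {
        "model": model,
        "hardware": hardware,
        "shapes": [list(s) for s in shapes],
        "skill": skill_name,
        "base_prompt_sha256": sha256_hex(base_prompt),
    }


# Placeholder inputs (assumptions for illustration only).
base = "Write a CUDA kernel that computes a row-wise sum reduction."
skill = "(contents of the skill file go here)"

without_skill, with_skill = build_prompt_pair(base, skill)
# The only difference between the two prompts is the injected skill text.
assert with_skill.endswith(without_skill)

meta = run_metadata(
    model="<model name>",
    hardware="<GPU model>",
    shapes=[(1024, 1024), (4096, 8192)],
    base_prompt=base,
    skill_name="write-cuda-reduction-kernel",
)
print(json.dumps(meta, indent=2))
```

Saving the printed JSON alongside the artifacts covers the hardware and shapes requirement in one place.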

---

## What a strong entry looks like

See the existing softmax proof as a reference:
- [proof/cuda/softmax/softmax-correctness.md](proof/cuda/softmax/softmax-correctness.md)
- Includes: pass/fail matrix, error cliff chart, adversarial failure count, root-cause explanation, animated diff

You do not need to match that level of polish for a first entry. Correctness and reproducibility matter more than visual quality.

