From 7f208247d8541bfe66efdf381414eb731d360d7a Mon Sep 17 00:00:00 2001
From: Vaibhav Jindal
Date: Thu, 14 May 2026 15:37:15 -0700
Subject: [PATCH] Make skills vendor-agnostic via .agents/skills + symlink

Move .claude/skills to .agents/skills as the canonical (vendor-neutral) location, with .claude/skills now a symlink so Claude Code still auto-discovers them. This mirrors the pattern NVIDIA TileGym uses.

Skill content updates:
- SKILL.md stage intros rephrased from "Spawn an X agent" to "Follow the X workflow in ", which describes the workflow without mandating any single runtime's subagent mechanism. A note explains that runtimes with parallel subagent support may still delegate.
- finalizer.md commit/PR templates no longer hardcode a Claude-specific Co-Authored-By trailer or "Generated with Claude Code" footer; runtime attribution is left to the host runtime's own convention.

Adds a root-level AGENTS.md (a convention picked up by Codex, Cursor, Aider, Gemini CLI, etc.) pointing readers to .agents/skills/ and summarising the three skills.
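As a rough illustration of the attribution change described in the commit message, a host runtime could append its own trailer when it renders the commit text. The Python sketch below is not part of this patch: `with_attribution` and the example trailer are hypothetical, and it assumes the message has no existing trailer block.

```python
from typing import Optional


def with_attribution(message: str, trailer: Optional[str] = None) -> str:
    """Return `message` with an optional runtime-supplied trailer appended.

    Hypothetical helper: `trailer` stands for whatever attribution line the
    host runtime wants (for example a Co-Authored-By line). Assumes the
    message does not already end in a trailer block.
    """
    body = message.rstrip()
    if not trailer:
        # No runtime convention: leave the template output untouched.
        return body + "\n"
    trailer = trailer.strip()
    if trailer in body.splitlines():
        # Already present (e.g. the template was rendered twice).
        return body + "\n"
    # Trailers live in their own final paragraph, so add a blank line first.
    return f"{body}\n\n{trailer}\n"
```

With no trailer supplied the template text passes through unchanged, which matches the vendor-neutral default this commit aims for.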
---
 .../skills/liger-autopatch/SKILL.md           | 12 +++----
 .../skills/liger-autopatch/code-generator.md  |  0
 .../skills/liger-autopatch/decision-matrix.md |  0
 .../liger-autopatch/examples/gemma-profile.md |  0
 .../liger-autopatch/examples/llama-profile.md |  0
 .../skills/liger-autopatch/model-analyzer.md  |  0
 .../templates/lce-forward-dense.md            |  0
 .../templates/lce-forward-moe.md              |  0
 .../templates/monkey-patch-fn.md              |  0
 .../templates/test-convergence.md             |  0
 .../templates/test-instance-patch.md          |  0
 .../skills/liger-autopatch/validator.md       |  0
 .../skills/liger-kernel-dev/SKILL.md          |  6 ++--
 .../skills/liger-kernel-dev/analyzer.md       |  0
 .../examples/cross-entropy-profile.md         |  0
 .../examples/rms-norm-profile.md              |  0
 .../examples/swiglu-profile.md                |  0
 .../skills/liger-kernel-dev/generator.md      |  0
 .../liger-kernel-dev/kernel-profile-format.md |  0
 .../liger-kernel-dev/templates/benchmark.md   |  0
 .../templates/functional-api.md               |  0
 .../templates/module-wrapper.md               |  0
 .../liger-kernel-dev/templates/ops-kernel.md  |  0
 .../liger-kernel-dev/templates/unit-test.md   |  0
 .../skills/liger-kernel-dev/validator.md      |  0
 .../skills/liger-kernel-perf/SKILL.md         | 12 +++----
 .../skills/liger-kernel-perf/finalizer.md     |  8 ++---
 .../optimization-strategies.md                |  0
 .../skills/liger-kernel-perf/optimizer.md     |  0
 .../skills/liger-kernel-perf/profiler.md      |  0
 .../templates/optimization-profile.md         |  0
 .../templates/variant-notes.md                |  0
 .claude/skills                                |  1 +
 AGENTS.md                                     | 33 +++++++++++++++++++
 34 files changed, 53 insertions(+), 19 deletions(-)
 rename {.claude => .agents}/skills/liger-autopatch/SKILL.md (86%)
 rename {.claude => .agents}/skills/liger-autopatch/code-generator.md (100%)
 rename {.claude => .agents}/skills/liger-autopatch/decision-matrix.md (100%)
 rename {.claude => .agents}/skills/liger-autopatch/examples/gemma-profile.md (100%)
 rename {.claude => .agents}/skills/liger-autopatch/examples/llama-profile.md (100%)
 rename {.claude => .agents}/skills/liger-autopatch/model-analyzer.md (100%)
 rename {.claude => .agents}/skills/liger-autopatch/templates/lce-forward-dense.md (100%)
 rename {.claude => .agents}/skills/liger-autopatch/templates/lce-forward-moe.md (100%)
 rename {.claude => .agents}/skills/liger-autopatch/templates/monkey-patch-fn.md (100%)
 rename {.claude => .agents}/skills/liger-autopatch/templates/test-convergence.md (100%)
 rename {.claude => .agents}/skills/liger-autopatch/templates/test-instance-patch.md (100%)
 rename {.claude => .agents}/skills/liger-autopatch/validator.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/SKILL.md (89%)
 rename {.claude => .agents}/skills/liger-kernel-dev/analyzer.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/examples/cross-entropy-profile.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/examples/rms-norm-profile.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/examples/swiglu-profile.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/generator.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/kernel-profile-format.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/templates/benchmark.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/templates/functional-api.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/templates/module-wrapper.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/templates/ops-kernel.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/templates/unit-test.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-dev/validator.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-perf/SKILL.md (93%)
 rename {.claude => .agents}/skills/liger-kernel-perf/finalizer.md (98%)
 rename {.claude => .agents}/skills/liger-kernel-perf/optimization-strategies.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-perf/optimizer.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-perf/profiler.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-perf/templates/optimization-profile.md (100%)
 rename {.claude => .agents}/skills/liger-kernel-perf/templates/variant-notes.md (100%)
 create mode 120000 .claude/skills
 create mode 100644 AGENTS.md

diff --git a/.claude/skills/liger-autopatch/SKILL.md b/.agents/skills/liger-autopatch/SKILL.md
similarity index 86%
rename from .claude/skills/liger-autopatch/SKILL.md
rename to .agents/skills/liger-autopatch/SKILL.md
index d686fd755..f25aca596 100644
--- a/.claude/skills/liger-autopatch/SKILL.md
+++ b/.agents/skills/liger-autopatch/SKILL.md
@@ -18,15 +18,15 @@ Keywords that suggest modify mode: update, fix, change, add [kernel] to [existin
 
 ### Stage 1: Analyze
 
-Spawn a **Model Analyzer** agent (read [model-analyzer.md](model-analyzer.md)).
+Follow the **Model Analyzer** workflow in [model-analyzer.md](model-analyzer.md). If the host runtime supports parallel subagents, this stage may be delegated to one; otherwise execute the workflow directly.
 
-The agent reads the HF `modeling_*.py` source and produces a **model profile** answering 12 architectural questions from [decision-matrix.md](decision-matrix.md).
+This stage reads the HF `modeling_*.py` source and produces a **model profile** answering 12 architectural questions from [decision-matrix.md](decision-matrix.md).
 
 **Human checkpoint:** Present the profile. Confirm before proceeding.
 
 ### Stage 2: Generate
 
-Spawn a **Code Generator** agent (read [code-generator.md](code-generator.md)).
+Follow the **Code Generator** workflow in [code-generator.md](code-generator.md).
 
 Generates/modifies up to 13 files:
 
@@ -48,7 +48,7 @@ Generates/modifies up to 13 files:
 
 ### Stage 3: Validate
 
-Spawn a **Validator** agent (read [validator.md](validator.md)).
+Follow the **Validator** workflow in [validator.md](validator.md).
 
 Runs instance patching test, convergence test, and lint check. Retries up to 3 times on failure.
 
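One way a runtime might honour the "Retries up to 3 times on failure" rule in the Validate stage is a bounded retry loop around each check. This Python sketch is an assumption rather than anything taken from `validator.md`: `run_with_retries` is a hypothetical helper, and it reads "up to 3 times" as at most three attempts in total.

```python
import subprocess
from typing import Sequence


def run_with_retries(cmd: Sequence[str], max_attempts: int = 3) -> int:
    """Run `cmd` until it exits with status 0 or `max_attempts` is used up.

    Returns the number of attempts taken; raises RuntimeError if every
    attempt fails. Hypothetical helper, not part of the skill files.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        # A real workflow would inspect the failure and repair the generated
        # code here before trying again; this sketch simply reruns the command.
    raise RuntimeError(f"{list(cmd)!r} still failing after {max_attempts} attempts")


# Example call, using the instance patching command quoted in this skill
# ("llama" is a placeholder model type):
# run_with_retries(["pytest", "test/transformers/test_monkey_patch.py", "-k", "llama", "-xvs"])
```

Raising after the final attempt keeps the stage a hard stop instead of letting a failing check pass silently.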
@@ -69,13 +69,13 @@ Read the existing `apply_liger_kernel_to_{model_type}` function in `monkey_patch
 
 ### Stage 2: Apply Changes
 
-Spawn the **Code Generator** agent (read [code-generator.md](code-generator.md)) in **modify mode**.
+Follow the **Code Generator** workflow in [code-generator.md](code-generator.md) in **modify mode**.
 
 **Human checkpoint:** Present changes for review.
 
 ### Stage 3: Validate
 
-Spawn the **Validator** agent (read [validator.md](validator.md)). This stage is **mandatory** — do not skip it. At minimum, run:
+Follow the **Validator** workflow in [validator.md](validator.md). This stage is **mandatory** — do not skip it. At minimum, run:
 
 1. Instance patching test: `pytest test/transformers/test_monkey_patch.py -k "{model_type}" -xvs`
 2. All convergence tests for the model:
diff --git a/.claude/skills/liger-autopatch/code-generator.md b/.agents/skills/liger-autopatch/code-generator.md
similarity index 100%
rename from .claude/skills/liger-autopatch/code-generator.md
rename to .agents/skills/liger-autopatch/code-generator.md
diff --git a/.claude/skills/liger-autopatch/decision-matrix.md b/.agents/skills/liger-autopatch/decision-matrix.md
similarity index 100%
rename from .claude/skills/liger-autopatch/decision-matrix.md
rename to .agents/skills/liger-autopatch/decision-matrix.md
diff --git a/.claude/skills/liger-autopatch/examples/gemma-profile.md b/.agents/skills/liger-autopatch/examples/gemma-profile.md
similarity index 100%
rename from .claude/skills/liger-autopatch/examples/gemma-profile.md
rename to .agents/skills/liger-autopatch/examples/gemma-profile.md
diff --git a/.claude/skills/liger-autopatch/examples/llama-profile.md b/.agents/skills/liger-autopatch/examples/llama-profile.md
similarity index 100%
rename from .claude/skills/liger-autopatch/examples/llama-profile.md
rename to .agents/skills/liger-autopatch/examples/llama-profile.md
diff --git a/.claude/skills/liger-autopatch/model-analyzer.md b/.agents/skills/liger-autopatch/model-analyzer.md
similarity index 100%
rename from .claude/skills/liger-autopatch/model-analyzer.md
rename to .agents/skills/liger-autopatch/model-analyzer.md
diff --git a/.claude/skills/liger-autopatch/templates/lce-forward-dense.md b/.agents/skills/liger-autopatch/templates/lce-forward-dense.md
similarity index 100%
rename from .claude/skills/liger-autopatch/templates/lce-forward-dense.md
rename to .agents/skills/liger-autopatch/templates/lce-forward-dense.md
diff --git a/.claude/skills/liger-autopatch/templates/lce-forward-moe.md b/.agents/skills/liger-autopatch/templates/lce-forward-moe.md
similarity index 100%
rename from .claude/skills/liger-autopatch/templates/lce-forward-moe.md
rename to .agents/skills/liger-autopatch/templates/lce-forward-moe.md
diff --git a/.claude/skills/liger-autopatch/templates/monkey-patch-fn.md b/.agents/skills/liger-autopatch/templates/monkey-patch-fn.md
similarity index 100%
rename from .claude/skills/liger-autopatch/templates/monkey-patch-fn.md
rename to .agents/skills/liger-autopatch/templates/monkey-patch-fn.md
diff --git a/.claude/skills/liger-autopatch/templates/test-convergence.md b/.agents/skills/liger-autopatch/templates/test-convergence.md
similarity index 100%
rename from .claude/skills/liger-autopatch/templates/test-convergence.md
rename to .agents/skills/liger-autopatch/templates/test-convergence.md
diff --git a/.claude/skills/liger-autopatch/templates/test-instance-patch.md b/.agents/skills/liger-autopatch/templates/test-instance-patch.md
similarity index 100%
rename from .claude/skills/liger-autopatch/templates/test-instance-patch.md
rename to .agents/skills/liger-autopatch/templates/test-instance-patch.md
diff --git a/.claude/skills/liger-autopatch/validator.md b/.agents/skills/liger-autopatch/validator.md
similarity index 100%
rename from .claude/skills/liger-autopatch/validator.md
rename to .agents/skills/liger-autopatch/validator.md
diff --git a/.claude/skills/liger-kernel-dev/SKILL.md b/.agents/skills/liger-kernel-dev/SKILL.md
similarity index 89%
rename from .claude/skills/liger-kernel-dev/SKILL.md
rename to .agents/skills/liger-kernel-dev/SKILL.md
index 37951cb61..3958d0ba6 100644
--- a/.claude/skills/liger-kernel-dev/SKILL.md
+++ b/.agents/skills/liger-kernel-dev/SKILL.md
@@ -16,7 +16,7 @@ Develops Triton kernels for Liger Kernel through a 3-stage pipeline with human r
 
 ### Stage 1: Analyze
 
-Spawn an **Analyzer** agent (read [analyzer.md](analyzer.md)).
+Follow the **Analyzer** workflow in [analyzer.md](analyzer.md). If the host runtime supports parallel subagents, this stage may be delegated to one; otherwise execute the workflow directly.
 
 Accepts any input: local file, URL, code snippet, natural language description, or model component reference. Produces a standalone PyTorch reference implementation and a kernel profile.
 
@@ -24,7 +24,7 @@ Accepts any input: local file, URL, code snippet, natural language description,
 
 ### Stage 2: Generate
 
-Spawn a **Generator** agent (read [generator.md](generator.md)).
+Follow the **Generator** workflow in [generator.md](generator.md).
 
 Generates/modifies up to 8 files:
 
@@ -41,7 +41,7 @@ Generates/modifies up to 8 files:
 
 ### Stage 3: Validate
 
-Spawn a **Validator** agent (read [validator.md](validator.md)).
+Follow the **Validator** workflow in [validator.md](validator.md).
 
 Runs checkstyle, unit tests (hard gate — stops on persistent failure), benchmarks, and generates plots. Optionally runs ncu profiling.
 
diff --git a/.claude/skills/liger-kernel-dev/analyzer.md b/.agents/skills/liger-kernel-dev/analyzer.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/analyzer.md
rename to .agents/skills/liger-kernel-dev/analyzer.md
diff --git a/.claude/skills/liger-kernel-dev/examples/cross-entropy-profile.md b/.agents/skills/liger-kernel-dev/examples/cross-entropy-profile.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/examples/cross-entropy-profile.md
rename to .agents/skills/liger-kernel-dev/examples/cross-entropy-profile.md
diff --git a/.claude/skills/liger-kernel-dev/examples/rms-norm-profile.md b/.agents/skills/liger-kernel-dev/examples/rms-norm-profile.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/examples/rms-norm-profile.md
rename to .agents/skills/liger-kernel-dev/examples/rms-norm-profile.md
diff --git a/.claude/skills/liger-kernel-dev/examples/swiglu-profile.md b/.agents/skills/liger-kernel-dev/examples/swiglu-profile.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/examples/swiglu-profile.md
rename to .agents/skills/liger-kernel-dev/examples/swiglu-profile.md
diff --git a/.claude/skills/liger-kernel-dev/generator.md b/.agents/skills/liger-kernel-dev/generator.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/generator.md
rename to .agents/skills/liger-kernel-dev/generator.md
diff --git a/.claude/skills/liger-kernel-dev/kernel-profile-format.md b/.agents/skills/liger-kernel-dev/kernel-profile-format.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/kernel-profile-format.md
rename to .agents/skills/liger-kernel-dev/kernel-profile-format.md
diff --git a/.claude/skills/liger-kernel-dev/templates/benchmark.md b/.agents/skills/liger-kernel-dev/templates/benchmark.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/templates/benchmark.md
rename to .agents/skills/liger-kernel-dev/templates/benchmark.md
diff --git a/.claude/skills/liger-kernel-dev/templates/functional-api.md b/.agents/skills/liger-kernel-dev/templates/functional-api.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/templates/functional-api.md
rename to .agents/skills/liger-kernel-dev/templates/functional-api.md
diff --git a/.claude/skills/liger-kernel-dev/templates/module-wrapper.md b/.agents/skills/liger-kernel-dev/templates/module-wrapper.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/templates/module-wrapper.md
rename to .agents/skills/liger-kernel-dev/templates/module-wrapper.md
diff --git a/.claude/skills/liger-kernel-dev/templates/ops-kernel.md b/.agents/skills/liger-kernel-dev/templates/ops-kernel.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/templates/ops-kernel.md
rename to .agents/skills/liger-kernel-dev/templates/ops-kernel.md
diff --git a/.claude/skills/liger-kernel-dev/templates/unit-test.md b/.agents/skills/liger-kernel-dev/templates/unit-test.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/templates/unit-test.md
rename to .agents/skills/liger-kernel-dev/templates/unit-test.md
diff --git a/.claude/skills/liger-kernel-dev/validator.md b/.agents/skills/liger-kernel-dev/validator.md
similarity index 100%
rename from .claude/skills/liger-kernel-dev/validator.md
rename to .agents/skills/liger-kernel-dev/validator.md
diff --git a/.claude/skills/liger-kernel-perf/SKILL.md b/.agents/skills/liger-kernel-perf/SKILL.md
similarity index 93%
rename from .claude/skills/liger-kernel-perf/SKILL.md
rename to .agents/skills/liger-kernel-perf/SKILL.md
index 05beadab7..afc48a152 100644
--- a/.claude/skills/liger-kernel-perf/SKILL.md
+++ b/.agents/skills/liger-kernel-perf/SKILL.md
@@ -42,9 +42,9 @@ If any validation fails, report clearly and stop.
 
 ### Stage 1: Profile
 
-Spawn a **Profiler** agent (read [profiler.md](profiler.md)).
+Follow the **Profiler** workflow in [profiler.md](profiler.md). If the host runtime supports parallel subagents, this stage may be delegated to one; otherwise execute the workflow directly.
 
-The agent:
+This stage:
 1. Creates the workspace directory `optimization/{kernel}/`
 2. Copies the original kernel as a snapshot
 3. Runs baseline benchmarks using the existing benchmark script
@@ -59,9 +59,9 @@ The agent:
 
 ### Stage 2: Optimize
 
-Spawn an **Optimizer** agent (read [optimizer.md](optimizer.md)).
+Follow the **Optimizer** workflow in [optimizer.md](optimizer.md).
 
-The agent runs an autonomous optimization loop:
+This stage runs an autonomous optimization loop:
 
 1. Read the optimization profile and original kernel
 2. **Always try parameter tuning first** (BLOCK_SIZE, num_warps, num_stages manual sweep -- NOT @triton.autotune)
@@ -81,9 +81,9 @@ The agent runs an autonomous optimization loop:
 
 ### Stage 3: Finalize
 
-Spawn a **Finalizer** agent (read [finalizer.md](finalizer.md)).
+Follow the **Finalizer** workflow in [finalizer.md](finalizer.md).
 
-The agent:
+This stage:
 1. Applies the winning variant in-place to `src/liger_kernel/ops/{kernel}.py`
 2. Runs the full test suite: `python -m pytest test/transformers/test_{kernel}.py -xvs` (hard gate)
 3. Runs checkstyle: `make checkstyle` (auto-fix with `ruff check . --fix && ruff format .`)
diff --git a/.claude/skills/liger-kernel-perf/finalizer.md b/.agents/skills/liger-kernel-perf/finalizer.md
similarity index 98%
rename from .claude/skills/liger-kernel-perf/finalizer.md
rename to .agents/skills/liger-kernel-perf/finalizer.md
index 98303fdf1..852876eb3 100644
--- a/.claude/skills/liger-kernel-perf/finalizer.md
+++ b/.agents/skills/liger-kernel-perf/finalizer.md
@@ -276,12 +276,12 @@ Benchmark results ({gpu_name}):
 - Speed (backward): {delta}% {faster/slower}
 - Speed (full): {delta}% {faster/slower}
 - Memory: {delta}% {reduction/increase}
-
-Co-Authored-By: Claude Opus 4.6 (1M context)
 EOF
 )"
 ```
 
+If the host AI runtime has its own commit-attribution convention (e.g., a `Co-Authored-By` trailer), append it per that runtime's guidelines.
+
 #### Step 7c: Push and Create PR
 
 ```bash
@@ -351,12 +351,12 @@ Tested on {gpu_name}. Values are median ms (speed) or MB (memory).
 - [x] All existing unit tests pass
 - [x] Benchmarks show improvement across all input sizes
 - [x] No regression on non-target metrics (speed/memory balance maintained)
-
-🤖 Generated with [Claude Code](https://claude.com/claude-code)
 EOF
 )"
 ```
 
+If the host AI runtime appends a "Generated with" footer (e.g., Claude Code, Cursor, Copilot), include it per that runtime's guidelines.
+
 **Important**: Do NOT include plots as image attachments in the PR. Plots are for local review only and live in the optimization workspace.
 
 ### Step 8: Present the Before/After Summary
diff --git a/.claude/skills/liger-kernel-perf/optimization-strategies.md b/.agents/skills/liger-kernel-perf/optimization-strategies.md
similarity index 100%
rename from .claude/skills/liger-kernel-perf/optimization-strategies.md
rename to .agents/skills/liger-kernel-perf/optimization-strategies.md
diff --git a/.claude/skills/liger-kernel-perf/optimizer.md b/.agents/skills/liger-kernel-perf/optimizer.md
similarity index 100%
rename from .claude/skills/liger-kernel-perf/optimizer.md
rename to .agents/skills/liger-kernel-perf/optimizer.md
diff --git a/.claude/skills/liger-kernel-perf/profiler.md b/.agents/skills/liger-kernel-perf/profiler.md
similarity index 100%
rename from .claude/skills/liger-kernel-perf/profiler.md
rename to .agents/skills/liger-kernel-perf/profiler.md
diff --git a/.claude/skills/liger-kernel-perf/templates/optimization-profile.md b/.agents/skills/liger-kernel-perf/templates/optimization-profile.md
similarity index 100%
rename from .claude/skills/liger-kernel-perf/templates/optimization-profile.md
rename to .agents/skills/liger-kernel-perf/templates/optimization-profile.md
diff --git a/.claude/skills/liger-kernel-perf/templates/variant-notes.md b/.agents/skills/liger-kernel-perf/templates/variant-notes.md
similarity index 100%
rename from .claude/skills/liger-kernel-perf/templates/variant-notes.md
rename to .agents/skills/liger-kernel-perf/templates/variant-notes.md
diff --git a/.claude/skills b/.claude/skills
new file mode 120000
index 000000000..2b7a412b8
--- /dev/null
+++ b/.claude/skills
@@ -0,0 +1 @@
+../.agents/skills
\ No newline at end of file
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 000000000..cdc72f6f4
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,33 @@
+# AGENTS.md
+
+Guidance for AI coding assistants working in this repository.
+
+## Skills
+
+Repository-specific workflow guides ("skills") live in [`.agents/skills/`](.agents/skills/). Each subdirectory is a self-contained guide for a multi-stage workflow. Read the `SKILL.md` at the top of each subdirectory for an overview; the other files in that directory are referenced from `SKILL.md` and should be read on demand.
+
+| Skill | What it does |
+|-------|--------------|
+| [`liger-kernel-dev`](.agents/skills/liger-kernel-dev/SKILL.md) | Develops new Triton kernels from a PyTorch reference (or modifies existing kernels). 3-stage pipeline: Analyze → Generate → Validate. NVIDIA GPUs only. |
+| [`liger-autopatch`](.agents/skills/liger-autopatch/SKILL.md) | Adds Liger Kernel support for a new HuggingFace Transformers model, or modifies an existing monkey-patch. 3-stage pipeline: Analyze → Generate → Validate. |
+| [`liger-kernel-perf`](.agents/skills/liger-kernel-perf/SKILL.md) | Optimizes the performance of an existing Liger Triton kernel. 3-stage pipeline: Profile → Optimize → Finalize. NVIDIA GPUs only. |
+
+The skills are written to be runtime-agnostic — they describe the workflow as a sequence of stages a competent agent (or human) can follow. Where a stage says "Follow the X workflow in `x.md`", that's a directive to read and execute that file's instructions; runtimes that support parallel subagents may delegate the stage, but it is not required.
+
+## Vendor-specific shortcuts
+
+For convenience, some assistants auto-discover skills from vendor-specific paths. These point at the canonical `.agents/skills/` directory:
+
+- `.claude/skills` → symlink → `.agents/skills` (for Claude Code)
+
+If you're adding support for another assistant, add a symlink (or your tool's preferred adapter) pointing to `.agents/skills/`. Do not duplicate the content.
+
+## Repo conventions
+
+- Source layout: `src/liger_kernel/{ops,transformers}/` for Triton ops and `nn.Module` / HF wrappers respectively
+- Tests: `test/transformers/` (unit) and `test/convergence/{bf16,fp32}/` (model convergence)
+- Benchmarks: `benchmark/scripts/` (scripts) and `benchmark/data/all_benchmark_data.csv` (results)
+- Lint/format: `make checkstyle` (uses `ruff`)
+- Install dev mode: `pip install -e ".[dev]"`
+
+See `README.md` for the project overview and contribution guide.
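The symlink arrangement described under "Vendor-specific shortcuts" can be sketched in Python as follows. This is only a possible approach and not part of the patch: `link_vendor_skills` is a hypothetical helper, and any vendor directory other than `.claude` is an assumption about how another tool might discover skills.

```python
import os
from pathlib import Path


def link_vendor_skills(repo_root: Path, vendor_dir: str) -> Path:
    """Create `<repo_root>/<vendor_dir>/skills` as a relative symlink to `.agents/skills`.

    Hypothetical helper: `vendor_dir` is a tool-specific directory such as
    ".claude". Returns the path of the link.
    """
    canonical = repo_root / ".agents" / "skills"
    if not canonical.is_dir():
        raise FileNotFoundError(f"canonical skills directory missing: {canonical}")

    link = repo_root / vendor_dir / "skills"
    link.parent.mkdir(parents=True, exist_ok=True)

    # Relative target ("../.agents/skills"), so the link survives being
    # cloned to a different location.
    target = os.path.relpath(canonical, start=link.parent)

    if link.is_symlink():
        if os.readlink(link) == target:
            return link  # already points at the canonical directory
        raise FileExistsError(f"{link} points elsewhere: {os.readlink(link)}")
    if link.exists():
        # A real directory here would mean duplicated content; refuse.
        raise FileExistsError(f"{link} exists and is not a symlink")

    link.symlink_to(target, target_is_directory=True)
    return link
```

For `.claude`, the computed target is `../.agents/skills`, the same string this patch stores in the mode `120000` entry. The sketch refuses to overwrite a real directory so that the skill content is never duplicated.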