sldsc_enrichment: document snplist vs no-snplist path; add --target-categories-label; fix MWE by al4225 · Pull Request #1322 · StatFunGen/xqtl-protocol

al4225 · 2026-05-12T20:17:44Z

Summary

Documentation + small interface additions to code/enrichment/sldsc_enrichment.ipynb (pairs with StatFunGen/pecotmr#489).

Documentation

[make_annotation_files_ldscore]: explain the snplist vs no-snplist split. With --snp_list the step runs ldsc.py --l2 --print-snps (output .l2.ldscore.gz); without it the step runs compute_ldscores.py (output .l2.ldscore.parquet). This split is the root cause of the L2_0 vs ANNOT_0 Category names in .results: ldsc.py hard-codes the LD-score column to "L2" when there is exactly one annotation (dropping the annot's column name), while compute_ldscores.py always keeps it. Also note the snplist-only requirement that the target annot carry a CM column and line up row-for-row with the plink .bim.
[get_heritability]: explain the _0 / _1 suffix on .results Category names. Each polyfun call gets exactly two --ref-ld-chr sources (target = index 0, baseline = index 1), so target annots come out as ANNOT_0 / L2_0 / ANNOT_1_0… and baseline annots as base_1, Coding_UCSC_1, … If --target-categories is passed explicitly, it must match these names exactly.
[postprocess] / section 3 markdown: document the new --target-categories-label.
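
The naming rules in the documentation items above can be summarised in a short sketch. This is an illustrative Python helper, not code from the notebook, ldsc.py, or polyfun: the function names `ldscore_suffix` and `results_category` are hypothetical, and the handling of several annotations on the --snp_list path is an assumption extrapolated from the single-annotation case described above.

```python
def ldscore_suffix(use_snp_list: bool) -> str:
    """File suffix written by each LD-score path, as described in this PR."""
    # --snp_list given  -> ldsc.py --l2 --print-snps
    # --snp_list absent -> compute_ldscores.py
    return ".l2.ldscore.gz" if use_snp_list else ".l2.ldscore.parquet"


def results_category(annot_names, use_snp_list, source_index):
    """Predict .results Category names for one --ref-ld-chr source.

    source_index is the position of the source in --ref-ld-chr:
    0 = target, 1 = baseline.
    """
    names = list(annot_names)
    # Per the PR: ldsc.py replaces the column name with "L2" when there is
    # exactly one annotation; compute_ldscores.py always keeps the name.
    # ASSUMPTION: with several annotations the names are kept on both paths.
    if use_snp_list and len(names) == 1:
        names = ["L2"]
    return [f"{name}_{source_index}" for name in names]


print(results_category(["ANNOT"], use_snp_list=False, source_index=0))  # ['ANNOT_0']
print(results_category(["ANNOT"], use_snp_list=True, source_index=0))   # ['L2_0']
print(results_category(["ANNOT_1", "ANNOT_2"], False, 0))  # ['ANNOT_1_0', 'ANNOT_2_0']
print(results_category(["base", "Coding_UCSC"], False, 1))  # ['base_1', 'Coding_UCSC_1']
```

The printed names are the ones a user would pass to --target-categories when not relying on auto-detection.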

Interface

[postprocess]: new --target-categories-label parameter (same order as --target-categories), passed through to pecotmr::sldsc_postprocessing_pipeline(target_labels=). It renames the target columns / tau*-block names in the output RDS to friendly names (e.g. ANNOT_1_0 ANNOT_2_0 → quantile_eQTL eQTL); the originals are kept in params$target_categories_orig. Omit it to keep the polyfun .results names, which is the unchanged default behaviour.
[global]: dropped the dead parameter Mref = -1 (no longer read by anything).
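
As a rough illustration of the relabelling semantics described above, the following Python sketch mimics the rename. It is not the pecotmr implementation (which is R and is not shown in this PR): `relabel_targets` is a hypothetical name, the plain dict stands in for the RDS `params` list, and the length check is an assumption about how a mismatch might be handled.

```python
def relabel_targets(columns, target_categories, target_labels=None):
    """Rename target columns to friendly labels, keeping the originals.

    Returns (renamed_columns, params).
    """
    if target_labels is None:
        # Default behaviour: keep the polyfun .results names untouched.
        return list(columns), {"target_categories": list(target_categories)}
    if len(target_labels) != len(target_categories):
        # ASSUMPTION: the real pipeline may validate this differently.
        raise ValueError("labels must pair one-to-one, in order, with the target categories")
    # Labels are matched to categories by position (same order).
    mapping = dict(zip(target_categories, target_labels))
    renamed = [mapping.get(col, col) for col in columns]
    params = {
        "target_categories": list(target_labels),
        "target_categories_orig": list(target_categories),
    }
    return renamed, params


cols = ["ANNOT_1_0", "ANNOT_2_0", "base_1"]
renamed, params = relabel_targets(
    cols, ["ANNOT_1_0", "ANNOT_2_0"], ["quantile_eQTL", "eQTL"]
)
print(renamed)  # ['quantile_eQTL', 'eQTL', 'base_1']
print(params["target_categories_orig"])  # ['ANNOT_1_0', 'ANNOT_2_0']
```

Only the names change in this sketch; any values attached to the columns would be left as they are, which matches the "numeric values identical" observation in Testing.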

MWE fixes

Corrected output paths in the post-processing MWE cell (output/heritability, output/my_analysis_joint); changed the illustrative --target-categories anno1 anno2 to real .results Category names (ANNOT_1_0 ANNOT_2_0); added a note on the allm variant (--maf-cutoff 0).

Testing

Ran the last steps of the updated_pipeline_by_gao/test MWE (postprocess + meta_subset) against this notebook and the pecotmr PR branch, with and without --target-categories-label. With the flag, relabelling works: params$target_categories = quantile_eQTL, target_categories_orig = ANNOT_0, and meta_subset inherits the label. Without it, the default path is unchanged. Numeric values are identical between the two runs.

🤖 Generated with Claude Code

…x MWE

- [make_annotation_files_ldscore]: explain --snp_list -> ldsc.py --l2 --print-snps (output .l2.ldscore.gz) vs no --snp_list -> compute_ldscores.py (.l2.ldscore.parquet); document the root cause of "L2_0" vs "ANNOT_0" in .results (ldsc.py hard-codes the LD-score column name to "L2" when n_annot == 1, dropping the annot's original column name; compute_ldscores.py always keeps it); note the CM-column requirement for --snp_list (ldsc.py --l2 --print-snps needs CM and .bim-aligned annot; handled by normalize_for_ldsc; .bim must carry cM positions when using --ld-wind-cm).
- [get_heritability]: document the ".results" Category "_0 / _1" suffix = position of the LD-score source in --ref-ld-chr (target = 0, baseline = 1), incl. joint-dir cases.
- MWE markdown ("1.2 joint tau"): explain --snp_list is optional and orthogonal to single/joint, does not change LD-score values, and downstream commands are unchanged.
- MWE postprocess cell: fix path inconsistencies (output/polyfun/heritability -> output/heritability; output/polyfun/ldscores/my_analysis_joint -> output/my_analysis_joint); replace placeholder "--target-categories anno1 anno2" with the real names (ANNOT_1_0 ANNOT_2_0) + a note on where they come from / that auto-detect works.
- get_heritability & postprocess MWE: add "allm variant: --maf-cutoff 0" comment.
- [global]: remove the dead, never-wired `Mref` parameter (M_ref is auto-computed from the .frq files by compute_sldsc_M_ref).

Doc/MWE only; no workflow logic changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…endly target names

Pass-through for pecotmr::sldsc_postprocessing_pipeline(..., target_labels=). When --target-categories-label is given (same order as --target-categories), the "target" column / tau*-block column names in the output RDS are renamed to those labels (params$target_categories holds the labels, params$target_categories_orig the original polyfun .results names). Omit it to keep the original names; no behaviour change otherwise. Updated the postprocess MWE, header, and the [meta_subset] note (the cached RDS already carries the labels).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

al4225 and others added 2 commits May 12, 2026 15:54

gaow merged commit d791d17 into StatFunGen:main May 13, 2026

gaow merged 2 commits into StatFunGen:main from al4225:sldsc-enrichment-mwe-docs
