sldsc_enrichment: document snplist vs no-snplist path; add --target-categories-label; fix MWE #1322

Merged
gaow merged 2 commits into StatFunGen:main from al4225:sldsc-enrichment-mwe-docs, May 13, 2026

al4225 commented May 12, 2026

Summary

Documentation + small interface additions to code/enrichment/sldsc_enrichment.ipynb (pairs with StatFunGen/pecotmr#489).

Documentation

  • [make_annotation_files_ldscore] — explain the snplist vs no-snplist split: with --snp_list the step routes to ldsc.py --l2 --print-snps (output .l2.ldscore.gz); without it the step routes to compute_ldscores.py (output .l2.ldscore.parquet). This is the root cause of the L2_0 vs ANNOT_0 Category names in .results: ldsc.py hard-codes the LD-score column to "L2" when there is exactly one annotation (dropping the annot's column name), while compute_ldscores.py always keeps it. Also note the snplist-only requirement that the target annot carry a CM column and line up row-for-row with the plink .bim.
  • [get_heritability] — explain the .results Category _0 / _1 suffix: each polyfun call gets exactly two --ref-ld-chr sources (target = index 0, baseline = index 1), so target annots come out as ANNOT_0 / L2_0 / ANNOT_1_0… and baseline as base_1, Coding_UCSC_1, … --target-categories (if passed explicitly) must match this exactly.
  • [postprocess] / section 3 markdown — document the new --target-categories-label.
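The naming rules described in the two bullets above can be sketched in a few lines of Python. This is a toy model only, not code from ldsc.py, compute_ldscores.py, or polyfun: both function names are hypothetical, and the behaviour for more than one annotation on the --snp_list path (names kept) is an assumption based on the ANNOT_1_0 / ANNOT_2_0 examples in this PR.

```python
def ldscore_column_names(annot_columns, use_snp_list):
    """Toy model of the LD-score column names each path writes.

    Hypothetical helper: mirrors only the behaviour described in this PR.
    """
    # --snp_list path (ldsc.py): a single annotation loses its name and
    # becomes the hard-coded "L2".
    if use_snp_list and len(annot_columns) == 1:
        return ["L2"]
    # No --snp_list (compute_ldscores.py), or several annotations
    # (assumption): the original column names are kept.
    return list(annot_columns)


def results_categories(ref_ld_sources):
    """Suffix each column with the position of its --ref-ld-chr source."""
    return [
        f"{column}_{index}"
        for index, columns in enumerate(ref_ld_sources)
        for column in columns
    ]


baseline = ["base", "Coding_UCSC"]  # baseline is always source index 1

with_snplist = results_categories(
    [ldscore_column_names(["ANNOT"], use_snp_list=True), baseline]
)
without_snplist = results_categories(
    [ldscore_column_names(["ANNOT"], use_snp_list=False), baseline]
)
joint = results_categories(
    [ldscore_column_names(["ANNOT_1", "ANNOT_2"], use_snp_list=False), baseline]
)
```

Under this model the same single annotation appears as L2_0 on the --snp_list path and ANNOT_0 otherwise, which is why an explicit --target-categories has to be written against the names that actually occur in .results.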

Interface

  • [postprocess] — new --target-categories-label parameter (same order as --target-categories), passed through to pecotmr::sldsc_postprocessing_pipeline(target_labels=). Renames the target columns / tau*-block names in the output RDS to friendly names (e.g. ANNOT_1_0 ANNOT_2_0 → quantile_eQTL eQTL); originals are kept in params$target_categories_orig. Omit it to keep the polyfun .results names, which is the unchanged default behaviour.
  • Dropped the dead parameter: Mref = -1 from [global] (no longer read by anything).
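The order-based pairing between --target-categories and --target-categories-label can be illustrated with a small Python sketch. It is not the pecotmr implementation (which is R); relabel_targets is a hypothetical name, the length check is an assumption, and what the default path stores under target_categories_orig is not stated in this PR, so the sketch simply omits it there.

```python
def relabel_targets(target_categories, target_labels=None):
    """Pair labels with categories by position and keep the originals.

    Hypothetical illustration of the behaviour described in this PR.
    """
    if target_labels is None:
        # Default path: polyfun .results names are kept unchanged.
        return {"target_categories": list(target_categories)}
    if len(target_labels) != len(target_categories):
        # Assumption: a length mismatch is treated as an error.
        raise ValueError("labels must match target categories one-to-one")
    return {
        "target_categories": list(target_labels),
        "target_categories_orig": list(target_categories),
    }


relabelled = relabel_targets(
    ["ANNOT_1_0", "ANNOT_2_0"], ["quantile_eQTL", "eQTL"]
)
default = relabel_targets(["ANNOT_1_0", "ANNOT_2_0"])
```

Because the pairing is purely positional, swapping the order of the labels without swapping the categories would silently attach the wrong friendly name to each column.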

MWE fixes

  • Corrected output paths in the post-processing MWE cell (output/heritability, output/my_analysis_joint), changed the illustrative --target-categories anno1 anno2 to real .results Category names (ANNOT_1_0 ANNOT_2_0), added an allm note (--maf-cutoff 0).

Testing

Ran the last steps of the updated_pipeline_by_gao/test MWE (postprocess + meta_subset) against this notebook + the pecotmr PR branch, with and without --target-categories-label: relabel works (params$target_categories = quantile_eQTL, target_categories_orig = ANNOT_0, meta_subset inherits the label), default path unchanged, numeric values identical between the two.

🤖 Generated with Claude Code

al4225 and others added 2 commits May 12, 2026 15:54
…x MWE

- [make_annotation_files_ldscore]: explain --snp_list -> ldsc.py --l2 --print-snps
  (output .l2.ldscore.gz) vs no --snp_list -> compute_ldscores.py (.l2.ldscore.parquet);
  document the root cause of "L2_0" vs "ANNOT_0" in .results (ldsc.py hard-codes the
  LD-score column name to "L2" when n_annot == 1, dropping the annot's original column
  name; compute_ldscores.py always keeps it); note the CM-column requirement for
  --snp_list (ldsc.py --l2 --print-snps needs CM and .bim-aligned annot; handled by
  normalize_for_ldsc; .bim must carry cM positions when using --ld-wind-cm).
- [get_heritability]: document the ".results" Category "_0 / _1" suffix = position of
  the LD-score source in --ref-ld-chr (target = 0, baseline = 1), incl. joint-dir cases.
- MWE markdown ("1.2 joint tau"): explain --snp_list is optional and orthogonal to
  single/joint, does not change LD-score values, and downstream commands are unchanged.
- MWE postprocess cell: fix path inconsistencies (output/polyfun/heritability ->
  output/heritability; output/polyfun/ldscores/my_analysis_joint -> output/my_analysis_joint);
  replace placeholder "--target-categories anno1 anno2" with the real names
  (ANNOT_1_0 ANNOT_2_0) + a note on where they come from / that auto-detect works.
- get_heritability & postprocess MWE: add "allm variant: --maf-cutoff 0" comment.
- [global]: remove the dead, never-wired `Mref` parameter (M_ref is auto-computed from
  the .frq files by compute_sldsc_M_ref).

Doc/MWE only — no workflow logic changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…endly target names

Pass-through for pecotmr::sldsc_postprocessing_pipeline(..., target_labels=).
When --target-categories-label is given (same order as --target-categories), the
"target" column / tau*-block column names in the output RDS are renamed to those
labels (params$target_categories holds the labels, params$target_categories_orig
the original polyfun .results names). Omit it to keep the original names — no
behaviour change otherwise. Updated the postprocess MWE, header, and the
[meta_subset] note (the cached RDS already carries the labels).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@gaow gaow merged commit d791d17 into StatFunGen:main May 13, 2026