sldsc_enrichment: document snplist vs no-snplist path; add --target-categories-label; fix MWE #1322
Merged
…x MWE
- [make_annotation_files_ldscore]: explain --snp_list -> ldsc.py --l2 --print-snps
(output .l2.ldscore.gz) vs no --snp_list -> compute_ldscores.py (.l2.ldscore.parquet);
document the root cause of "L2_0" vs "ANNOT_0" in .results (ldsc.py hard-codes the
LD-score column name to "L2" when n_annot == 1, dropping the annot's original column
name; compute_ldscores.py always keeps it); note the CM-column requirement for
--snp_list (ldsc.py --l2 --print-snps needs CM and .bim-aligned annot; handled by
normalize_for_ldsc; .bim must carry cM positions when using --ld-wind-cm).
- [get_heritability]: document the ".results" Category "_0 / _1" suffix = position of
the LD-score source in --ref-ld-chr (target = 0, baseline = 1), incl. joint-dir cases.
- MWE markdown ("1.2 joint tau"): explain --snp_list is optional and orthogonal to
single/joint, does not change LD-score values, and downstream commands are unchanged.
- MWE postprocess cell: fix path inconsistencies (output/polyfun/heritability ->
output/heritability; output/polyfun/ldscores/my_analysis_joint -> output/my_analysis_joint);
replace placeholder "--target-categories anno1 anno2" with the real names
(ANNOT_1_0 ANNOT_2_0) + a note on where they come from / that auto-detect works.
- get_heritability & postprocess MWE: add "allm variant: --maf-cutoff 0" comment.
- [global]: remove the dead, never-wired `Mref` parameter (M_ref is auto-computed from
the .frq files by compute_sldsc_M_ref).
Doc/MWE only — no workflow logic changed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
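The two naming rules described in this commit message can be illustrated with a short Python sketch. This is not code from ldsc.py, compute_ldscores.py, or the notebook: the helper names `single_annot_ldscore_column` and `results_categories` are hypothetical, and only the single-annotation case described above is modelled.

```python
def single_annot_ldscore_column(annot_column, used_snp_list):
    """LD-score column name for a single-annotation target (hypothetical helper).

    With --snp_list the ldsc.py path names the column "L2";
    without it the compute_ldscores.py path keeps the annotation's own name.
    """
    return "L2" if used_snp_list else annot_column


def results_categories(ldscore_columns, source_position):
    """Append the position of the LD-score source in --ref-ld-chr.

    Per the commit message: target = 0, baseline = 1.
    """
    return [f"{column}_{source_position}" for column in ldscore_columns]


target_with_snplist = results_categories(
    [single_annot_ldscore_column("ANNOT", used_snp_list=True)], 0
)
target_without_snplist = results_categories(
    [single_annot_ldscore_column("ANNOT", used_snp_list=False)], 0
)
baseline = results_categories(["base", "Coding_UCSC"], 1)

print(target_with_snplist)     # ['L2_0']
print(target_without_snplist)  # ['ANNOT_0']
print(baseline)                # ['base_1', 'Coding_UCSC_1']
```

The same annotation therefore shows up under two different Category names depending only on whether `--snp_list` was used, which is why an explicit `--target-categories` value has to be copied from the actual `.results` file.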
…endly target names
Pass-through for pecotmr::sldsc_postprocessing_pipeline(..., target_labels=). When --target-categories-label is given (same order as --target-categories), the "target" column / tau*-block column names in the output RDS are renamed to those labels (params$target_categories holds the labels, params$target_categories_orig the original polyfun .results names). Omit it to keep the original names; there is no behaviour change otherwise.
Updated the postprocess MWE, header, and the [meta_subset] note (the cached RDS already carries the labels).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
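As a rough illustration of the order-based relabelling described in this commit, here is a minimal Python sketch. The real logic lives in the R function pecotmr::sldsc_postprocessing_pipeline; the helper name `target_rename_map` is hypothetical, and the length check is an assumption rather than documented behaviour. The example names are the ones used in this pull request.

```python
def target_rename_map(target_categories, target_labels=None):
    """Map original .results Category names to friendly labels (hypothetical helper).

    Labels are matched to categories by position. When no labels are given,
    every name maps to itself, i.e. nothing is renamed.
    """
    if target_labels is None:
        return {name: name for name in target_categories}
    if len(target_labels) != len(target_categories):
        # Assumption: a length mismatch is treated as an error in this sketch.
        raise ValueError("labels must be given in the same order and number as the categories")
    return dict(zip(target_categories, target_labels))


relabelled = target_rename_map(["ANNOT_1_0", "ANNOT_2_0"], ["quantile_eQTL", "eQTL"])
unchanged = target_rename_map(["ANNOT_1_0", "ANNOT_2_0"])

print(relabelled)  # {'ANNOT_1_0': 'quantile_eQTL', 'ANNOT_2_0': 'eQTL'}
print(unchanged)   # {'ANNOT_1_0': 'ANNOT_1_0', 'ANNOT_2_0': 'ANNOT_2_0'}
```

Because the mapping is purely positional, swapping the order of the labels without swapping the categories would silently attach the wrong friendly name to each target.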
Summary
Documentation + small interface additions to `code/enrichment/sldsc_enrichment.ipynb` (pairs with StatFunGen/pecotmr#489).
Documentation
- `[make_annotation_files_ldscore]`: explain the snplist vs no-snplist split. With `--snp_list` the step routes to `ldsc.py --l2 --print-snps` (output `.l2.ldscore.gz`); without it the step routes to `compute_ldscores.py` (output `.l2.ldscore.parquet`). Root cause of the `L2_0` vs `ANNOT_0` Category name in `.results`: `ldsc.py` hard-codes the LD-score column to `"L2"` when there is exactly one annotation (dropping the annot's column name), while `compute_ldscores.py` always keeps it. Also note the snplist-only requirement that the target annot carry a `CM` column and line up row-for-row with the plink `.bim`.
- `[get_heritability]`: explain the `.results` Category `_0` / `_1` suffix. Each polyfun call gets exactly two `--ref-ld-chr` sources (target = index 0, baseline = index 1), so target annots come out as `ANNOT_0` / `L2_0` / `ANNOT_1_0` … and baseline as `base_1`, `Coding_UCSC_1`, … `--target-categories` (if passed explicitly) must match this exactly.
- `[postprocess]` / section 3 markdown: document the new `--target-categories-label`.
Interface
- `[postprocess]`: new `--target-categories-label` parameter (same order as `--target-categories`), passed through to `pecotmr::sldsc_postprocessing_pipeline(target_labels=)`. Renames the `target` columns / `tau*`-block names in the output RDS to friendly names (e.g. `ANNOT_1_0 ANNOT_2_0` → `quantile_eQTL eQTL`); originals are kept in `params$target_categories_orig`. Omit it to keep the polyfun `.results` names (unchanged default behaviour).
- Removed the dead `parameter: Mref = -1` from `[global]` (no longer read by anything).
MWE fixes
- Fixed path inconsistencies in the postprocess cell (now `output/heritability`, `output/my_analysis_joint`), changed the illustrative `--target-categories anno1 anno2` to real `.results` Category names (`ANNOT_1_0 ANNOT_2_0`), and added an `allm` note (`--maf-cutoff 0`).
Testing
Ran the last steps of the updated_pipeline_by_gao/test MWE (`postprocess` + `meta_subset`) against this notebook + the pecotmr PR branch, with and without `--target-categories-label`: relabel works (`params$target_categories = quantile_eQTL`, `target_categories_orig = ANNOT_0`, `meta_subset` inherits the label), default path unchanged, numeric values identical between the two.
🤖 Generated with Claude Code