sldsc: fix compute_sldsc_M_ref reference panel; add target_labels relabeling by al4225 · Pull Request #489 · StatFunGen/pecotmr

al4225 · 2026-05-12T20:17:18Z

Summary

Two changes to the sLDSC post-processing in R/sldsc_wrapper.R:

1. `compute_sldsc_M_ref` — count the reference panel, not the regression set

M_ref is the reference-panel SNP count over which partitioned heritability is defined:
h²(C) = Σ_{j∈M_ref} a_C(j)·Σ_{C'} τ_{C'}·a_{C'}(j). It is panel-defined (it matches polyfun's .l2.M for allm / .l2.M_5_50 for m50), not the regression / HapMap3 SNP set (~1M).

Previously, under maf_cutoff == 0 the function counted the .l2.ldscore rows of target_anno_dir. For snplist runs that directory holds the HM3 regression set (~1M), not the reference panel (~8M) — so allm + snplist runs reported a ~8× too-small M_ref, and therefore a ~8× too-small τ* / EnrichStat (e.g. meta τ* 0.018 instead of 0.137).

Now it counts the reference panel from the .frq files:

- maf_cutoff == 0 → all rows (matches polyfun .l2.M)
- maf_cutoff > 0 → rows with MAF > cutoff (matches polyfun .l2.M_5_50)

target_anno_dir is kept only as a fallback (with a warning) when no .frq dir is supplied.
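The counting rule above can be sketched as follows. This is an illustrative Python sketch, not the R code in R/sldsc_wrapper.R: the function name `count_reference_snps` is hypothetical, and it assumes PLINK-style whitespace-delimited .frq files with a header row containing a `MAF` column.

```python
import gzip
from pathlib import Path


def count_reference_snps(frq_paths, maf_cutoff=0.0):
    """Count reference-panel SNPs across PLINK-style .frq files (hypothetical helper).

    maf_cutoff == 0 counts every data row; maf_cutoff > 0 counts rows with MAF > cutoff.
    """
    total = 0
    for path in map(Path, frq_paths):
        opener = gzip.open if path.suffix == ".gz" else open
        with opener(path, "rt") as handle:
            header = handle.readline().split()
            maf_col = header.index("MAF")  # assumes a PLINK .frq header row
            for line in handle:
                fields = line.split()
                if not fields:
                    continue  # skip blank lines
                if maf_cutoff == 0:
                    total += 1  # all rows count toward M_ref
                    continue
                try:
                    maf = float(fields[maf_col])
                except ValueError:
                    continue  # e.g. "NA" cannot pass a MAF filter
                if maf > maf_cutoff:
                    total += 1
    return total
```

Calling `count_reference_snps(paths)` corresponds to the allm case, and `count_reference_snps(paths, maf_cutoff=0.05)` to the m50 case, under the assumptions stated above.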

Enrichment is unaffected (polyfun computes it independently of M_ref); only τ* / EnrichStat standardization changes. m50 / m50_snplist runs were already correct.
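To see why only the standardized quantities move, here is a small Python sketch of the usual S-LDSC standardization, τ* = τ · sd(annotation) · M_ref / h²g. That pecotmr applies exactly this form is an assumption; the function name and the values of `tau`, `annot_sd` and `h2g` are made up for illustration. Only the two M_ref values come from the ADSP_allm_snplist run reported in this PR.

```python
def standardized_tau(tau, annot_sd, m_ref, h2g):
    """Standardized effect size: tau * sd(annotation) * M_ref / h2g (assumed form)."""
    return tau * annot_sd * m_ref / h2g


# M_ref before and after the fix, as reported for ADSP_allm_snplist.
m_ref_old, m_ref_new = 1.04e6, 8.13e6

# Hypothetical inputs; they cancel in the ratio below.
tau, annot_sd, h2g = 1e-8, 0.3, 0.2

old = standardized_tau(tau, annot_sd, m_ref_old, h2g)
new = standardized_tau(tau, annot_sd, m_ref_new, h2g)
ratio = new / old  # equals m_ref_new / m_ref_old, since tau* is linear in M_ref
print(round(ratio, 2))
```

Because τ* is linear in M_ref, the ratio is simply 8.13M / 1.04M, which is in line with the reported change in meta τ* from 0.018 to 0.137 once rounding of the smaller value is taken into account.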

2. `sldsc_postprocessing_pipeline` — new optional `target_labels` argument

When given (same length & order as the resolved target_categories), every target column and tau*-block column name in the output is renamed to those labels; params$target_categories then holds the labels and params$target_categories_orig keeps the original polyfun .results names (ANNOT_0, L2_0, …). When NULL (default) nothing is renamed — original behaviour. Lets downstream output read e.g. quantile_eQTL instead of ANNOT_0.
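The renaming behaviour described above could look roughly like this. It is a Python sketch rather than the package's R code: `relabel_targets` is a hypothetical name, and it assumes the columns to rename are exactly those whose names equal an original category name, which may not match how the tau*-block columns are actually named.

```python
def relabel_targets(columns, params, target_labels=None):
    """Return (columns, params) with target category names swapped for labels."""
    if target_labels is None:
        # Default: nothing is renamed and no target_categories_orig is added.
        return list(columns), dict(params)

    originals = list(params["target_categories"])
    if len(target_labels) != len(originals):
        raise ValueError("target_labels must match target_categories in length")

    # Labels are matched to categories by position (same order).
    mapping = dict(zip(originals, target_labels))
    renamed = [mapping.get(name, name) for name in columns]

    new_params = dict(params)
    new_params["target_categories"] = list(target_labels)
    new_params["target_categories_orig"] = originals
    return renamed, new_params
```

For example, `relabel_targets(["trait", "ANNOT_0"], {"target_categories": ["ANNOT_0"]}, ["quantile_eQTL"])` returns the columns `["trait", "quantile_eQTL"]` and keeps `ANNOT_0` under `target_categories_orig`.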

Testing

End-to-end on the updated_pipeline_by_gao/test MWE (postprocess + meta_subset), both with and without --target-categories-label:

- with label → params$target_categories = quantile_eQTL, target_categories_orig = ANNOT_0, all target columns relabeled, meta_subset inherits the label;
- without label → unchanged (ANNOT_0, no target_categories_orig);
- numeric values (enrichment, τ*) are identical between the two; the relabel is cosmetic only.

M_ref fix verified on ADSP_allm_snplist: M_ref 1.04M → 8.13M, meta τ* 0.018 → 0.137, Enrichment unchanged.

man/compute_sldsc_M_ref.Rd and man/sldsc_postprocessing_pipeline.Rd regenerated via roxygen2.

🤖 Generated with Claude Code

…abeling

compute_sldsc_M_ref: M_ref is the REFERENCE-PANEL SNP count over which heritability is partitioned (h2(C) = sum_{j in M_ref} a_C(j) sum_{C'} tau_{C'} a_{C'}(j)). It is panel-defined, not the regression/HapMap3 SNP set.

Previously, under maf_cutoff == 0 it counted the .l2.ldscore rows of target_anno_dir, which is the HM3 regression set (~1M) for snplist runs instead of the reference panel (~8M), making "allm + snplist" runs report a ~8x too-small M_ref and hence a ~8x too-small tau*.

Now it counts the reference panel from the .frq files: all rows when maf_cutoff == 0 (matches polyfun's .l2.M), MAF > cutoff rows when maf_cutoff > 0 (matches polyfun's .l2.M_5_50). target_anno_dir is kept only as a fallback (with a warning) when no .frq dir is given.

Enrichment is unchanged (polyfun computes it without M_ref); only tau* / EnrichStat standardization is affected. m50 / m50_snplist were already correct.

sldsc_postprocessing_pipeline: new optional `target_labels` argument. When given (same length & order as the resolved target_categories), every "target" column and tau*-block column name in the output is renamed to those labels; params$target_categories then holds the labels and params$target_categories_orig keeps the original polyfun .results names. When NULL (default) nothing is renamed, which is the original behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

al4225 mentioned this pull request May 12, 2026

sldsc_enrichment: document snplist vs no-snplist path; add --target-categories-label; fix MWE StatFunGen/xqtl-protocol#1322

Merged

gaow merged commit 177735b into StatFunGen:main May 13, 2026
5 checks passed

gaow merged 1 commit into StatFunGen:main from al4225:sldsc-mref-fix-and-target-labels
