Document or update split_vds_by_strata to more accurately reflect behavior #683

@mike-w-wilson

Description

While working on v4.0, we created the gnomad_methods function split_vds_by_strata, which splits a VDS based on an expression. The desired behavior was to split the VDS while maintaining all alleles in each subset. That is not what happens: the function uses Hail's vds.filter_samples, which unexpectedly removes every variant that is not present in the filtered sample subset, even though the remove_dead_alleles argument is left as False.

As it stands, our function does not state whether it maintains or removes dead alleles, only that it splits the VDS. We should consider updating the function so that removing or keeping dead alleles/variants is an explicit option, and documenting that behavior.
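
To make the proposed option concrete, here is a minimal pure-Python sketch of the two behaviors. It is a toy model and not Hail or gnomad_methods code: the sparse dataset is a plain dict, and the function name `split_by_strata` and the flag `drop_absent_variants` are hypothetical names chosen for illustration. With the flag on, variants with no calls in a stratum disappear from that subset (the behavior observed above); with it off, they are kept as empty rows.

```python
def split_by_strata(entries, strata, drop_absent_variants=True):
    """Toy stand-in for splitting a sparse variant dataset by sample strata.

    entries: dict mapping variant -> {sample: genotype}; only non-reference
             calls are stored, mimicking a sparse representation.
    strata:  dict mapping sample -> stratum label.
    drop_absent_variants: hypothetical flag. If True, a variant with no
             calls among a stratum's samples is dropped from that subset.
    """
    result = {}
    for label in set(strata.values()):
        subset = {}
        for variant, calls in entries.items():
            # Keep only the calls belonging to samples in this stratum.
            kept = {s: gt for s, gt in calls.items() if strata.get(s) == label}
            if kept or not drop_absent_variants:
                subset[variant] = kept
        result[label] = subset
    return result


# Two samples in different strata, each carrying one private variant.
entries = {
    "chr1:100:A:T": {"s1": "0/1"},
    "chr1:200:G:C": {"s2": "1/1"},
}
strata = {"s1": "XX", "s2": "XY"}

dropped = split_by_strata(entries, strata)
kept_all = split_by_strata(entries, strata, drop_absent_variants=False)
```

In this toy example, `dropped["XX"]` contains only the variant carried by `s1`, while `kept_all["XX"]` contains both variants, the second with no calls. Documenting which of these two outcomes split_vds_by_strata produces is the minimum this issue asks for.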
