[Question] Is PCR deduplication (MarkDuplicates) required for matched DNA BAM files? #63

@baiyin051

Hi REDItools developers,

I am currently using REDItools (REDItools3) to detect RNA editing events by analyzing RNA-seq data alongside matched DNA (WGS/WES) data. My goal is to use the DNA data to annotate and filter out genomic variants (SNPs) from the RNA editing results.
I have reviewed the official documentation and README. They specify requirements for mapping quality and base quality, but they do not state whether the input DNA BAM files should undergo PCR deduplication (e.g., using Picard MarkDuplicates or samtools markdup).

My question is:
Is it recommended, or even required, to remove PCR duplicates from the DNA BAM files before running REDItools?
My concern is that if PCR duplicates are retained, they might artificially inflate the read counts (gBaseCount) and allele frequencies (gFrequency) in the DNA data, potentially affecting the accuracy when distinguishing between true RNA editing events and genomic SNPs.
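To make the concern concrete, here is a toy sketch in Python. The reads, the fragment IDs, and the `alt_frequency` helper are all hypothetical and only illustrate the arithmetic. This is not how REDItools computes gBaseCount or gFrequency, and real duplicate marking works on alignment coordinates rather than on an ID like this.

```python
from collections import Counter

# Hypothetical reads covering one genomic position: (fragment_id, base).
# Reads sharing a fragment_id stand in for PCR duplicates of one molecule.
reads = [
    ("f1", "A"), ("f2", "A"), ("f3", "A"), ("f4", "A"),
    ("f5", "G"), ("f5", "G"), ("f5", "G"), ("f5", "G"),
]

def alt_frequency(bases, ref="A"):
    """Fraction of bases that differ from the reference base."""
    counts = Counter(bases)
    total = sum(counts.values())
    return (total - counts[ref]) / total

# All reads counted, duplicates included: 4 of 8 bases are non-reference.
with_dups = alt_frequency([base for _, base in reads])

# One read kept per fragment: 1 of 5 bases is non-reference.
deduped = alt_frequency(list({fid: base for fid, base in reads}.values()))

print(with_dups)  # 0.5
print(deduped)    # 0.2
```

In this made-up case a single duplicated fragment raises the apparent non-reference frequency in the DNA from 0.2 to 0.5, which is the kind of distortion I am worried about.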
Could you please clarify the best practice for preparing the DNA BAM input?
Thank you for your time and for developing this great tool!
