[Question] Is PCR deduplication (MarkDuplicates) required for matched DNA BAM files? #63

Hi REDItools developers,

I am using REDItools (REDItools3) to detect RNA editing events from RNA-seq data together with matched DNA (WGS/WES) data. My goal is to use the DNA data to annotate and filter out genomic variants (SNPs) from the RNA editing results.
I have reviewed the official documentation/README, and while it specifies requirements for mapping quality and base quality, it does not explicitly state whether the input DNA BAM files should undergo PCR deduplication (e.g., using `picard MarkDuplicates` or `samtools markdup`).

**My question is:**
Is it required, or at least recommended, to mark or remove PCR duplicates in the DNA BAM files before running REDItools?
My concern is that retained PCR duplicates could artificially inflate the read counts (`gBaseCount`) and skew the allele frequencies (`gFrequency`) in the DNA data, making it harder to distinguish true RNA editing events from genomic SNPs.
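
To make the concern concrete, here is a toy calculation in Python. The read counts are made up for illustration, and `allele_frequency` is a hypothetical helper, not a REDItools function; I am not assuming this is how `gFrequency` is actually computed.

```python
def allele_frequency(alt_reads: int, ref_reads: int) -> float:
    """Fraction of reads supporting the alternate base at one site."""
    total = alt_reads + ref_reads
    return alt_reads / total if total else 0.0


# Hypothetical DNA site: 2 unique ALT fragments and 18 unique REF fragments.
unique = allele_frequency(alt_reads=2, ref_reads=18)

# Same site, but one ALT fragment was PCR-amplified into 5 copies,
# so the pileup shows 6 ALT reads instead of 2.
with_duplicates = allele_frequency(alt_reads=6, ref_reads=18)

print(f"deduplicated: {unique:.2f}, with duplicates: {with_duplicates:.2f}")
```

In this made-up case the apparent DNA allele frequency moves from 0.10 to 0.25 purely because of duplicates, which could change whether a site passes a frequency-based SNP filter.
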
Could you please clarify the best practice for preparing the DNA BAM input?
Thank you for your time and for developing this great tool!
