Hi REDItools developers,
I am currently using REDItools (REDItools /3) to detect RNA editing events by analyzing RNA-seq data alongside matched DNA (WGS/WES) data. My goal is to use the DNA data to annotate and filter out genomic variants (SNPs) from the RNA editing results.
I have reviewed the official documentation/README. It specifies requirements for mapping quality and base quality, but it does not state whether PCR duplicates should be marked or removed from the input DNA BAM files (e.g., with Picard MarkDuplicates or samtools markdup).
My question is:
Is it recommended (or required) to remove PCR duplicates from the DNA BAM files before running REDItools?
My concern is that if PCR duplicates are retained, they might artificially inflate the read counts (gBaseCount) and allele frequencies (gFrequency) in the DNA data, potentially affecting the accuracy when distinguishing between true RNA editing events and genomic SNPs.
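To make the concern concrete, here is a toy sketch in plain Python. It is not REDItools code and does not reproduce how gBaseCount or gFrequency are actually computed. The pileup, the fragment ids, and the helper names `allele_frequency` and `drop_duplicates` are all hypothetical, and real duplicate marking works on alignment coordinates and orientation rather than on a fragment id.

```python
from collections import Counter

def allele_frequency(reads, ref_base):
    """Fraction of reads at a site whose base differs from the reference."""
    counts = Counter(base for _, base in reads)
    total = sum(counts.values())
    non_ref = total - counts[ref_base]
    return non_ref / total if total else 0.0

def drop_duplicates(reads):
    """Keep the first read per fragment id (a stand-in for duplicate removal)."""
    seen = set()
    kept = []
    for fragment_id, base in reads:
        if fragment_id not in seen:
            seen.add(fragment_id)
            kept.append((fragment_id, base))
    return kept

# Hypothetical DNA pileup at one site: (fragment_id, observed_base).
# Fragment f5 carries a G and was amplified into four copies.
reads = [
    ("f1", "A"), ("f2", "A"), ("f3", "A"), ("f4", "A"),
    ("f5", "G"), ("f5", "G"), ("f5", "G"), ("f5", "G"),
]

raw_freq = allele_frequency(reads, ref_base="A")
dedup_freq = allele_frequency(drop_duplicates(reads), ref_base="A")

print(f"with duplicates:    {raw_freq:.2f}")
print(f"without duplicates: {dedup_freq:.2f}")
```

In this made-up case a single amplified fragment moves the non-reference fraction from 1 in 5 reads to 4 in 8, which is the kind of shift I am worried about when the DNA data is used to decide whether a site is a SNP.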
Could you please clarify the best practice for preparing the DNA BAM input?
Thank you for your time and for developing this great tool!