only using target question #992
Replies: 6 comments
-
And the BAM file I used has already undergone GATK's BQSR and had duplicates marked.
-
I tried adding an antitarget, but the changes in reference.cnn are not significant. The average log2 value of the bins where log2 is greater than 0 is still around 0.05, while the average log2 value of the bins where log2 is less than 0 is around -0.13. Should I discard the reference built from only 9 unmatched normal samples and use a flat reference (flat.cnn) instead?
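Those averages can be reproduced with a small helper along these lines (my own sketch, not part of CNVkit; it assumes reference.cnn is the usual tab-separated table with a log2 column):

```python
import csv

def log2_skew(path):
    """Average log2 of reference bins above vs. below zero.
    (My own helper, not part of CNVkit.)"""
    pos, neg = [], []
    with open(path) as fh:
        for row in csv.DictReader(fh, delimiter="\t"):
            val = float(row["log2"])
            (pos if val > 0 else neg).append(val)
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(pos), mean(neg)
```

This is roughly how I arrived at the ~0.05 and ~-0.13 figures above.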
-
Here's a starting point in the docs: The skew in log2 values is not surprising; there tend to be more genomic regions with poor sequencing coverage than with abnormally high sequencing coverage. The "--drop-low-coverage" option may help with that. There are a lot of other reasons why coverage depth can vary across the genome; reference log2 values drifting away from 0.0 is not necessarily incorrect if the bias is consistent across sequenced samples and not due to real copy number variation in the control samples. You may also try masking out problematic genomic regions with
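If it helps, the combined effect of dropping low-coverage bins and masking problem regions can be sketched in a few lines. This is a conceptual stand-in, not CNVkit's actual implementation; the depth cutoff, dict layout, and the `filter_bins`/`overlaps` helpers are all illustrative:

```python
def overlaps(bin_, region):
    """True if a bin overlaps an excluded (chrom, start, end) region."""
    chrom, start, end = region
    return bin_["chrom"] == chrom and bin_["start"] < end and start < bin_["end"]

def filter_bins(bins, min_depth=1.0, exclude=()):
    """Drop bins below a depth cutoff or overlapping excluded regions.
    Conceptual stand-in for --drop-low-coverage plus region masking;
    the cutoff and data layout are assumptions, not CNVkit internals."""
    return [b for b in bins
            if b["depth"] >= min_depth
            and not any(overlaps(b, r) for r in exclude)]
```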
-
Thank you very much for your reply. I think I have successfully processed the BED file manually after autobin, and I have made sure the average fragment depth is greater than 150x.
-
Tough to say. I'd recommend trying it both ways and reviewing the results to see whether either approach is clearly better for your data/assay. Many labs doing target enrichment (with baits, not TAS) will skip antitargets if they have very high on-target rates, since the remaining off-target reads are so thin that the antitarget bins would have to be huge. But it's complex, so take an empirical approach. Generally, if you have stable results from CNVkit they should work well with PureCN, but I can't speak to the details.

-
Hello, and thank you to you and your team for developing this software. I currently have an issue that I haven't found an answer to elsewhere.
My data is NGS sequencing data from a hybrid-capture target panel, with only 9 unmatched normal samples. After using CNVkit to split the original BED file for the normal samples, I deleted some segments based on their depth and sequencing quality (for example, if a BED segment was divided into three parts, I might have deleted only the middle part due to sequencing quality issues). I then rebuilt the reference from the BED file with those segments removed and proceeded with the subsequent steps. However, in the resulting reference, the average of the log2 ratios greater than 0 is only 0.04075515, while the average of the log2 ratios less than 0 is -0.1360878. I believe this leads to very low log2 > 0 values when analyzing tumor samples, which prevents me from setting a threshold to obtain amplification calls.
It's worth noting that I only used targets and did not use antitargets. My reasoning is that the off-target regions contain many mapping failures and poor-quality segments. Could you please advise whether I should still use antitargets? Or should I correct the log2 ratios in the reference?
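One workaround I am considering (my own sketch, not something from the CNVkit docs) is to re-center the tumor log2 ratios on their median before applying an amplification threshold; if I understand correctly, cnvkit.py call also exposes a --center option that does something similar:

```python
import statistics

def recenter(log2_values):
    """Shift log2 ratios so their median sits at 0 before thresholding.
    (My own sketch; not CNVkit's implementation.)"""
    med = statistics.median(log2_values)
    return [v - med for v in log2_values]
```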
Best regards,