Would anyone care to share their experience with variant calling in cancer genomics, using tumor-normal pairs to separate somatic from germline variants, especially indels?
I have been getting an unbelievably high number of "coding" germline indels after running the GATK Somatic Indel Detector on tumor-normal sample pairs. Even after fairly strict coverage filters on both normal and tumor, we get ~20-30 somatic coding small indels (which I can believe) but about 600 coding germline indels, ~50% of them frameshift!
These look convincingly germline when you check the coverage in the normal samples. I know this number cannot be real and am trying to track down the cause. Could it be:
- Alignment issues
- Contamination of the normal sample (less likely, since the normal is blood and the tumor is paraffin-embedded)
- Annotation version issues (I have rechecked and eliminated this cause)
Any help is appreciated. Thanks.
Additional info:
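The fraction of reads in the normal that support the called indel (consensus indel reads divided by total reads in the normal) is either ~40-50% or ~90-100%, with an average of ~60% over all indels. Fractions near 50% and near 100% are what you would expect from heterozygous and homozygous variants respectively. The numbers are similar in the tumor, so these do look like true germline events.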
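
For reference, the fraction described above can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, the read counts in the usage lines are made up, and the 40-50% and 90-100% bins are simply the ranges quoted in this post, not thresholds used by GATK.

```python
def indel_allele_fraction(indel_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that support the called indel."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= indel_reads <= total_reads:
        raise ValueError("indel_reads must be between 0 and total_reads")
    return indel_reads / total_reads


def classify_germline_pattern(
    normal_fraction: float,
    het_range: tuple = (0.40, 0.50),   # assumption: range quoted in the post
    hom_range: tuple = (0.90, 1.00),   # assumption: range quoted in the post
) -> str:
    """Label a normal-sample allele fraction as het-like, hom-like, or other."""
    if het_range[0] <= normal_fraction <= het_range[1]:
        return "het-like"
    if hom_range[0] <= normal_fraction <= hom_range[1]:
        return "hom-like"
    return "other"


# Made-up read counts for three indel sites in the normal sample:
# (reads supporting the indel, total reads at the site)
example_sites = [(18, 40), (39, 40), (5, 40)]
labels = [
    classify_germline_pattern(indel_allele_fraction(indel, total))
    for indel, total in example_sites
]
```

Sites that land in "other" (low allele fraction in the normal) would be the ones to look at first for alignment artifacts or contamination.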