
How To Handle Reads Ending With Deletions In GATK?

Hello,

Before asking my question, I should point out that I'm working with data that's not my own (it is publicly available), in order to learn and establish a proper workflow for when real data will arrive in the laboratory.

I'm dealing with some exome data[1] from an Ion Torrent 318 chip and I'm trying to run the GATK RealignerTargetCreator on it to perform recalibration later on. The problem is that some reads have a deletion at the end:

read ends with deletion. Cigar: 179S54M1D5M1I9M1D

GATK therefore refuses to process these reads. How should I handle this case? Is the workflow I used (outlined below) to blame for this?
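To find out how many reads are affected before running GATK, the CIGAR strings can be checked directly. The following is only a minimal sketch in plain Python: the function names (`parse_cigar`, `ends_with_deletion`, `query_length`) are made up for illustration, and it assumes the error is triggered simply by the last CIGAR operation being `D`, which is how the message reads but is not confirmed against the GATK source.

```python
import re

# One CIGAR element: a length followed by an operation code from the SAM spec.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    if cigar == "*":  # SAM uses '*' when no CIGAR is available
        return []
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    # If rebuilding the string does not give back the input, something
    # in the input was not a valid CIGAR element.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return ops


def ends_with_deletion(cigar):
    """True if the last CIGAR operation is a deletion (assumed GATK criterion)."""
    ops = parse_cigar(cigar)
    return bool(ops) and ops[-1][1] == "D"


def query_length(cigar):
    """Number of read bases described by the CIGAR (M, I, S, = and X consume the read)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIS=X")


if __name__ == "__main__":
    cigar = "179S54M1D5M1I9M1D"  # the CIGAR from the error message above
    print(ends_with_deletion(cigar))  # True
    print(query_length(cigar))        # 248
```

Applied to the sixth column of each SAM record, this gives a quick count of the problematic reads. Note that the example read is 248 bases long, of which 179 are soft-clipped.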

Steps I did:

First, QC: keep reads with a Phred score of at least 20 in at least 80% of the bases (Python script modeled on the FASTX-Toolkit).
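For reference, the filter criterion can be sketched as below. This is not the actual script, only an illustration of the rule: it assumes Phred+33 encoded qualities and strictly four-line FASTQ records, and the names `passes_quality_filter` and `filter_fastq` are hypothetical.

```python
def passes_quality_filter(quality, min_phred=20, min_fraction=0.8, offset=33):
    """True if at least `min_fraction` of the bases have Phred >= `min_phred`.

    `quality` is the FASTQ quality string; `offset` is the ASCII offset
    (33 is assumed here).
    """
    if not quality:
        return False
    good = sum(1 for ch in quality if ord(ch) - offset >= min_phred)
    return good / len(quality) >= min_fraction


def filter_fastq(lines, min_phred=20, min_fraction=0.8):
    """Yield (header, sequence, separator, quality) records that pass the filter.

    `lines` is any iterable of FASTQ lines, e.g. an open file handle.
    Records are assumed to span exactly four lines each.
    """
    records = zip(*[iter(lines)] * 4)  # group the lines four at a time
    for header, seq, plus, qual in records:
        if passes_quality_filter(qual.rstrip("\r\n"), min_phred, min_fraction):
            yield header, seq, plus, qual


if __name__ == "__main__":
    fastq = [
        "@read1\n", "ACGTA\n", "+\n", "IIII#\n",  # 4 of 5 bases >= Q20: kept
        "@read2\n", "ACGTA\n", "+\n", "III##\n",  # 3 of 5 bases >= Q20: dropped
    ]
    for record in filter_fastq(fastq):
        print("".join(record), end="")
```

With the defaults, this mirrors what `fastq_quality_filter -q 20 -p 80` from the FASTX-Toolkit is meant to do.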

Then, realignment with bwa bwasw (Ion Torrent reads can be up to 250 bp long):

bwa bwasw -t 8 hg19.fa C30-101.filtered.fastq > C30-101.sam

Followed by conversion to BAM, addition of RG groups, sorting, and indexing (pysamtools).

Then GATK was invoked as

 gatk -T RealignerTargetCreator -R hg19.fa -o input.bam.list -I C30-101_RG.bam

(gatk is a small wrapper that merely hides the java -Xmx -jar ... stuff.)

[1] http://lifetech-it.hosted.jivesoftware.com/docs/DOC-2659 (registration may be required)

