
Genomic Alignment And SNP/Indel Calling - My First Ever "Pipeline"


So I have finally generated some reads and run them through what I guess could be called a very rudimentary 'pipeline'. I generated a million paired-end reads with wgsim, aligned them with bwa, and then used samtools/bcftools/vcftools. The commands were:

bwa aln -t 10 hg19 -f test.read1.sai test.read1.fq
bwa aln -t 10 hg19 -f test.read2.sai test.read2.fq
bwa sampe -f test.sam hg19 test.read1.sai test.read2.sai test.read1.fq test.read2.fq
samtools view -S -b test.sam > test.bam
samtools sort test.bam test_sort
samtools mpileup -uf ../hg19/hg19.fa test_sort.bam > test.bcf
bcftools view -vcg test.bcf > test.vcf

Now I know this is simplistic and that other tools are available, and I will try more in the future. What I would like to know is whether there are any other steps or parameters I should be using with this existing workflow to improve it, e.g. to make it more efficient. Also, if I wanted to run 150 samples, would I just run this 150 times in a row, or would I want to do things differently?
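To make the 150-sample question concrete, here is a minimal sketch of the "run it once per sample" option. It only prints the commands for each sample and does not execute anything. The sample names (sampleA, sampleB) and the `<sample>.read1.fq` / `<sample>.read2.fq` naming scheme are assumptions for illustration; the tool invocations are the same ones listed above.

```shell
#!/bin/sh
# Sketch only: print the per-sample commands, do not run them.
# REF_INDEX and REF_FASTA reuse the reference names from the commands above.
REF_INDEX="hg19"
REF_FASTA="../hg19/hg19.fa"

# Print the seven commands for one sample.
# Assumes (hypothetically) that reads are named <sample>.read1.fq and <sample>.read2.fq.
sample_commands() {
    s="$1"
    printf '%s\n' "bwa aln -t 10 $REF_INDEX -f $s.read1.sai $s.read1.fq"
    printf '%s\n' "bwa aln -t 10 $REF_INDEX -f $s.read2.sai $s.read2.fq"
    printf '%s\n' "bwa sampe -f $s.sam $REF_INDEX $s.read1.sai $s.read2.sai $s.read1.fq $s.read2.fq"
    printf '%s\n' "samtools view -S -b $s.sam > $s.bam"
    printf '%s\n' "samtools sort $s.bam ${s}_sort"
    printf '%s\n' "samtools mpileup -uf $REF_FASTA ${s}_sort.bam > $s.bcf"
    printf '%s\n' "bcftools view -vcg $s.bcf > $s.vcf"
}

# Hypothetical sample list; a real run would list all 150 sample names here.
for s in sampleA sampleB; do
    sample_commands "$s"
done
```

Saving the printed lines to a file gives a plain script that can be checked by eye before running it. The commands for one sample do not depend on any other sample, so in principle the samples could also be processed side by side rather than strictly one after another.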

Any help would be appreciated. And please be kind ;)

