I am running the Stacks pipeline on GBS data for lake trout, and I am aligning my reads to the Arctic char genome using BWA. The populations module runs fine until it gets to locus four, at which point it says:
Error: Malformed genomic position ‘LG4q.1:29:15355:+’.
Error: Locus 29325
Error: Bad GStacks files.
I figured out, and confirmed with the developers of Stacks, that there cannot be any “:” in the genome's sequence names because gstacks, one of the modules of Stacks, uses it as a delimiter. So, I went through my genome and changed each “:” to an “_”. I then deleted all my alignments, re-indexed the genome, re-aligned my reads and re-ran gstacks and populations. Strangely, I am still getting the same problem.
I just don’t understand how this could be the case. When I go to the fasta file that I used for indexing and type "grep -e '>LG' < genomeNoColon.fa" I see,
and when I look at my annotation file (less saal.ann) in the indexed files I see the following:
0 LG4q.1_29 (null)
165850403 90519428 61604
I guess I can delete everything and try again, just in case I thought I deleted everything and didn’t. But is there a way to check the BAM files to see whether they still contain colons, and/or to check my indexed genome to see if and where there are colons in it?
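To make the question concrete, here is a rough sketch of the kind of check I have in mind. It assumes samtools is installed, "sample.bam" is a placeholder file name, and the two helper functions are names I made up for illustration. The idea is to pull out only the reference names (the SN: field of the @SQ header lines in the BAM, and the first word after ">" in the fasta) and print any that contain a colon, since a plain grep for ":" on a BAM header would match every line.

```shell
# Print reference names (SN: field of @SQ lines) that contain a colon.
# Reads a SAM/BAM text header on stdin.
sq_names_with_colon() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) {
                name = substr($i, 4)          # drop the leading "SN:"
                if (name ~ /:/) print name
            }
    }'
}

# Print fasta sequence names (first word after ">") that contain a colon.
# Reads a fasta file on stdin.
fasta_names_with_colon() {
    awk '/^>/ { name = substr($1, 2); if (name ~ /:/) print name }'
}

# Intended usage (file names are placeholders/assumptions):
#   samtools view -H sample.bam | sq_names_with_colon
#   fasta_names_with_colon < genomeNoColon.fa
#   grep ':' saal.ann
```

If any of these print a name, that file presumably still comes from the old genome with colons in it. No output would suggest the file itself is clean and the problem is somewhere downstream, for example in gstacks output left over from the earlier run.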
Can I give you my directories so that you can access the files?