Getting an error in my populations output file saying that I have a malformed genomic locus
Ella Bowles asked 1 year ago

I am running the stacks pipeline on GBS data for lake trout, and I am aligning my reads to Arctic char using BWA. Populations runs fine until it gets to locus four, after which point it says:
Now processing…
Error: Malformed genomic position ‘LG4q.1:29:15355:+’.
Error: Locus 29325
Error: Bad GStacks files.

I figured out, and confirmed with the developers of stacks, that there cannot be any “:” in the genome's sequence names because gstacks, one of the modules of stacks, uses it as a delimiter. So, I went through my genome and changed each “:” to an “_”. I then deleted all my alignments, re-indexed the genome, re-aligned my reads and re-ran gstacks and populations. Strangely, I am still getting the same problem.

I just don’t understand how this could be the case. When I go to the fasta file that I used for indexing and run grep -e ">LG" < genomeNoColon.fa, I see,


and when I look at my annotation file (less saal.ann) in the indexed files I see the following:

0 LG4q.1_29 (null)
165850403 90519428 61604

I guess I can delete everything and try again, just in case I thought I had deleted everything and hadn’t. But is there a way that I can check the bam files to see if there are still colons in there, and/or check my indexed genome to see if and where there are colons in it?

Can I give you my directories so that you can access the files?

With thanks,

Ella Bowles answered 1 year ago

Oh boy, this post can be deleted. I just found the problem. So silly. For some reason I had my input directory set to my old alignments, which had problematic locus headers.

jrosner Staff replied 1 year ago

Hi Ella, glad you figured it out!
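
For anyone who lands here with the same error: the checks asked about above could be sketched roughly as follows. This is only an illustration, not something from the stacks documentation. It assumes samtools is installed, the two function names are made up for this example, and sample.bam is a placeholder for one of your own alignment files.

```shell
# List FASTA sequence names (first word of each header line) that contain a colon.
# Prints nothing if every name is clean.
fasta_names_with_colon() {
    grep '^>' "$1" | awk '{print $1}' | grep ':'
}

# Read SAM header text on stdin and print every reference name
# (the SN: field of an @SQ line) that contains a colon.
sam_header_names_with_colon() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/ && substr($i, 4) ~ /:/)
                print substr($i, 4)
    }'
}

# Example usage (file names are placeholders):
#   fasta_names_with_colon genomeNoColon.fa
#   samtools view -H sample.bam | sam_header_names_with_colon
```

If the second command prints anything for a BAM file in the directory that gstacks is reading from, that file was aligned against the old, colon-containing reference, which is what turned out to be the case in this thread.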