Hi, I have received Illumina paired-end sequencing data for the human genome as BAM files. I now need to submit the corresponding FASTQ files to a public repository, but the submission is giving me errors about duplicate reads in my FASTQ files. I converted the BAM files to FASTQ using the command bedtools bamtofastq -i <BAM> -fq <FASTQ>, with Bedtools v2.25.0. I am concerned that the original BAM files had problems. The genome center that generated the data is telling me to use Picard, but I do not understand how this would change anything. Can you help me with this? Thanks
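The duplicate names you’re seeing may be a result of secondary alignments: a read with a secondary alignment appears in the BAM more than once under the same name, so a converter that writes one FASTQ record per alignment will emit that read repeatedly. Instead of using Bedtools, try using Picard’s SamToFastq. Setting the option INCLUDE_NON_PRIMARY_ALIGNMENTS to false might solve your problem. Let me know if this works.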
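To illustrate why secondary alignments could produce duplicate names, here is a minimal Python sketch. It is not Bedtools or Picard code. The read names and flags are made up, the /1 and /2 mate suffixes are an assumed naming convention, and treating both the secondary (0x100) and supplementary (0x800) SAM flag bits as "non-primary" is my assumption about what a primary-only filter would skip.

```python
from collections import Counter

# Flag bits from the SAM specification.
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def is_primary(flag):
    """True if the record is neither a secondary nor a supplementary alignment."""
    return (flag & (SECONDARY | SUPPLEMENTARY)) == 0


def fastq_name(qname, flag):
    """Name a FASTQ record by read name plus mate suffix (assumed convention)."""
    if flag & FIRST_IN_PAIR:
        return qname + "/1"
    if flag & SECOND_IN_PAIR:
        return qname + "/2"
    return qname


def duplicate_names(records, primary_only):
    """Return FASTQ names that would be written more than once."""
    counts = Counter(
        fastq_name(qname, flag)
        for qname, flag in records
        if not primary_only or is_primary(flag)
    )
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical (read name, flag) pairs standing in for BAM records.
records = [
    ("read1", 99),    # first mate, primary
    ("read1", 147),   # second mate, primary
    ("read1", 355),   # first mate again, secondary alignment (99 + 0x100)
    ("read2", 99),    # first mate, primary
    ("read2", 147),   # second mate, primary
    ("read2", 2147),  # first mate again, supplementary alignment (99 + 0x800)
]

print(duplicate_names(records, primary_only=False))  # ['read1/1', 'read2/1']
print(duplicate_names(records, primary_only=True))   # []
```

Writing one record per alignment repeats read1/1 and read2/1, while keeping only primary records leaves each mate exactly once, which is the effect the INCLUDE_NON_PRIMARY_ALIGNMENTS setting is meant to have.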
That was the problem, thank you!