Shuffle reads?
Asked by flefebvre (staff) 3 years ago



Hi, a colleague told me it is important to shuffle reads when converting a BAM file to FASTQ. What is your take on this?

1 Answer
Best Answer
Answered by jflucier (staff) 3 years ago



BAM files are typically sorted by genomic coordinate, so consecutive reads come from the same region.
The aligner estimates the insert size from blocks of paired reads as it processes them. If you don’t shuffle the original BAM before converting it to FASTQ, each block will come from a single region instead of being randomly sampled across the genome, and that biases the insert size estimate. This is an important step that is unfortunately often overlooked.
See https://gatkforums.broadinstitute.org/gatk/discussion/2908/howto-revert-a-bam-file-to-fastq-format for more information.
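
For what it's worth, here is a minimal sketch of the shuffle-then-convert step in Python. It assumes `samtools` is installed and on your PATH, and it uses `samtools collate` piped into `samtools fastq`, which is my own choice of tool and not necessarily what the linked article uses. The function names and file names are hypothetical. `collate` keeps mates together while breaking up the coordinate order, but it makes no promise about the exact order of the groups, so treat this as one way to do it rather than the reference method.

```python
import shlex
import subprocess


def build_commands(bam_path, fastq_r1, fastq_r2):
    """Return argv lists for the shuffle step and the FASTQ conversion step."""
    # collate groups mates by read name and breaks up coordinate order;
    # -u writes uncompressed BAM, -O sends the output to stdout.
    collate = ["samtools", "collate", "-u", "-O", bam_path]
    # fastq reads the stream from stdin ("-") and writes mates to two files.
    # Unflagged and singleton reads are discarded via /dev/null in this sketch.
    # -n leaves read names unchanged (no /1 or /2 suffix).
    fastq = [
        "samtools", "fastq",
        "-1", fastq_r1,
        "-2", fastq_r2,
        "-0", "/dev/null",
        "-s", "/dev/null",
        "-n",
        "-",
    ]
    return collate, fastq


def bam_to_shuffled_fastq(bam_path, fastq_r1, fastq_r2):
    """Run `collate | fastq` as a pipeline; raises if either step fails."""
    collate, fastq = build_commands(bam_path, fastq_r1, fastq_r2)
    producer = subprocess.Popen(collate, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(fastq, stdin=producer.stdout)
    # Close our copy of the pipe so collate sees SIGPIPE if fastq exits early.
    producer.stdout.close()
    consumer.wait()
    producer.wait()
    if producer.returncode != 0 or consumer.returncode != 0:
        raise RuntimeError("samtools pipeline failed")


if __name__ == "__main__":
    # Only prints the equivalent shell pipeline; nothing is executed here.
    first, second = build_commands("aln_reads.bam", "reads_1.fq", "reads_2.fq")
    print(shlex.join(first) + " | " + shlex.join(second))
```

To actually run the conversion, call `bam_to_shuffled_fastq("aln_reads.bam", "reads_1.fq", "reads_2.fq")` with your own paths. If you need to keep singletons or unpaired reads, point `-0` and `-s` at real files instead of `/dev/null`.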

flefebvre (staff) replied 3 years ago

Excellent, thank you.