I am looking for the fastest and most efficient way to sort a pretty large BAM file (> 100 GB). I am familiar with Samtools and Picard, but I'm not sure whether there are any particular arguments I can use to optimize these commands. Are there any other tools out there for this? Thanks in advance.
I like to use Sambamba for fast processing of BAM files. It re-implements the major operations of Samtools using a parallel framework. You can download pre-compiled binaries, and the support is good if you run into any issues.
There are some tiny differences in sort order between the results from Picard, Samtools and Sambamba. These occur when arbitrary choices must be made (i.e. between two identical read pairs), and we haven't found the differences to change the quality of results from downstream operations like SNV or CNV calling.
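To make the suggestion concrete, here is a rough sketch of how the sort might be invoked with a thread count, a memory limit and a temporary directory. The file names, the `/scratch/tmp` path and the resource numbers are placeholders I made up, not tested recommendations; tune them to your machine. The script only prints the commands so you can check them before running anything.

```shell
# Hypothetical paths and resource settings; adjust for your machine.
IN=input.bam
OUT=sorted.bam
THREADS=8
MEM=16G            # sambamba: approximate total memory limit across all threads
TMP=/scratch/tmp   # assumed fast local disk with room for temporary chunks

# Coordinate sort with sambamba: -t threads, -m memory limit, --tmpdir for spill files.
SAMBAMBA_CMD="sambamba sort -t $THREADS -m $MEM --tmpdir=$TMP -o $OUT $IN"

# Roughly comparable samtools call; here -m is per thread, so 8 x 2G is about 16G.
SAMTOOLS_CMD="samtools sort -@ $THREADS -m 2G -T $TMP/sort_tmp -o $OUT $IN"

# Print the commands first; run one (e.g. with eval) once the paths look right.
echo "$SAMBAMBA_CMD"
echo "$SAMTOOLS_CMD"
```

For a file this size, giving the sort more memory and pointing the temporary directory at a fast local disk should reduce the number of intermediate chunks that have to be written and merged, which is usually where most of the time goes.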