STAR genome load shared memory feature
OmriNach asked 4 months ago

Is it possible to run a STAR alignment job using the shared memory option (--genomeLoad LoadAndExit, etc.) on the Graham cluster? The documentation mentions that I may need a system administrator's help to use that feature.
On a separate note, my STAR runs are taking a considerably long time to load the SA file, using 50 GB of memory (to be safe) and 20 cores. Any idea why?


It is currently taking me over 40 minutes to load the reference genome, and it seems to slow down at the "Loading SA" step. I am wondering what I am doing wrong. I am using 20 CPUs and 35 GB of RAM.

1 Answer
flefebvre Staff answered 4 months ago

Hi OmriNach, it would be good if you could provide a precise description of what you are actually doing, i.e. the commands. 
However, assuming you are trying to align one sample of reasonable size to GRCh38, here are some parameters we successfully applied in the past:
For the STAR alignReads call:
--genomeDir /cvmfs/soft.mugqic/CentOS6/genomes/species/Homo_sapiens.GRCh38/genome/star_index/Ensembl87.sjdbOverhang99 \
--runThreadN 16 \
--readFilesCommand zcat \
--outStd Log \
--outSAMunmapped Within \
--outSAMtype BAM Unsorted \
--limitGenomeGenerateRAM 100000000000 \
--limitIObufferSize 4000000000
For the batch job:

--time=24:00:00 --mem=128G -N 1 -n 20

STAR takes roughly 30 minutes per sample using these parameters, but it is also very greedy; one needs to reserve enough resources as above.
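
For illustration, the STAR options and batch resources above could be combined into one SLURM job script per sample. The following is only a sketch: make_star_job is a hypothetical helper, and the --readFilesIn and --outFileNamePrefix values are assumed placeholders that are not part of the parameters quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): prints a SLURM job script that
# combines the STAR options and batch resources quoted above.
# Usage: make_star_job GENOME_DIR READS_R1 READS_R2 OUT_PREFIX
make_star_job() {
    local genome_dir=$1 r1=$2 r2=$3 prefix=$4
    # Unquoted heredoc: shell variables are expanded, and "\\" emits a
    # single backslash so the generated script keeps its line continuations.
    cat <<EOF
#!/bin/bash
#SBATCH --time=24:00:00
#SBATCH --mem=128G
#SBATCH -N 1
#SBATCH -n 20

# --readFilesIn and --outFileNamePrefix are assumed placeholders;
# the remaining options are the ones quoted in the answer.
STAR --runMode alignReads \\
    --genomeDir "$genome_dir" \\
    --readFilesIn "$r1" "$r2" \\
    --outFileNamePrefix "$prefix" \\
    --runThreadN 16 \\
    --readFilesCommand zcat \\
    --outStd Log \\
    --outSAMunmapped Within \\
    --outSAMtype BAM Unsorted \\
    --limitGenomeGenerateRAM 100000000000 \\
    --limitIObufferSize 4000000000
EOF
}
```

The generated script can be written to a file and submitted, for example: make_star_job /path/to/star_index sample_R1.fastq.gz sample_R2.fastq.gz out/sample. > star_sample.sh, followed by sbatch star_sample.sh. All file names here are illustrative.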
I would say there should be no need to use the shared memory option. We process a lot of RNA-seq samples each year and certainly don't need this feature.
One thing to watch out for on Graham might be the CVMFS cache not being tuned properly. We will inquire and get back to you if applicable.

OmriNach replied 4 months ago

Hi there, I had someone more experienced than me inquire about this. He determined that the read speed from the cvmfs is around 7 MB/s, where it should be close to 1.4 GB/s. Is this an issue with Graham?
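
For anyone wanting to reproduce this kind of measurement, a rough sequential read test can be sketched as follows. This is not necessarily how the figures above were obtained: read_speed_mbs is a hypothetical helper, and it assumes GNU date (for the %N nanosecond format) and awk are available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the approximate sequential read throughput
# of one file in MB/s (1 MB = 10^6 bytes).
# Assumes GNU date, which supports %N (nanoseconds).
read_speed_mbs() {
    local file=$1 bytes start end
    bytes=$(wc -c < "$file")          # file size in bytes
    start=$(date +%s.%N)
    cat "$file" > /dev/null           # read the whole file once
    end=$(date +%s.%N)
    awk -v b="$bytes" -v s="$start" -v e="$end" 'BEGIN {
        d = e - s
        if (d <= 0) d = 1e-9          # guard against a zero interval
        printf "%.1f\n", b / d / 1e6
    }'
}
```

It could be pointed at a large index file such as the SA file, e.g. read_speed_mbs /path/to/star_index/SA. Only the first read of a file is meaningful, since repeated reads are usually served from a local cache and will look much faster.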

flefebvre Staff replied 4 months ago

Hi there, I was told that the performance of the /cvmfs/ref.mugqic stack on Graham is not the best either. If you still want to use our pre-built STAR indices on Graham, I would suggest copying the index to somewhere in your project space and pointing all of your jobs there instead of /cvmfs.
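
A minimal sketch of that copy step, assuming a POSIX shell with cp available; copy_star_index is a hypothetical helper and the destination path is whatever project directory you have access to:

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a STAR index directory to a new location.
# Usage: copy_star_index SOURCE_INDEX_DIR DEST_DIR
copy_star_index() {
    local src=$1 dest=$2
    mkdir -p "$dest"
    # -R copies recursively; -L follows symlinks so that real files,
    # not links pointing back into /cvmfs, end up in the destination.
    cp -RL "$src"/. "$dest"/
}
```

For example, copy_star_index /cvmfs/soft.mugqic/CentOS6/genomes/species/Homo_sapiens.GRCh38/genome/star_index/Ensembl87.sjdbOverhang99 /path/to/project/star_index (the destination is illustrative), after which jobs would pass --genomeDir /path/to/project/star_index to STAR. The copy itself will be slow once, since it reads the full index from /cvmfs.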

OmriNach replied 4 months ago

Thanks, will try that!