RPKM fails – out of Memory
spongemicrobiome asked 2 months ago



I am running rpkm on a set of 14 samples (each with roughly the same size assembly and SAM file), but on one of the samples I get the following error:
/var/spool/slurmd/job23090490/slurm_script: line 9: 63436 Killed                  /home/xxxx//rpkm/./rpkm_runningfile -c /home/xxxx/assembly/contig.fa -a /home/xxxx/read_mapping/sample.sam -s /home/xxx/sample_rpkmstats.txt -o /home/xxxx/sample/_rpkm.csv –m
slurmstepd: error: Detected 1 oom-kill event(s) in step 23090490.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
I ran all the jobs with the following specs:
#!/bin/bash
#SBATCH --mem=128000M
#SBATCH --nodes=2
#SBATCH --cpus-per-task=32
#SBATCH --time=0-23:59
i.e. with 1 node, 32 CPUs, and 125G of memory.

Any ideas why one would fail and the others work?

Thanks!

1 Answer
Best Answer
jrosner Staff answered 2 months ago



Hi there,
It looks like your job is being killed because it exceeds the memory limit you requested. So the first thing to try would be to increase the memory you’re requesting… perhaps 50% more for starters.
Give it a try and let us know.
P.S. What system are you working on… Cedar? Graham? Other?
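
As a rough sketch of the "50% more" suggestion, the new request can be worked out from the current one. The helper name `bump_mem` is made up for illustration, and the 128000 figure is assumed from the job script in the question; adjust both to taste.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm or rpkm): scale a memory
# request given in MB by a percentage increase, using integer arithmetic.
bump_mem() {
    local current_mb=$1   # current request in MB, e.g. 128000
    local percent=$2      # percentage increase, e.g. 50
    echo $(( current_mb + current_mb * percent / 100 ))
}

# Assumed starting point: the 128000M request from the question.
new_mem=$(bump_mem 128000 50)

# Print the directive to use in the job script for the failing sample.
echo "#SBATCH --mem=${new_mem}M"
```

To see how much memory the samples that finished actually used, something like `sacct -j <jobid> --format=JobID,ReqMem,MaxRSS,State` may help, assuming job accounting is enabled on the cluster. That gives a better basis for the next request than guessing.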

spongemicrobiome replied 2 months ago

Cedar. Thanks, using more memory (157000) than for the other samples did the trick!

jrosner Staff replied 2 months ago

Nice! Glad that worked.