RPKM fails – out of memory
spongemicrobiome asked 12 months ago



I am running RPKM on a set of 14 samples (each with roughly the same size assembly and the same size SAM file), but on one of the samples I get the following error:
/var/spool/slurmd/job23090490/slurm_script: line 9: 63436 Killed                  /home/xxxx//rpkm/./rpkm_runningfile -c /home/xxxx/assembly/contig.fa -a /home/xxxx/read_mapping/sample.sam -s /home/xxx/sample_rpkmstats.txt -o /home/xxxx/sample/_rpkm.csv –m
slurmstepd: error: Detected 1 oom-kill event(s) in step 23090490.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
I ran all the jobs with the following specs:
#!/bin/bash
#SBATCH --mem=128000M
#SBATCH --nodes=2
#SBATCH --cpus-per-task=32
#SBATCH --time=0-23:59
with 1 node, 32 CPUs, and 125G of memory.

Any ideas why one would fail and the others work?

Thanks!

1 Answer
Best Answer
jrosner Staff answered 12 months ago



Hi there,
It looks like your job is being killed because it is going over the memory limit you requested. So, the first thing to try would be to increase the memory you’re requesting… perhaps 50% more for starters.
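For example, keeping the rest of your header the same, only the memory line needs to change; 192000M is just a 50% bump on your current 128000M request, so adjust to whatever your nodes allow:

#SBATCH --mem=192000M    # ~50% more than the original 128000M request

You can also compare what the failed step actually used against what you requested with sacct, something like:

sacct -j 23090490 --format=JobID,ReqMem,MaxRSS,State    # MaxRSS = peak memory of the step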
Give it a try and let us know.
P.S. What system are you working on… Cedar? Graham? Other?

spongemicrobiome replied 12 months ago

Cedar – thanks! Using more memory (157000) than for the other samples did the trick!
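For anyone else who hits this, the only change was the memory request in the header above, something like (assuming the same M units as before):

#SBATCH --mem=157000M    # up from 128000M; everything else unchanged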

jrosner Staff replied 12 months ago

Nice! Glad that worked.