Running BUSCO on Cedar
sjossey asked 10 months ago



Hi,

I am trying to run busco on Cedar. I loaded the module and am trying to run with the following command:
$ python BUSCO.py -h
python: can't open file 'BUSCO.py': [Errno 2] No such file or directory

So I tried:
$ /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/scripts/run_BUSCO.py -h
usage: python BUSCO.py -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]
Welcome to BUSCO 3.0.2: the Benchmarking Universal Single-Copy Ortholog assessment tool.
For more detailed usage information, please review the README file provided with this distribution and the BUSCO user guide.
optional arguments:
-i FASTA FILE, --in FASTA FILE
…

That shows the help page. Now, if I try to run busco, I get the following error:
$ /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/scripts/run_BUSCO.py -i genome.fasta -o Assemb -l mammalia_odb9 -m geno
ERROR No section [busco] found in /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/scripts/../config/config.ini. Please make sure both the file and this section exist, see userguide.

Please help if anybody was able to run busco on Cedar/Graham.

Thanks, Sushma

5 Answers
Best Answer
sjossey answered 10 months ago



I copied the file config.ini.default from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/config into my scratch folder and renamed it config.ini.
Then I tried to run again from that folder and got a new error:
$ python /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/scripts/run_BUSCO.py -i /genome.fasta -o Assemb -l mammalia_odb9 -m geno

Traceback (most recent call last):
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/scripts/run_BUSCO.py", line 26, in <module>
    from pipebricks.PipeLogger import PipeLogger
ImportError: No module named pipebricks.PipeLogger
Thanks,
Sushma

Eloi Mercier replied 10 months ago

It looks like it’s looking for a specific python module.

What version of python are you using?
python --version

Eloi Mercier replied 10 months ago

A user reported that using python 3 worked for him:
https://gitlab.com/ezlab/busco/issues/27

Can you try with python3?

sjossey replied 10 months ago

I did not load any python module, as I only loaded what was required by the busco module:
$ module load nixpkgs/16.09 gcc/5.4.0 openmpi/2.0.2 busco/3.0.2

sjossey replied 10 months ago

I tried again after loading python/3.7.0 and got the same error.

Eloi Mercier replied 10 months ago

Can you run BUSCO on the test data please:
run_BUSCO.py -i /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/sample_data/target.fa -o TEST -l /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/sample_data/example/ -m genome

This ran without issue once I had configured everything properly (see my next comment).

Eloi Mercier replied 10 months ago

There is some configuration required before running BUSCO. Here is what I did. Perhaps it’s not the most elegant solution but it worked for me.

1. Create a local copy of Augustus. Otherwise, if we use the version on cvmfs it will complain that it cannot write in augustus/config (“Cannot write to Augustus config path”):
cp -r /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/intel2016.4/augustus/ YOUR_WORKING_DIR

2. Configure your config file properly. This should work on Cedar:
# BUSCO specific configuration
# It overrides default values in code and dataset cfg, and is overridden by arguments in command line
# Uncomment lines when appropriate
[busco]
# Input file
;in = ./sample_data/target.fa
# Run name, used in output files and folder
;out = SAMPLE
# Where to store the output directory
;out_path = ./sample_data
# Path to the BUSCO dataset
;lineage_path = ./sample_data/example
# Which mode to run (genome / protein / transcriptome)
;mode = genome
# How many threads to use for multithreaded steps
;cpu = 1
# Domain for augustus retraining, eukaryota or prokaryota
# Do not change this unless you know exactly why !!!
;domain = eukaryota
# Force rewrite if files already exist (True/False)
;force = False
# Restart mode (True/False)
;restart = False
# Blast e-value
;evalue = 1e-3
# Species to use with augustus, for old datasets only
;species = fly
# Augustus extra parameters
# Use single quotes, like this: '--param1=1 --param2=2'
;augustus_parameters = ''
# Tmp folder
;tmp_path = ./tmp/
# How many candidate regions (contigs, scaffolds) to consider for each BUSCO
;limit = 3
# Augustus long mode for retraining (True/False)
;long = False
# Quiet mode (True/False)
;quiet = False
# Debug logs (True/False), it needs Quiet to be False
;debug = True
# tar gzip output files (True/False)
;gzip = False
# Force single core for the tblastn step
;blast_single_core = True

[tblastn]
# path to tblastn
path = /cvmfs/soft.mugqic/CentOS6/software/blast/ncbi-blast-2.3.0+/bin/

[makeblastdb]
# path to makeblastdb
path = /cvmfs/soft.mugqic/CentOS6/software/blast/ncbi-blast-2.3.0+/bin/

[augustus]
# path to augustus
path = YOUR_WORKING_DIR/augustus/3.3/bin

[etraining]
# path to augustus etraining
path = YOUR_WORKING_DIR/augustus/3.3/bin/

# path to augustus perl scripts, redeclare it for each new script
[gff2gbSmallDNA.pl]
path = YOUR_WORKING_DIR/augustus/3.3/scripts
[new_species.pl]
path = YOUR_WORKING_DIR/augustus/3.3/scripts
[optimize_augustus.pl]
path = YOUR_WORKING_DIR/augustus/3.3/scripts

[hmmsearch]
# path to HMMsearch executable
path = /cvmfs/soft.mugqic/CentOS6/software/hmmer/hmmer-3.1b2/bin/

[Rscript]
# path to Rscript, if you wish to use the plot tool
path = /cvmfs/soft.mugqic/CentOS6/software/R_Bioconductor/R_Bioconductor-3.4.2_3.6/bin/

3. Set up the environment variables:
export AUGUSTUS_CONFIG_PATH=YOUR_WORKING_DIR/augustus/config/
export BUSCO_CONFIG_FILE=YOUR_WORKING_DIR/config.ini

You should then be able to run BUSCO:
run_BUSCO.py -i /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/sample_data/target.fa -o TEST -l /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/sample_data/example/ -m genome
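
For anyone following these steps: the YOUR_WORKING_DIR placeholder in the config file above has to be replaced with a real path before BUSCO can find Augustus. Here is a minimal sketch of one way to do that, assuming bash and sed are available. The function name fill_working_dir is made up for this example and is not part of BUSCO.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BUSCO): fill in the YOUR_WORKING_DIR
# placeholder used in the config file shown above.

# fill_working_dir CONFIG_FILE WORKING_DIR
# Prints CONFIG_FILE to stdout with every literal YOUR_WORKING_DIR replaced
# by WORKING_DIR. The input file itself is not modified.
fill_working_dir() {
    local config="$1" workdir="$2" escaped
    # Escape characters that are special in a sed replacement (\ and &)
    # as well as the '|' delimiter used below.
    escaped=$(printf '%s' "$workdir" | sed 's/[\\&|]/\\&/g')
    # '|' is used as the delimiter because paths contain '/'.
    sed "s|YOUR_WORKING_DIR|$escaped|g" "$config"
}
```

Possible usage, writing the result to a new file so the template stays untouched:
fill_working_dir config.ini.template "$PWD" > config.ini
(config.ini.template is just an example file name.)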

sjossey replied 10 months ago

Thank you very much, it seems to work on the test data. The only thing I changed was the Augustus config path when setting the environment:
export AUGUSTUS_CONFIG_PATH=YOUR_WORKING_DIR/augustus/3.3/config
Sushma
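
Since the right AUGUSTUS_CONFIG_PATH depends on how the local Augustus copy is laid out, it may help to check the directory before launching a long run. A small sketch, assuming bash; check_augustus_config is a hypothetical helper name, and the check only covers the "Cannot write to Augustus config path" situation mentioned earlier in the thread (directory missing or not writable), nothing more.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BUSCO or Augustus).

# check_augustus_config DIR
# Succeeds only if DIR is non-empty, exists as a directory, and is writable
# by the current user. Problems are reported on stderr.
check_augustus_config() {
    local dir="$1"
    if [ -z "$dir" ]; then
        echo "AUGUSTUS_CONFIG_PATH is empty or not set" >&2
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir" >&2
        return 1
    fi
    echo "ok: $dir"
}
```

Possible usage after the export lines:
check_augustus_config "$AUGUSTUS_CONFIG_PATH"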

Eloi Mercier replied 10 months ago

I’m glad it works!

Eloi

Eloi Mercier answered 10 months ago



Hi Sushma,
could you show me what command you used to load the module please?
Eloi

sjossey answered 10 months ago



Hi Eloi,
I loaded using
$ module load nixpkgs/16.09 gcc/5.4.0 openmpi/2.0.2 busco/3.0.2
Thanks,
Sushma

jhgalvez Staff answered 10 months ago



You need to create the config file, even if it is empty.
Since you don’t have write access on CVMFS, you need to create it in a directory that you have access to and then define the environment variable BUSCO_CONFIG_FILE. Otherwise the program won’t work, according to the documentation at https://gitlab.com/ezlab/busco:
“You can set the BUSCO_CONFIG_FILE environment variable to define a custom path (including the filename) to the config.ini file, useful for switching between configurations or in a multi-users environment.”
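
As a rough illustration of that advice, the sketch below copies a config template into a directory you can write to and prints the resulting path, so it can be assigned to BUSCO_CONFIG_FILE. It assumes bash; make_busco_config is a hypothetical helper name, and the template and destination paths are whatever you pass in.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BUSCO).

# make_busco_config TEMPLATE DEST_DIR
# Copies TEMPLATE to DEST_DIR/config.ini, creating DEST_DIR if needed, and
# prints the path of the new file on stdout.
make_busco_config() {
    local template="$1" dest_dir="$2"
    if [ ! -r "$template" ]; then
        echo "template not readable: $template" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    cp "$template" "$dest_dir/config.ini" || return 1
    printf '%s\n' "$dest_dir/config.ini"
}
```

Possible usage on Cedar, with the template location taken from earlier in this thread and a destination directory of your choice:
export BUSCO_CONFIG_FILE="$(make_busco_config /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/config/config.ini.default YOUR_WORKING_DIR)"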
 
 

Eloi Mercier replied 10 months ago

On top of that, to run BUSCO, you simply need to type:
$ run_BUSCO.py
(after loading the module)
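
If typing run_BUSCO.py gives "command not found", the module is probably not loaded in the current shell. A small sketch for checking this, assuming bash; require_cmd is a hypothetical helper name and relies only on the standard `command -v` builtin.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BUSCO).

# require_cmd NAME
# Prints where NAME resolves on the PATH, or reports on stderr and fails
# if the shell cannot find it.
require_cmd() {
    local name="$1" location
    if location=$(command -v "$name"); then
        printf '%s\n' "$location"
    else
        echo "$name not found on PATH; was the module loaded?" >&2
        return 1
    fi
}
```

Possible usage after the module load line:
require_cmd run_BUSCO.py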

sjossey replied 10 months ago

I will try to do that.
Thanks,
Sushma

jrosner Staff answered 9 months ago



Thread continues in another post

Continued from the question….RUNNING BUSCO ON CEDAR