Running Busco on Cedar
sjossey asked 8 months ago



Hi,

I am trying to run BUSCO on Cedar. I loaded the module and tried to run it with the following command:
$ python BUSCO.py -h
python: can't open file 'BUSCO.py': [Errno 2] No such file or directory

So I tried:
$ /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/scripts/run_BUSCO.py -h
usage: python BUSCO.py -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]
Welcome to BUSCO 3.0.2: the Benchmarking Universal Single-Copy Ortholog assessment tool.
For more detailed usage information, please review the README file provided with this distribution and the BUSCO user guide.
optional arguments:
-i FASTA FILE, --in FASTA FILE
...

That shows the help page. But when I try to run BUSCO, I get the following error:
$ /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/scripts/run_BUSCO.py -i genome.fasta -o Assemb -l mammalia_odb9 -m geno
ERROR No section [busco] found in /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/scripts/../config/config.ini. Please make sure both the file and this section exist, see userguide.

Please help if anybody was able to run busco on Cedar/Graham.

Thanks, Sushma

5 Answers
Best Answer
sjossey answered 8 months ago



I copied the file config.ini.default from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/config into my scratch folder and renamed it config.ini.
I then ran BUSCO again from that folder and got a new error:
$ python /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/scripts/run_BUSCO.py -i /genome.fasta -o Assemb -l mammalia_odb9 -m geno

Traceback (most recent call last):
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/scripts/run_BUSCO.py", line 26, in <module>
    from pipebricks.PipeLogger import PipeLogger
ImportError: No module named pipebricks.PipeLogger
Thanks,
Sushma

Eloi Mercier answered 8 months ago

It looks like it’s looking for a specific python module.

What version of python are you using?
$ python --version

Eloi Mercier answered 8 months ago

A user reported that using python 3 worked for him:
https://gitlab.com/ezlab/busco/issues/27

Can you try with python3?

sjossey answered 8 months ago

I did not load any Python module; I only loaded what the busco module requires:
$ module load nixpkgs/16.09 gcc/5.4.0 openmpi/2.0.2 busco/3.0.2

sjossey answered 8 months ago

I tried again after loading python/3.7.0 and got the same error.

Eloi Mercier answered 8 months ago

Can you run BUSCO on the test data please:
run_BUSCO.py -i /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/sample_data/target.fa -o TEST -l /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/sample_data/example/ -m genome

This ran without issue once I configured everything properly (see my next comment).

Eloi Mercier answered 8 months ago

There is some configuration required before running BUSCO. Here is what I did. Perhaps it’s not the most elegant solution but it worked for me.

1. Create a local copy of Augustus. If BUSCO uses the version on cvmfs, it complains that it cannot write to augustus/config ("Cannot write to Augustus config path"):
cp -r /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/intel2016.4/augustus/ YOUR_WORKING_DIR

2. Configure your config file properly. This should work on Cedar:
# BUSCO specific configuration
# It overrides default values in code and dataset cfg, and is overridden by arguments in command line
# Uncomment lines when appropriate
[busco]
# Input file
;in = ./sample_data/target.fa
# Run name, used in output files and folder
;out = SAMPLE
# Where to store the output directory
;out_path = ./sample_data
# Path to the BUSCO dataset
;lineage_path = ./sample_data/example
# Which mode to run (genome / protein / transcriptome)
;mode = genome
# How many threads to use for multithreaded steps
;cpu = 1
# Domain for augustus retraining, eukaryota or prokaryota
# Do not change this unless you know exactly why !!!
;domain = eukaryota
# Force rewrite if files already exist (True/False)
;force = False
# Restart mode (True/False)
;restart = False
# Blast e-value
;evalue = 1e-3
# Species to use with augustus, for old datasets only
;species = fly
# Augustus extra parameters
# Use single quotes, like this: '--param1=1 --param2=2'
;augustus_parameters = ''
# Tmp folder
;tmp_path = ./tmp/
# How many candidate regions (contigs, scaffolds) to consider for each BUSCO
;limit = 3
# Augustus long mode for retraining (True/False)
;long = False
# Quiet mode (True/False)
;quiet = False
# Debug logs (True/False), it needs Quiet to be False
;debug = True
# tar gzip output files (True/False)
;gzip = False
# Force single core for the tblastn step
;blast_single_core = True

[tblastn]
# path to tblastn
path = /cvmfs/soft.mugqic/CentOS6/software/blast/ncbi-blast-2.3.0+/bin/

[makeblastdb]
# path to makeblastdb
path = /cvmfs/soft.mugqic/CentOS6/software/blast/ncbi-blast-2.3.0+/bin/

[augustus]
# path to augustus
path = YOUR_WORKING_DIR/augustus/3.3/bin

[etraining]
# path to augustus etraining
path = YOUR_WORKING_DIR/augustus/3.3/bin/

# path to augustus perl scripts, redeclare it for each new script
[gff2gbSmallDNA.pl]
path = YOUR_WORKING_DIR/augustus/3.3/scripts
[new_species.pl]
path = YOUR_WORKING_DIR/augustus/3.3/scripts
[optimize_augustus.pl]
path = YOUR_WORKING_DIR/augustus/3.3/scripts

[hmmsearch]
# path to HMMsearch executable
path = /cvmfs/soft.mugqic/CentOS6/software/hmmer/hmmer-3.1b2/bin/

[Rscript]
# path to Rscript, if you wish to use the plot tool
path = /cvmfs/soft.mugqic/CentOS6/software/R_Bioconductor/R_Bioconductor-3.4.2_3.6/bin/

3. Set up the environment variables:
export AUGUSTUS_CONFIG_PATH=YOUR_WORKING_DIR/augustus/config/
export BUSCO_CONFIG_FILE=YOUR_WORKING_DIR/config.ini

You should then be able to run BUSCO:
run_BUSCO.py -i /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/sample_data/target.fa -o TEST -l /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2/sample_data/example/ -m genome
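
For convenience, the three steps above could be collected into one small shell function. This is only a sketch: setup_busco_workdir and its four arguments are hypothetical names, not part of BUSCO or the module, and the cvmfs paths in the usage comment are the ones quoted in this thread, so check them on your cluster. It assumes the copied Augustus tree keeps its version subdirectory (3.3 here), as the bin and scripts paths in the config above do.

```shell
# Hypothetical helper: prepare a writable BUSCO working directory.
#   $1 = Augustus install directory to copy (the one containing e.g. 3.3/)
#   $2 = BUSCO install directory (the one containing config/config.ini.default)
#   $3 = a directory you can write to
#   $4 = Augustus version subdirectory, e.g. 3.3
setup_busco_workdir() {
    augustus_src=$1
    busco_root=$2
    workdir=$3
    augustus_ver=$4

    mkdir -p "$workdir" || return 1

    # Step 1: local, writable copy of Augustus (skipped if already present).
    if [ ! -d "$workdir/augustus" ]; then
        cp -r "$augustus_src" "$workdir/augustus" || return 1
    fi

    # Step 2: start from the default config. The "path =" entries
    # ([augustus], [etraining], ...) still have to be edited by hand.
    if [ ! -f "$workdir/config.ini" ]; then
        cp "$busco_root/config/config.ini.default" "$workdir/config.ini" || return 1
    fi

    # Step 3: point BUSCO and Augustus at the local copies.
    export AUGUSTUS_CONFIG_PATH="$workdir/augustus/$augustus_ver/config"
    export BUSCO_CONFIG_FILE="$workdir/config.ini"
}

# Example call (paths as quoted in this thread, not verified here):
# setup_busco_workdir \
#     /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/intel2016.4/augustus \
#     /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc5.4/openmpi2.0/busco/3.0.2 \
#     YOUR_WORKING_DIR 3.3
```

The function has to be defined and called in the same shell (or sourced), since the exported variables would otherwise be lost when a separate script exits.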

sjossey answered 8 months ago

Thank you very much, it seems to work on the test data. The only thing I changed was the Augustus path when setting the environment variables:
export AUGUSTUS_CONFIG_PATH=YOUR_WORKING_DIR/augustus/3.3/config
Sushma
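
Since the config directory sits under a version subdirectory, one way to avoid guessing the path is to search the local copy for it. The sketch below assumes that the Augustus config directory is the one named config that contains a species subdirectory; find_augustus_config is a hypothetical helper name, not something shipped with Augustus or BUSCO.

```shell
# Hypothetical helper: print the first directory named "config" under $1
# that contains a "species" subdirectory (assumed to mark the Augustus
# config directory). Prints nothing if none is found.
find_augustus_config() {
    find "$1" -type d -name config 2>/dev/null | while IFS= read -r d; do
        if [ -d "$d/species" ]; then
            printf '%s\n' "$d"
            break
        fi
    done
}

# Example use:
# AUGUSTUS_CONFIG_PATH=$(find_augustus_config YOUR_WORKING_DIR/augustus)
# export AUGUSTUS_CONFIG_PATH
# echo "$AUGUSTUS_CONFIG_PATH"
```

If the echo prints an empty line, the copy in step 1 probably did not land where expected.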

Eloi Mercier answered 8 months ago

I’m glad it works!

Eloi

Eloi Mercier answered 8 months ago



Hi Sushma,
could you show me what command you used to load the module please?
Eloi

sjossey answered 8 months ago



Hi Eloi,
I loaded using
$ module load nixpkgs/16.09 gcc/5.4.0 openmpi/2.0.2 busco/3.0.2
Thanks,
Sushma

jhgalvez (staff) answered 8 months ago



You need to create the config file, even if it is empty.
Since you don't have write access on CVMFS, you need to create it in a directory that you can write to and then define the environment variable BUSCO_CONFIG_FILE. Otherwise the program won't work, according to the documentation (https://gitlab.com/ezlab/busco):
"You can set the BUSCO_CONFIG_FILE environment variable to define a custom path (including the filename) to the config.ini file, useful for switching between configurations or in a multi-users environment."
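
A quick check before exporting the variable can catch the error from the original question. The sketch below is an assumption-laden helper, not part of BUSCO: use_busco_config is a hypothetical name, and the [busco] header check is based only on the "No section [busco] found" message quoted earlier in this thread, which suggests that BUSCO 3.0.2 expects at least that section header in the file.

```shell
# Hypothetical helper: export BUSCO_CONFIG_FILE only if the file is
# readable and contains a [busco] section header.
use_busco_config() {
    cfg=$1
    if [ ! -r "$cfg" ]; then
        echo "config file not readable: $cfg" >&2
        return 1
    fi
    if ! grep -q '^\[busco\]' "$cfg"; then
        echo "no [busco] section in: $cfg" >&2
        return 1
    fi
    export BUSCO_CONFIG_FILE="$cfg"
}

# Example use:
# use_busco_config YOUR_WORKING_DIR/config.ini && run_BUSCO.py -h
```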
 
 

Eloi Mercier answered 8 months ago

On top of that, to run BUSCO, you simply need to type:
$ run_BUSCO.py
(after loading the module)
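
To confirm that the module actually put the script on your PATH, something like the following could be used. require_on_path is a hypothetical helper name; the module load line in the comment is the one quoted earlier in this thread.

```shell
# Hypothetical helper: report where a command is found on PATH,
# or fail with a hint if it is missing.
require_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found on PATH; did you load the module?" >&2
        return 1
    fi
}

# Example use:
# module load nixpkgs/16.09 gcc/5.4.0 openmpi/2.0.2 busco/3.0.2
# require_on_path run_BUSCO.py && run_BUSCO.py -h
```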

sjossey answered 8 months ago

I will try to do that.
Thanks,
Sushma

jrosner (staff) answered 7 months ago



Thread continues in another post

Continued from the question….RUNNING BUSCO ON CEDAR