Docker image


In short

docker pull ezlabgva/busco:v5.beta.1_cv1
docker run -u $(id -u) -v $(pwd):/busco_wd ezlabgva/busco:v5.beta.1_cv1

We provide a container that wraps everything required to run a BUSCO analysis. This streamlines the setup and installation and makes it easy to track all software versions used in the analyses (v5.beta.1_cv1 identifies all components at once). It guarantees that only dependency versions compatible with BUSCO are used.

See the Docker userguide for details. Here are suggested examples of how to use the BUSCO container in the non-interactive mode. For more detailed BUSCO usage, see the dedicated section below.

Avoid running containers as root; specify your user ID (uid):

docker run -u $(id -u) ezlabgva/busco:v5.beta.1_cv1

You need to use mounts (-v) to exchange files between the host filesystem on which your user can write and the container filesystem at the following location:

  • /busco_wd where inputs and outputs are read and written, and datasets are downloaded
docker run -u $(id -u) -v $(pwd):/busco_wd ezlabgva/busco:v5.beta.1_cv1

The default working directory in the container is /busco_wd. With your non-root uid, attempts to write in the container virtual filesystem will fail if it is not a mount point. You can redefine the working directory using -w to match another mounted folder as current working directory.

# Real BUSCO input will be /home/name/genome.fna
docker run -u $(id -u) -v /home/name/:/busco_wd ezlabgva/busco:v5.beta.1_cv1 busco -i genome.fna

is equivalent to

# Real BUSCO input will be /home/name/genome.fna
docker run -u $(id -u) -v /home/name:/host_mount -w /host_mount ezlabgva/busco:v5.beta.1_cv1 busco -i genome.fna

Be careful not to specify a host folder that does not exist when using -v /hostfolder/:/dockermount. Docker will create it as root, so your own user will not be able to write to it. It is safer to use the current directory: -v $(pwd):/dockermount
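The recommendations above (a non-root uid and an existing host folder mounted on /busco_wd) can be wrapped in a small helper. The following is a minimal Python sketch and not part of BUSCO or Docker: the function name build_docker_command is hypothetical, and it only assembles the same command line as the examples above. It assumes a Unix-like host, where os.getuid() is available.

```python
import os


def build_docker_command(host_dir, busco_args, image="ezlabgva/busco:v5.beta.1_cv1"):
    """Assemble a `docker run` command following the advice above.

    Hypothetical helper for illustration; not provided by BUSCO.
    """
    host_dir = os.path.abspath(host_dir)
    # Refuse a missing host folder: Docker would otherwise create it as root.
    if not os.path.isdir(host_dir):
        raise FileNotFoundError(f"host folder does not exist: {host_dir}")
    return [
        "docker", "run",
        "-u", str(os.getuid()),         # run as the current user, not root
        "-v", f"{host_dir}:/busco_wd",  # mount the host folder on the default working directory
        image,
        *busco_args,
    ]
```

Calling build_docker_command(os.getcwd(), ["busco", "-i", "genome.fna"]) returns a list that can be passed to subprocess.run; returning a list rather than a single string avoids shell quoting problems with paths that contain spaces.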

To use a custom config.ini file to set run parameters (see below), you need to extract the config.ini that is inside the container, as follows:

docker run -u $(id -u) -v $(pwd):/busco_wd ezlabgva/busco:v5.beta.1_cv1 cat /busco/config/config.ini > myconfig.ini
[edit myconfig.ini]
docker run -u $(id -u) -v $(pwd):/busco_wd ezlabgva/busco:v5.beta.1_cv1 busco -i genome.fna --config=/busco_wd/myconfig.ini

Conda package

Note: BUSCO v5 is not yet available on conda. To install BUSCO v4 please follow the instructions below.
  1. Make sure you have conda version 4.8.4 or higher. Enter conda -V to check. If necessary, update conda by entering conda update -n base conda.

  2. To install BUSCO in the current environment, enter

    conda install -c bioconda -c conda-forge busco=4.1.4
  3. Alternatively you can create a new environment with BUSCO installed

    conda create -n <your_env_name> -c bioconda -c conda-forge busco=4.1.4
    conda activate <your_env_name>

Manual installation

Supported OS

BUSCO is developed and tested on multiple Linux distributions (e.g. Arch Linux, CentOS, Ubuntu). We do not support macOS, but BUSCO should work on it, although Augustus seems to cause trouble on some BSD-derived systems, including macOS. Consider the Docker container if you work in an incompatible environment.

Third-party components

A full installation of BUSCO requires Python 3.3+ (2.7 is not supported from v4 onwards), BioPython, tBLASTn 2.2+, Augustus 3.2, Prodigal, Metaeuk, HMMER 3.1+, SEPP, and R + ggplot2 for the plotting companion script. Some of these tools are necessary only for analysing certain types of organisms and input data, or for specific run modes.

Please make sure that each software package listed above works INDEPENDENTLY of BUSCO before attempting to run any BUSCO assessments.

Relevant parameters given to BUSCO are propagated to third-party tools (e.g. the number of CPUs). Their default configuration should not be changed for BUSCO runs.

For Augustus, the option --augustus_parameters (see below) allows advanced users to freely pass parameters. Use it only to fix biologically relevant parameters such as the translation table and mention these when reporting the result. Do not use it for selecting the Augustus species, as there is a dedicated BUSCO parameter, and do not use it to specify the number of cpus.


Augustus uses several executables and Perl scripts. Please refer to the Augustus documentation for Perl requirements.

In addition to the entries in the config.ini file, Augustus requires environment variables to be declared as follows:

export PATH="/path/to/AUGUSTUS/augustus-3.2.3/bin:$PATH"
export PATH="/path/to/AUGUSTUS/augustus-3.2.3/scripts:$PATH"
export AUGUSTUS_CONFIG_PATH="/path/to/AUGUSTUS/augustus-3.2.3/config/"

NB: you can use the printenv command to view all your environment settings.

Known bugs and unsupported versions

tBLASTn 2.4+

During development we recognised an issue with tBLASTn versions 2.4-2.10.0 when using more than one CPU. NCBI issued a fix for the problem in version 2.10.1+. Make sure you have at least version 2.10.1+ installed.
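The version range above can be checked programmatically. This is a minimal Python sketch, not part of BUSCO: the function names are hypothetical, and it assumes you have already obtained the version string (for example "2.10.1+") from your tBLASTn installation.

```python
def parse_version(text):
    """Turn a version string such as '2.10.1+' into a tuple of integers."""
    return tuple(int(part) for part in text.rstrip("+").split("."))


def is_affected_tblastn(version):
    """Return True if the version falls in the affected range 2.4 to 2.10.0."""
    parsed = parse_version(version)
    return (2, 4) <= parsed <= (2, 10, 0)
```

Comparing tuples of integers rather than raw strings matters here: as strings, "2.10.0" would sort before "2.4", which would give the wrong answer.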


BUSCO v4 introduces SEPP, which relies on pplacer. Our experience regarding the installation of this third-party software is limited. Please report any malfunctioning version.


BUSCO requires the version of Metaeuk installed directly from GitHub (link above). The available conda version is not compatible with the BUSCO pipeline. We have an open issue with the Metaeuk developers, so we hope to resolve this before the full v5 release.

BUSCO code and config file

git clone
cd busco/
Note: v4.1.4 is the latest stable release. To access v5.beta clone this repository and checkout the v5 branch with git checkout v5.beta
sudo python3


python3 install --user

NB: If you are running BUSCO in a Python virtual environment, the following should suffice:

python3 install

To guarantee that the correct version of every third-party component is identified, BUSCO employs a user-editable config file for defining required settings. In the config/ subfolder of the cloned repository, the config.ini file must be edited. In this file, you must declare the paths to all third party components matching what is on your machine.

The script

./scripts/ config/config.ini config/myconfig.ini

can do it for you if all executables are declared in $PATH

# for instance
path = /usr/local/bin/
command = hmmsearch

You have to set the environment variable BUSCO_CONFIG_FILE with the path to the file, for BUSCO to be able to locate it.

export BUSCO_CONFIG_FILE="/path/to/myconfig.ini"

See also an alternative


  • How to solve: ERROR Cannot write to Augustus species folder ...

This is because during genome mode assessments Augustus needs to write gene model prediction parameters to its own "config" directory, and if you do not have write access to this directory the analysis will fail. If Augustus is installed globally on your system and you do not have administrator rights there is a simple workaround that should work on most systems. Simply recursively copy the entire Augustus "config" directory to a location where you do have write access, and then set the AUGUSTUS_CONFIG_PATH variable to this location.

cp -r /path/to/AUGUSTUS/augustus-3.2.3/config /my/home/augustus/config
export AUGUSTUS_CONFIG_PATH="/my/home/augustus/config/"
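The same workaround can be scripted. The following minimal Python sketch is not part of BUSCO or Augustus; the function name use_private_augustus_config is hypothetical. It copies the config directory once and sets AUGUSTUS_CONFIG_PATH for the current process and any child process it launches.

```python
import os
import shutil


def use_private_augustus_config(source_config, target_config):
    """Copy the Augustus config directory and point AUGUSTUS_CONFIG_PATH at the copy.

    Hypothetical helper for illustration only.
    """
    # Copy only once, so retrained species parameters from earlier runs are kept.
    if not os.path.isdir(target_config):
        shutil.copytree(source_config, target_config)
    if not os.access(target_config, os.W_OK):
        raise PermissionError(f"no write access to {target_config}")
    # Joining with an empty string adds the trailing separator used in the example above.
    os.environ["AUGUSTUS_CONFIG_PATH"] = os.path.join(target_config, "")
    return os.environ["AUGUSTUS_CONFIG_PATH"]
```

Note that a variable set through os.environ does not persist after the Python process exits; for interactive shells the export line above remains the simpler option.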

For the rest, the BUSCO issues board is the best place to find a solution to your problem.

Running BUSCO

Mandatory arguments


Mandatory arguments unless provided in the config file

-i or --in defines the input file to analyse which is either a nucleotide fasta file or a protein fasta file, depending on the BUSCO mode.

-o or --out defines the folder that will contain all results, logs, and intermediate data

-m or --mode sets the assessment MODE: genome, proteins, transcriptome

-l or --lineage_dataset

It can be a dataset name, e.g. bacteria_odb10, or a path, e.g. ./bacteria_odb10 or /home/user/bacteria_odb10. In the former case, which is the recommended usage, BUSCO will automatically download and version the corresponding dataset. In the latter case, the dataset found in the given path will be used. The lineage can be omitted when running automated lineage selection.

Manage run parameters in config files

For more flexibility, instead of setting the config file location in BUSCO_CONFIG_FILE, you can specify the path to the config file with the --config option when running BUSCO.

busco --config /path/to/filename.ini

This is useful for switching between configurations or for managing the parameters of each run in a dedicated file.

All of the command line parameters have an equivalent in the config file under the section [busco_run], usually --param value corresponds to param=value, --param corresponds to param=True. Providing input parameters through the command line will override those defined in config.ini.
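The correspondence between command-line parameters and [busco_run] entries can be illustrated with Python's standard configparser module. This is a minimal sketch under the "usually" rule stated above, not BUSCO's own code: the function names are hypothetical, only long options (--param) are handled, and any parameter that deviates from the usual rule would need to be edited by hand.

```python
import configparser
import io


def args_to_config(args):
    """Convert a dict of long command-line options into [busco_run] entries.

    A value of None marks a bare flag, which becomes param=True.
    """
    config = configparser.ConfigParser()
    config["busco_run"] = {}
    for name, value in args.items():
        key = name.lstrip("-")
        config["busco_run"][key] = "True" if value is None else str(value)
    return config


def render(config):
    """Return the config as text, in the form it would be written to a file."""
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

For instance, render(args_to_config({"--in": "genome.fna", "--mode": "genome", "--offline": None})) yields a [busco_run] section with in, mode and offline entries, which could be merged into a copy of config.ini.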

Lineage datasets

BUSCO employs clade specific information to identify BUSCO genes in the analysed sequence. It can be specified by the user or selected automatically in the case of bacteria and archaea.

To print the full list: busco --list-datasets

They are organised in folders that should contain:

📁 hmms HMM file for each BUSCO
📁 info Files with lists of species, genes, ortho-groups, and annotations
📁 prfl Block profile file for each BUSCO
📎 ancestral FASTA file, consensus ancestral sequences for each BUSCO
📎 ancestral_variants FASTA file, consensus & variant sequences for each BUSCO
📎 dataset.cfg Configuration data including default Augustus species
📎 lengths_cutoff Length cut-offs for complete BUSCO matches
📎 scores_cutoff Score cut-offs for orthologous BUSCO matches
📎 links_to_ODB10.txt Annotations and links to OrthoDB for each gene
📎 missing_in_parasitic.txt List of genes present in lineages containing clades with reduced parasitic genomes (e.g. fungi_odb10)


The execution of the BUSCO assessment pipeline will create a directory named after your assigned name for the run (set with the -o OUTPUT_NAME mandatory option).

This directory will contain several files and directories


📁 run_lineage_name This is the main result folder corresponding to the final evaluation.
📎 short_summary.*.txt Contains a plain text summary of the results in BUSCO notation.
📎 full_table.tsv Contains the complete results in a tabular format with scores and lengths of BUSCO matches, and coordinates (for genome mode) or gene/protein IDs (for transcriptome or protein mode).
📎 missing_busco_list.tsv Contains a list of missing BUSCOs.
📁 busco_sequences FASTA format file for each BUSCO gene identified. .faa files contain protein sequences .fna files contain coding sequences.
📁 hmmer_output Tabular format HMMER output of searches with BUSCO HMMs.
📁 auto_lineage Result folders produced during the automated lineage selection process that were not retained. They represent either wrong or less specific choices of lineage
📁 logs Contains a detailed busco.log file (with DEBUG notifications) and the stderr and stdout of each third party software
📁 prodigal_output Results of the Prodigal gene predictor, shared by all non-eukaryotic runs in the analysis
📁 predicted genes A nucleotide and protein file for each predicted gene
📁 metaeuk_output Results of the Metaeuk gene predictor, for the eukaryotic genome runs
📁 predicted_genes Augustus raw gene output
📁 extracted_proteins Augustus protein FASTA output
📁 retraining_parameters BUSCO retraining. Specific to your species.
📁 gb GenBank format complete BUSCOs before retraining
📁 gff General Feature Format complete BUSCOs before retraining
📎 training_set.db Genes used for Augustus retraining
📁 blast_output Results of the tBLASTn alignment tool, for the eukaryotic genome runs and transcriptome mode
📎 tblastn.tsv tabular tBLASTn results
📎 coordinates.tsv locations of BUSCO matches (eukaryotic genome)
📎 tblastn_missing_and_frag_rerun.tsv tabular tBLASTn results during the 2nd phase (eukaryotic genome)
📎 coordinates_missing_and_frag_rerun.tsv locations of BUSCO matches during the 2nd phase (eukaryotic genome)
📁 db Blast database
📁 sequences Sequences having blast results
📁 translated_sequences Six-frame translations of each transcript made by the transcriptome mode. These are naive translations that ignore start and stop codons; they exist only so that hmmsearch can be applied and do not represent proteins

Genome mode: assessing a genome assembly

Requires tBLASTn, Prodigal (for non-eukaryotes) or Metaeuk (for eukaryotes), HMMER. Metaeuk can be replaced with Augustus if you enter the --augustus command line flag.

busco -m genome -i INPUT.nucleotides -o OUTPUT -l LINEAGE

Protein mode: assessing a gene set

Requires HMMER

busco -m proteins -i INPUT.amino_acids -o OUTPUT -l LINEAGE

Transcriptome mode: assessing assembled transcripts

Requires tBLASTn, HMMER

busco -m transcriptome -i INPUT.nucleotides -o OUTPUT -l LINEAGE

Automated lineage selection

Requires SEPP

busco -m MODE -i INPUT -o OUTPUT --auto-lineage

or ignoring eukaryotes to save runtime, if compatible with your experimental goal.

busco -m MODE -i INPUT -o OUTPUT --auto-lineage-prok

or ignoring non-eukaryotes to save runtime, if compatible with your experimental goal.

busco -m MODE -i INPUT -o OUTPUT --auto-lineage-euk

Download and automated update

BUSCO can obtain the latest version of the lineage datasets. If the name of a dataset is passed, e.g. -l bacteria_odb10, BUSCO will download it automatically. If a full path is given to BUSCO using -l /my/own/path/bacteria_odb10, this automated management will be disabled. The files used during the automated lineage selection process are also obtained automatically by BUSCO.

By default, if a new version of a file is available, BUSCO will warn you. If you pass the argument --update-data, BUSCO will replace the current file or folder with the up-to-date version and archive the previous one.


If you are running BUSCO in an environment without Internet access, you can pass the --offline argument to prevent BUSCO from attempting any download. You will then have to download and unpack all files manually and place them in the BUSCO download folder (whose location is defined in the config.ini file).

A valid download folder looks as follows:

|-- information
|   `-- lineages_list.2019-11-27..txt
|-- lineages
|   |-- acidobacteria_odb10
|   |-- actinobacteria_class_odb10
|   |-- actinobacteria_phylum_odb10
|   |-- actinopterygii_odb10
|   |-- tissierellia_odb10
|   |-- tremellomycetes_odb10
|   |-- vibrionales_odb10
|   `-- xanthomonadales_odb10
`-- placement_files
    |-- list_of_reference_markers.archaea_odb10.2019-12-16..txt
    |-- mapping_taxid-lineage.archaea_odb10.2019-12-16..txt
    |-- mapping_taxids-busco_dataset_name.archaea_odb10.2019-12-16..txt
    |-- supermatrix.aln.archaea_odb10.2019-12-16..faa
    |-- tree.archaea_odb10.2019-12-16..nwk
    `-- tree_metadata.archaea_odb10.2019-12-16..txt
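Before an --offline run it can be useful to confirm that a manually prepared download folder has the top-level layout shown above. The following is a minimal Python sketch, not part of BUSCO: the function name is hypothetical, and it checks only for the three top-level subfolders from the listing, not for the individual files inside them.

```python
import os

# Top-level subfolders taken from the listing above.
EXPECTED_SUBFOLDERS = ("information", "lineages", "placement_files")


def missing_download_subfolders(download_path):
    """Return the expected subfolders that are absent from download_path."""
    return [
        name for name in EXPECTED_SUBFOLDERS
        if not os.path.isdir(os.path.join(download_path, name))
    ]
```

An empty list means the top-level layout matches; it does not guarantee that the datasets themselves are complete or up to date.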

Additional options

busco -h will display other useful arguments that can also be declared under the [busco_run] section of the config file.

  -i, --in              Input sequence file in FASTA format. Can be an assembled genome or transcriptome (DNA), or protein sequences from an annotated gene set.
  -o OUTPUT, --out OUTPUT
                        Give your analysis run a recognisable short name. Output folders and files will be labelled with this name. WARNING: do not provide a path
  -m MODE, --mode MODE  Specify which BUSCO analysis mode to run.
                        There are three valid modes:
                        - geno or genome, for genome assemblies (DNA)
                        - tran or transcriptome, for transcriptome assemblies (DNA)
                        - prot or proteins, for annotated gene sets (protein)
  -l LINEAGE, --lineage_dataset LINEAGE
                        Specify the name of the BUSCO lineage to be used.
  --auto-lineage        Run auto-lineage to find optimum lineage path
  --auto-lineage-prok   Run auto-lineage just on non-eukaryote trees to find optimum lineage path
  --auto-lineage-euk    Run auto-placement just on eukaryote tree to find optimum lineage path
  -c N, --cpu N         Specify the number (N=integer) of threads/cores to use.
  -f, --force           Force rewriting of existing files. Must be used when output files with the provided name already exist.
  -r, --restart         Continue a run that had already partially completed.
  -q, --quiet           Disable the info logs, displays only errors
  --out_path OUTPUT_PATH
                        Optional location for results folder, excluding results folder name. Default is current working directory.
  --download_path DOWNLOAD_PATH
                        Specify local filepath for storing BUSCO dataset downloads
  --datasets_version DATASETS_VERSION
                        Specify the version of BUSCO datasets, e.g. odb10
  --download_base_url DOWNLOAD_BASE_URL
                        Set the url to the remote BUSCO dataset location
  --update-data         Download and replace with last versions all lineages datasets and files necessary to their automated selection
  --offline             To indicate that BUSCO cannot attempt to download files
  --metaeuk_parameters METAEUK_PARAMETERS
                        Pass additional arguments to Metaeuk for the first run. All arguments should be contained within a single pair of quotation marks, separated by commas. E.g. "--param1=1,--param2=2"
  --metaeuk_rerun_parameters METAEUK_RERUN_PARAMETERS
                        Pass additional arguments to Metaeuk for the second run. All arguments should be contained within a single pair of quotation marks, separated by commas. E.g. "--param1=1,--param2=2"
  -e N, --evalue N      E-value cutoff for BLAST searches. Allowed formats, 0.001 or 1e-03 (Default: 1e-03)
  --limit REGION_LIMIT  How many candidate regions (contig or transcript) to consider per BUSCO (default: 3)
  --augustus            Use augustus gene predictor for eukaryote runs
  --augustus_parameters AUGUSTUS_PARAMETERS
                        Pass additional arguments to Augustus. All arguments should be contained within a single pair of quotation marks, separated by commas. E.g. "--param1=1,--param2=2"
  --augustus_species AUGUSTUS_SPECIES
                        Specify a species for Augustus training.
  --long                Optimization Augustus self-training mode (Default: Off); adds considerably to the run time, but can improve results for some non-model organisms
  --config CONFIG_FILE  Provide a config file
  -v, --version         Show this version and exit
  -h, --help            Show this help message and exit
  --list-datasets       Print the list of available BUSCO datasets

An example of `--metaeuk_parameters` use


Interpreting the results

BUSCO attempts to provide a quantitative assessment of the completeness in terms of expected gene content of a genome assembly, transcriptome, or annotated gene set. The results are simplified into categories of Complete and single-copy, Complete and duplicated, Fragmented, or Missing BUSCOs.

BUSCO completeness results make sense only in the context of the biology of your organism. You have to understand whether missing or duplicated genes are of biological or technical origin. For instance, a high level of duplication may be explained by a recent whole-genome duplication event (biological) or a chimeric assembly of haplotypes (technical). Transcriptomes and protein sets that are not filtered for isoforms will lead to a high proportion of duplicates, so you should filter them before a BUSCO analysis. Finally, a transcriptomic experiment that focuses on specific tissues, life stages, or conditions is unlikely to produce a BUSCO-complete transcriptome. In this case, consistency across your samples is what you should aim for.

If you need help or suggestions, use the BUSCO issues board to exchange with other users.


If found to be complete, whether single-copy or duplicated, the BUSCO matches have scored within the expected range of scores and within the expected range of length alignments to the BUSCO profile. If in fact an ortholog is not present in the input dataset, or the ortholog is only partially present (highly fragmented), and a high-identity full-length homolog is present, it is possible that this homolog could be mistakenly identified as the complete BUSCO. The score thresholds are optimised to minimise this possibility, but it can still occur.


If found to be fragmented, the BUSCO matches have scored within the range of scores but not within the range of length alignments to the BUSCO profile. For transcriptomes or annotated gene sets this indicates incomplete transcripts or gene models. For genome assemblies this could indicate either that the gene is only partially present or that the sequence search and gene prediction steps failed to produce a full-length gene model even though the full gene could indeed be present in the assembly. When running eukaryotic datasets, matches that produce such fragmented results are given a "second chance" with a second round of sequence searches and gene predictions with parameters trained on those BUSCOs that were found to be complete, but this can still fail to recover the whole gene. Some fragmented BUSCOs from genome assembly assessments could therefore be complete but are just too divergent or have very complex gene structures, making them very hard to locate and predict in full.


If found to be missing, there were either no significant matches at all, or the BUSCO matches scored below the range of scores for the BUSCO profile. For transcriptomes or annotated gene sets this indicates that these orthologs are indeed missing or the transcripts or gene models are so incomplete/fragmented that they could not even meet the criteria to be considered as fragmented. For genome assemblies this could indicate either that these orthologs are indeed missing, or that the sequence search step failed to identify any significant matches, or that the gene prediction step failed to produce even a partial gene model that might have been recognised as a fragmented BUSCO match. Like for fragments, when running eukaryotic datasets, BUSCOs missing after the first round are given a "second chance" with a second round of sequence searches and gene predictions with parameters trained on those BUSCOs that are complete, but this can still fail to recover the gene. Some missing BUSCOs from genome assembly assessments could therefore be partially present, and even possibly (but unlikely) complete, but they are just too divergent or have very complex gene structures, making them very hard to locate and predict correctly or even partially.

Automated selection: matches in multiple domains and contamination.

The automated lineage selection process runs BUSCO on the generic lineage datasets for the domains archaea, bacteria and eukaryota. Once the optimal domain is selected, BUSCO automatically attempts to find the most appropriate BUSCO dataset to use based on phylogenetic placement. BUSCO evaluations are valid when an appropriate dataset is used, i.e., the dataset belongs to the lineage of the species to test. Because of overlapping markers and spurious matches among domains, BUSCO matches in another domain do not necessarily mean that your genome/proteome contains sequences from this domain. However, a high BUSCO score in multiple domains might help you identify possible contamination.

Best practices

Some common sense advice on how to run BUSCO assessments, as well as on how to report BUSCO findings in publications etc. to make sure they are both interpretable and reproducible.

Running BUSCO

  • BUSCO has been tested with Augustus 3.2.x. For now, we strongly recommend using version 3.2.3.
  • Generally the lineage to select for your assessments should be the most specific lineage available, e.g. for assessing fish data one would select the actinopterygii lineage rather than the metazoa lineage.

A full list of available BUSCO datasets can be obtained by entering busco --list-datasets

  • If you are assessing a large number of species/strains/versions etc. then to minimise runtime (at the expense of resolution) one might select a less specific lineage set with fewer BUSCOs, e.g. for assessing 20 bird genomes each with a couple of different assembly versions one might select the vertebrata or the metazoa lineages rather than the aves lineage, at least for the initial rounds of assessments.
  • Assessments generally produce several folders with lots of files. These are for your benefit, so that you can examine individual cases in more detail and/or use the data for downstream analyses. Once you are done with them it would be a good idea to compress/tarball them for archiving. If you are running many assessments it might be a good idea to compress/tarball the results folders that contain many files as each run finishes using the -z, --tarzip option (this has not yet been re-implemented in the 4.1.0 release).
  • Please take some time to check the log files. They are there to highlight potential problems that may have occurred during your BUSCO run.
  • Compare the results from assessing your data with like-for-like assessments of corresponding publicly available data for other closely-related species. In this way, the BUSCO results can be used to claim that your dataset is as good as, or better than, existing publicly available datasets for similar species.
  • If manual curation of annotated gene sets was performed, report BUSCO results before and after curation to quantify improvements.

Reporting BUSCO

  • Report results in simple BUSCO notation: C:89.0%[S:85.8%,D:3.2%],F:6.9%,M:4.1%,n:3023
  • Use the plotting companion script (see below) to produce simple graphical summaries (that are easily customisable) for your publication’s supporting online information.
  • Report the versions you used for all third-party components. We highly recommend using the BUSCO container, whose version is sufficient to safely reproduce a run.
  • Report the BUSCO set(s) you used for your assessments. Mention the creation date of the dataset, not only the name, e.g. archaea_odb10 (2019-01-04).
  • Report the BUSCO options you used.
  • Report the version(s) of the genome assembly, annotated gene set, or transcriptome that you assessed.
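When collecting results from many runs, the BUSCO notation shown in the first item above can be parsed into numbers. This is a minimal Python sketch, not part of BUSCO: the function name is hypothetical, and the regular expression assumes the exact layout of the example string (C, S, D, F, M as percentages, then n as an integer).

```python
import re

# Pattern modelled on the example C:89.0%[S:85.8%,D:3.2%],F:6.9%,M:4.1%,n:3023
BUSCO_NOTATION = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_notation(text):
    """Extract the percentages and the total BUSCO count from a notation string."""
    match = BUSCO_NOTATION.search(text)
    if match is None:
        raise ValueError(f"no BUSCO notation found in: {text!r}")
    # Percentages become floats; the total number of BUSCOs becomes an int.
    result = {key: float(value) for key, value in match.groupdict().items() if key != "n"}
    result["n"] = int(match.group("n"))
    return result
```

A quick sanity check on parsed values is that S and D add up to C, and that C, F and M add up to roughly 100, allowing for rounding in the reported percentages.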

Companion scripts


The scripts/ script allows users to quickly view their BUSCO summary results in an easily understandable bar chart. It uses R and ggplot2 to summarise BUSCO runs for side-by-side comparisons. The script produces a PNG image (if both R and ggplot2 are available), as well as an R source code file that can be run on a different machine where both R and ggplot2 are available, or edited to fully customise the resulting bar chart (colours, labels, fonts, axes, etc.).


BUSCO plot generation tool.
Place all BUSCO short summary files (short_summary.[generic|specific].dataset.label.txt) in a single folder. It will be your working directory, in which the generated plot files will be written
See also the user guide for additional information

required arguments:
  -wd PATH, --working_directory PATH
                        Define the location of your working directory

optional arguments:
  -rt RUN_TYPE, --run_type RUN_TYPE
                        type of summary to use, `generic` or `specific`
  --no_r                To avoid to run R. It will just create the R script file in the working directory
  -q, --quiet           Disable the info logs, displays only errors
  -h, --help            Show this help message and exit

To run scripts/, first create a folder, e.g. mkdir BUSCO_summaries, and then copy the BUSCO short summary file from each of the runs you want to plot into this folder.

cp XX1/short_summary.*.lineage_odb10.XX1.txt BUSCO_summaries/.
cp XX2/short_summary.*.lineage_odb10.XX2.txt BUSCO_summaries/.
cp XX3/short_summary.*.lineage_odb10.XX3.txt BUSCO_summaries/.

Then simply run the script giving as argument the name (or full path if you are not in same working directory) of the folder you created containing the summaries you wish to plot.

python3 scripts/ -wd BUSCO_summaries
python3 scripts/ -wd /full/path/to/my/folder/BUSCO_summaries

The resulting PNG image and the corresponding R source code file will be produced in the same folder containing the BUSCO summaries. By default, the run name is used as the label for each plotted result, and this is automatically extracted from the short summary file name: so for short_summary.generic.lineage_odb10.XX1.txt the label would be XX1. You can modify this as long as you keep the naming convention: short_summary.generic.lineage_odb10.[edit_name_here].txt or you can simply edit the R source code file to change any plotting parameters and produce a personalised bar chart running the code manually in your R environment.
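The naming convention described above can be expressed as a pattern. The following minimal Python sketch is not taken from the plotting script itself: the function name summary_label is hypothetical, and the pattern assumes that the dataset name contains no dot, as in the examples.

```python
import os
import re

# short_summary.[generic|specific].dataset.label.txt
SUMMARY_NAME = re.compile(
    r"^short_summary\.(?P<run_type>generic|specific)"
    r"\.(?P<dataset>[^.]+)\.(?P<label>.+)\.txt$"
)


def summary_label(path):
    """Return the run label encoded in a BUSCO short summary file name."""
    match = SUMMARY_NAME.match(os.path.basename(path))
    if match is None:
        raise ValueError(f"not a BUSCO short summary file name: {path!r}")
    return match.group("label")
```

Checking file names this way before plotting can catch summaries that were renamed in a manner that breaks the convention.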

Example scripts/ bar chart:

mkdir my_summaries
cp SPEC1/short_summary.generic.lineage1_odb10.SPEC1.txt my_summaries/.
cp SPEC2/short_summary.generic.lineage2_odb10.SPEC2.txt my_summaries/.
cp SPEC3/short_summary.specific.lineage2_odb10.SPEC3.txt my_summaries/.
cp SPEC4/short_summary.generic.lineage3_odb10.SPEC4.txt my_summaries/.
cp SPEC5/short_summary.generic.lineage4_odb10.SPEC5.txt my_summaries/.
python3 scripts/ -wd my_summaries

[Figure: BUSCO plot]


The repository contains the script that was used to produce the phylogenomics portion of the BUSCO v3 (PMID: 29220515) paper. It has not been ported to BUSCO v4.

Discussion board and support

If you need help, first check the BUSCO issues board. You can also write to our support: support[at]



Can we compare different BUSCO lineages, are they nested?