Assessing genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs


High-throughput genomics has revolutionized biological research. However, while the number of sequenced genomes grows by the day, quality assessment of the resulting assembled sequences remains complicated and mostly limited to technical measures like N50.
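For readers unfamiliar with N50, the following is a minimal sketch of how the statistic is usually computed: sort the contig (or scaffold) lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions, not part of the BUSCO software.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly: total length 300, so the halfway point is 150.
contig_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(contig_lengths))  # 80 + 70 = 150 reaches the halfway point, so N50 is 70
```

N50 describes contiguity only: an assembly can score well on it while still lacking expected genes, which is the gap a gene-content measure addresses.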

BUSCO provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB.

BUSCO assessments are implemented in open-source software, with comprehensive lineage-specific sets of Benchmarking Universal Single-Copy Orthologs for arthropods, vertebrates, metazoans, fungi, eukaryotes, and bacteria.

These conserved orthologs are ideal candidates for large-scale phylogenomics studies, and the annotated BUSCO gene models built during genome assessments provide a comprehensive gene predictor training set for use as part of genome annotation pipelines.

BUSCO assessments offer intuitive metrics, based on evolutionarily informed expectations of gene content from hundreds of species, to gauge completeness of rapidly accumulating genomic data and satisfy an Iberian's quest for quality - "Busco calidad/qualidade".
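As a rough illustration of how such completeness metrics can be summarized, the sketch below turns a per-ortholog classification into percentages. It assumes each expected ortholog has already been assigned to one of four categories (complete single-copy, complete duplicated, fragmented, missing); the category labels, the `summarize` function, and the ortholog IDs are hypothetical and are not the BUSCO program's own output format or API.

```python
from collections import Counter

# Category labels are assumptions made for this sketch.
CATEGORIES = ("complete_single", "complete_duplicated", "fragmented", "missing")


def summarize(statuses):
    """Return the percentage of expected orthologs in each category.

    statuses: dict mapping an ortholog ID to one of CATEGORIES.
    """
    total = len(statuses)
    if total == 0:
        raise ValueError("no orthologs to summarize")
    counts = Counter(statuses.values())
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return {cat: round(100.0 * counts[cat] / total, 1) for cat in CATEGORIES}


# Hypothetical results for ten expected orthologs.
example = {f"og{i}": "complete_single" for i in range(7)}
example.update({"og7": "complete_duplicated", "og8": "fragmented", "og9": "missing"})

for category, percent in summarize(example).items():
    print(f"{category}: {percent}%")
```

Reporting the result as percentages of the expected set makes assessments comparable across lineage sets that contain different numbers of orthologs.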

 

Software & User Guide

    Please consult the User Guide and README for installation requirements and full details on how to perform BUSCO assessments.


Citation

    BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs.
    Felipe A. Simão, Robert M. Waterhouse, Panagiotis Ioannidis, Evgenia V. Kriventseva, and Evgeny M. Zdobnov
    Bioinformatics, published online June 9, 2015. doi: 10.1093/bioinformatics/btv351
    Supplementary Online Materials: SOM


Datasets

    Download BUSCO profiles (tarzipped files) for:


Assessments

    BUSCO assessments of genome assemblies, gene sets, and transcriptomes for:


Links

    Zdobnov's Computational Evolutionary Genomics Group (CEGG) and the Hierarchical Catalog of Orthologs (OrthoDB)