    espace - Curtin’s institutional repository


    CodingQuarry: Highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts

    226484_155371_codingquarry.pdf (698.9Kb)
    Access Status
    Open access
    Authors
    Testa, Alison
    Hane, James
    Ellwood, Simon
    Oliver, Richard
    Date
    2015
    Type
    Journal Article
    
    Citation
    Testa, A. and Hane, J. and Ellwood, S. and Oliver, R. 2015. CodingQuarry: Highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts. BMC Genomics. 16 (170).
    Source Title
    BMC Genomics
    DOI
    10.1186/s12864-015-1344-4
    ISSN
    1471-2164
    School
    Department of Environment and Agriculture
    Remarks

    This open access article is distributed under the Creative Commons license http://creativecommons.org/licenses/by/4.0/

    URI
    http://hdl.handle.net/20.500.11937/38071
    Collection
    • Curtin Research Publications
    Abstract

    Background: The impact of gene annotation quality on functional and comparative genomics makes gene prediction an important process, particularly in non-model species, including many fungi. Sets of homologous protein sequences are rarely complete with respect to the fungal species of interest and are often small or unreliable, especially when closely related species have not been sequenced or annotated in detail. In these cases, protein homology-based evidence fails to correctly annotate many genes, or significantly improve ab initio predictions. Generalised hidden Markov models (GHMM) have proven to be invaluable tools in gene annotation and, recently, RNA-seq has emerged as a cost-effective means to significantly improve the quality of automated gene annotation. As these methods do not require sets of homologous proteins, improving gene prediction from these resources is of benefit to fungal researchers. While many pipelines now incorporate RNA-seq data in training GHMMs, there has been relatively little investigation into additionally combining RNA-seq data at the point of prediction, and room for improvement in this area motivates this study.

    Results: CodingQuarry is a highly accurate, self-training GHMM fungal gene predictor designed to work with assembled, aligned RNA-seq transcripts. RNA-seq data informs annotations both during gene-model training and in prediction. Our approach capitalises on the high quality of fungal transcript assemblies by incorporating predictions made directly from transcript sequences. Correct predictions are made despite transcript assembly problems, including those caused by overlap between the transcripts of adjacent gene loci. Stringent benchmarking against high-confidence annotation subsets showed CodingQuarry predicted 91.3% of Schizosaccharomyces pombe genes and 90.4% of Saccharomyces cerevisiae genes perfectly. These results are 4-5% better than those of AUGUSTUS, the next best performing RNA-seq driven gene predictor tested. Comparisons against whole genome Sc. pombe and S. cerevisiae annotations further substantiate a 4-5% improvement in the number of correctly predicted genes.

    Conclusions: We demonstrate the success of a novel method of incorporating RNA-seq data into GHMM fungal gene prediction. This shows that a high quality annotation can be achieved without relying on protein homology or a training set of genes. CodingQuarry is freely available (https://sourceforge.net/projects/codingquarry/), and suitable for incorporation into genome annotation pipelines.

    Related items

    Showing items related by title, author, creator and subject.

    • Deep proteogenomics; high throughput gene validation by multidimensional liquid chromatography and mass spectrometry of proteins from the fungal wheat pathogen Stagonospora nodorum
      Bringans, S.; Hane, J.; Casey, T.; Tan, Kar-Chun; Lipscombe, R.; Solomon, P.; Oliver, Richard (2009)
      Background: Stagonospora nodorum, a fungal ascomycete in the class dothideomycetes, is a damaging pathogen of wheat. It is a model for necrotrophic fungi that cause necrotic symptoms via the interaction of multiple effector ...
    • Complete genome sequence of Sporisorium scitamineum and biotrophic interaction transcriptome with sugarcane
      Taniguti, L.; Schaker, P.; Benevenuto, J.; Peters, L.; Carvalho, G.; Palhares, A.; Quecine, M.; Nunes, F.; Kmit, M.; Wai, A.; Hausner, G.; Aitken, K.; Berkman, P.; Fraser, J.; Moolhuijzen, Paula; Coutinho, L.; Creste, S.; Vieira, M.; Kitajima, J.; Monteiro-Vitorello, C. (2015)
      Sporisorium scitamineum is a biotrophic fungus responsible for the sugarcane smut, a worldwide spread disease. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere ...
