Evaluation of annotation strategies using an entire genome sequence

Item Type: Article
Title: Evaluation of annotation strategies using an entire genome sequence
Creators Name: Iliopoulos, I. and Tsoka, S. and Andrade, M.A. and Enright, A.J. and Carroll, M. and Poullet, P. and Promponas, V. and Liakopoulos, T. and Palaios, G. and Pasquier, C. and Hamodrakas, S. and Tamames, J. and Yagnik, A.T. and Tramontano, A. and Devos, D. and Blaschke, C. and Valencia, A. and Brett, D. and Martin, D. and Leroy, C. and Rigoutsos, I. and Sander, C. and Ouzounis, C.A.
Abstract: Motivation: Genome-wide functional annotation either by manual or automatic means has raised considerable concerns regarding the accuracy of assignments and the reproducibility of methodologies. In addition, a performance evaluation of automated systems that attempt to tackle sequence analyses rapidly and reproducibly is generally missing. In order to quantify the accuracy and reproducibility of function assignments on a genome-wide scale, we have re-annotated the entire genome sequence of Chlamydia trachomatis (serovar D), in a collaborative manner. Results: We have encoded all annotations in a structured format to allow further comparison and data exchange and have used a scale that records the different levels of potential annotation errors according to their propensity to propagate in the database due to transitive function assignments. We conclude that genome annotation may entail a considerable amount of errors, ranging from simple typographical errors to complex sequence analysis problems. The most surprising result of this comparative study is that automatic systems might perform as well as the teams of experts annotating genome sequences.
Keywords: Amino Acid Sequence, Bacterial Genome, Bacterial Proteins, Chlamydia Trachomatis, Database Management Systems, Documentation, Gene Expression Profiling, Genetic Databases, Genome, Information Storage and Retrieval, Molecular Sequence Data, Protein Databases, Reproducibility of Results, Sensitivity and Specificity
Publisher: Oxford University Press
Page Range: 717-726
Date: 12 April 2003
Official Publication: https://doi.org/10.1093/bioinformatics/btg077