Helmholtz Gemeinschaft

InParanoid 7: new algorithms and tools for eukaryotic orthology analysis

Item Type: Article
Title: InParanoid 7: new algorithms and tools for eukaryotic orthology analysis
Creators Name: Östlund, G. and Schmitt, T. and Forslund, K. and Köstler, T. and Messina, D.N. and Roopra, S. and Frings, O. and Sonnhammer, E.L.L.
Abstract: The InParanoid project gathers proteomes of completely sequenced eukaryotic species plus Escherichia coli and calculates pairwise ortholog relationships among them. The new release 7.0 of the database has grown by an order of magnitude over the previous version and now includes 100 species and their collective 1.3 million proteins organized into 42.7 million pairwise ortholog groups. The InParanoid algorithm itself has been revised and is now both more specific and sensitive. Based on results from our recent benchmarking of low-complexity filters in homology assignment, a two-pass BLAST approach was developed that makes use of high-precision compositional score matrix adjustment, but avoids the alignment truncation that sometimes follows. We have also updated the InParanoid web site (http://InParanoid.sbc.su.se). Several features have been added, the response times have been improved and the site now sports a new, clearer look. As the number of ortholog databases has grown, it has become difficult to compare among these resources due to a lack of standardized source data and incompatible representations of ortholog relationships. To facilitate data exchange and comparisons among ortholog databases, we have developed and are making available two XML schemas: SeqXML for the input sequences and OrthoXML for the output ortholog clusters.
Keywords: Algorithms, Bacterial Genome, Cluster Analysis, Computational Biology, Escherichia coli, Eukaryotic Cells, Genetic Databases, Information Storage and Retrieval, Internet, Nucleic Acid Databases, Proteins, Proteomics, Software, Tertiary Protein Structure, Animals
Source: Nucleic Acids Research
ISSN: 0305-1048
Publisher: Oxford University Press
Volume: 38
Number: Suppl 1
Page Range: D196-D203
Date: 1 January 2010
Official Publication: https://doi.org/10.1093/nar/gkp931

Open Access
MDC Library