Helmholtz Gemeinschaft

Identification of metabolites from tandem mass spectra with a machine learning approach utilizing structural features

Item Type: Article
Title: Identification of metabolites from tandem mass spectra with a machine learning approach utilizing structural features
Creators Name: Li, Y., Kuhn, M., Gavin, A.C. and Bork, P.
Abstract: MOTIVATION: Untargeted mass spectrometry is a powerful method for detecting metabolites in biological samples. However, fast and accurate identification of the metabolites' structures from MS/MS spectra is still a great challenge. RESULTS: We present a new analysis method, called SF-Matching, that is based on the hypothesis that molecules with similar structural features will exhibit similar fragmentation patterns. We combine information on fragmentation patterns of molecules with shared substructures and then use random forest models to predict whether a given structure can yield a certain fragmentation pattern. These models can then be used to score candidate molecules for a given mass spectrum. For rapid identification, we pre-compute such scores for common biological molecular structure databases. Using benchmarking datasets, we find that our method has similar performance to CSI:FingerID and that very high accuracies can be achieved by combining our method with CSI:FingerID. Rarefaction analysis of the training dataset shows that the performance of our method will increase as more experimental data become available. AVAILABILITY: SF-Matching is available from http://www.bork.embl.de/Docu/sf_matching. CONTACT: mkuhn@embl.de (M.K.), bork@embl.de (P.B.)
Keywords: Chemical Databases, Factual Databases, Machine Learning, Metabolomics, Tandem Mass Spectrometry
Source: Bioinformatics
ISSN: 1367-4803
Publisher: Oxford University Press
Volume: 36
Number: 4
Page Range: 1213-1218
Date: 15 February 2020
Official Publication: https://doi.org/10.1093/bioinformatics/btz736
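
Illustrative sketch (not part of the record): the candidate-scoring step described in the abstract could look roughly like the following. This is a toy under stated assumptions, not the SF-Matching implementation. The per-pattern models here are hypothetical stand-in functions rather than trained random forests, the pattern and feature names are invented, and combining predictions by summing log-probabilities is an assumed scoring rule that the abstract does not specify.

```python
import math

def score_candidate(features, observed_patterns, models):
    """Sum log-probabilities that a candidate yields each observed pattern.

    Assumption: this combination rule is illustrative only.
    """
    total = 0.0
    for pattern in observed_patterns:
        model = models.get(pattern)
        if model is None:
            continue  # no model for this pattern: skip it
        probability = model(features)
        total += math.log(max(probability, 1e-6))  # floor avoids log(0)
    return total

def rank_candidates(candidates, observed_patterns, models):
    """Return candidate names ordered from best to worst score."""
    scored = [
        (score_candidate(features, observed_patterns, models), name)
        for name, features in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]

# Hypothetical stand-ins for trained per-pattern classifiers.
models = {
    "loss_H2O": lambda f: 0.9 if "hydroxyl" in f else 0.1,
    "loss_CO2": lambda f: 0.8 if "carboxyl" in f else 0.05,
}

# Hypothetical candidate structures, each reduced to a set of structural features.
candidates = {
    "candidate_A": {"hydroxyl", "carboxyl"},
    "candidate_B": {"hydroxyl"},
    "candidate_C": {"amine"},
}

observed = ["loss_H2O", "loss_CO2"]
ranking = rank_candidates(candidates, observed, models)
print(ranking)
```

Because the score depends only on the candidate and the pattern models, scores of this kind could in principle be computed ahead of time for a whole structure database, which is the pre-computation idea the abstract mentions.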
