Helmholtz Gemeinschaft

Tailoring sparse multivariable regression techniques for prognostic single-nucleotide polymorphism signatures

Item Type: Article
Title: Tailoring sparse multivariable regression techniques for prognostic single-nucleotide polymorphism signatures
Creators Name: Binder, H., Benner, A., Bullinger, L. and Schumacher, M.
Abstract: When seeking prognostic information for patients, modern technologies provide a huge amount of genomic measurements as a starting point. For single-nucleotide polymorphisms (SNPs), there may be more than one million covariates that need to be simultaneously considered with respect to a clinical endpoint. Although the underlying biological problem cannot be solved on the basis of clinical cohorts of only modest size, some important SNPs might still be identified. Sparse multivariable regression techniques have recently become available for automatically identifying prognostic molecular signatures that comprise relatively few covariates and provide reasonable prediction performance. For illustrating how such approaches can be adapted to the specific features of SNP data, we propose different variants of a component-wise likelihood-based boosting approach. The latter links SNP measurements to a time-to-event endpoint by a regression model that is built up in a large number of steps. The variants allow for strategic choices in dealing with SNPs that differ in variance because of their variation in minor allele frequencies. In addition, we propose a heuristic that allows computationally efficient handling of millions of covariates, thus opening the door for incorporating SNP × treatment interactions. We illustrate this using data from patients with acute myeloid leukemia. We judge the resulting models according to prediction error curves and using resampling data sets. We obtain increased stability by moving interpretation from the SNP to the gene level. By considering these different aspects, we outline a more general strategy for linking SNP measurements to a time-to-event endpoint by means of sparse multivariable regression models.
Keywords: Risk Prediction, Multivariable Model, Boosting, Single-Nucleotide Polymorphisms, Treatment, Interaction
Source: Statistics in Medicine
ISSN: 1097-0258
Publisher: Wiley
Volume: 32
Number: 10
Page Range: 1778-1791
Date: 10 May 2013
Official Publication: https://doi.org/10.1002/sim.5490