Document Type
Journal Article
Department/Unit
Department of Mathematics
Language
English
Abstract
Varying-coefficient models are widely used to model nonparametric interactions and have recently been adopted to analyze longitudinal data measured repeatedly over time. In this article, we focus on high-dimensional longitudinal observations. A novel two-step sparse boosting approach is proposed to carry out variable selection and model-based prediction. As a new machine learning tool, boosting provides seamless integration of model estimation and variable selection for complicated regression functions. Specifically, the first step applies the sparse boosting technique under an independence assumption to obtain an initial estimate of the correlation structure, and the second step incorporates the estimated correlation structure into the loss function of the sparse boosting algorithm. Extensive numerical examples illustrate the advantage of the two-step sparse boosting method. An application to yeast cell cycle gene expression data is further provided to demonstrate the proposed methodology.
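The two-step idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it simplifies heavily: plain componentwise L2 boosting with a fixed number of iterations stands in for sparse boosting with a minimum description length criterion, the coefficients are constant rather than varying, the working correlation is assumed exchangeable, and the design is assumed balanced with rows ordered subject by subject. All function names, parameters, and the simulated data are hypothetical.

```python
import numpy as np

def componentwise_l2_boost(X, y, n_iter=200, step=0.1):
    """Componentwise L2 boosting (simplified stand-in for sparse boosting)."""
    beta = np.zeros(X.shape[1])
    resid = np.asarray(y, dtype=float).copy()
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        coef = X.T @ resid / col_ss              # least-squares fit per column
        j = int(np.argmax(coef ** 2 * col_ss))   # column with largest RSS drop
        beta[j] += step * coef[j]                # small step on that column only
        resid -= step * coef[j] * X[:, j]
    return beta

def exchangeable_rho(resid):
    """Moment estimate of a common within-subject correlation (assumed form)."""
    n, T = resid.shape
    r = resid / np.sqrt(np.mean(resid ** 2))
    # sum over j != k of r_ij * r_ik, accumulated over subjects
    cross = (r.sum(axis=1) ** 2 - (r ** 2).sum(axis=1)).sum()
    rho = cross / (n * T * (T - 1))
    # keep the working correlation matrix positive definite
    return float(np.clip(rho, -1.0 / (T - 1) + 1e-3, 1.0 - 1e-3))

def two_step_boost(X, y, n_subjects, n_times, n_iter=200, step=0.1):
    p = X.shape[1]
    # Step 1: boost as if all observations were independent.
    beta1 = componentwise_l2_boost(X, y, n_iter, step)
    resid = (y - X @ beta1).reshape(n_subjects, n_times)
    rho = exchangeable_rho(resid)
    # Step 2: put the estimated correlation into the loss by whitening each
    # subject's block with L^{-1}, where R = L L^T, then boost again.
    R = (1 - rho) * np.eye(n_times) + rho * np.ones((n_times, n_times))
    L_inv = np.linalg.inv(np.linalg.cholesky(R))
    Xw = np.einsum("st,itp->isp", L_inv, X.reshape(n_subjects, n_times, p))
    yw = y.reshape(n_subjects, n_times) @ L_inv.T
    beta2 = componentwise_l2_boost(Xw.reshape(-1, p), yw.reshape(-1), n_iter, step)
    return beta1, rho, beta2

# Hypothetical simulated data: 60 subjects, 5 time points, 50 covariates,
# two of which are active; a shared subject effect induces correlation.
rng = np.random.default_rng(0)
n_subjects, n_times, p = 60, 5, 50
X = rng.standard_normal((n_subjects * n_times, p))
true_beta = np.zeros(p)
true_beta[:2] = [2.0, -1.5]
subject_effect = np.repeat(rng.standard_normal(n_subjects), n_times)
y = X @ true_beta + subject_effect + rng.standard_normal(n_subjects * n_times)

beta1, rho, beta2 = two_step_boost(X, y, n_subjects, n_times)
```

Whitening with the Cholesky factor turns the correlation-weighted squared loss into an ordinary squared loss on transformed data, so the same boosting routine can be reused in the second step. This is one convenient way to incorporate a correlation structure and is an assumption of the sketch, not a detail stated in the abstract.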
Keywords
Sparse boosting, Variable selection, Longitudinal data, Varying-coefficient model, Minimum description length
Publication Date
3-2019
Source Publication Title
Computational Statistics and Data Analysis
Volume
131
Start Page
222
End Page
234
Publisher
Elsevier
DOI
10.1016/j.csda.2018.10.002
Link to Publisher's Edition
https://doi.org/10.1016/j.csda.2018.10.002
ISSN (print)
0167-9473
ISSN (electronic)
1872-7352
APA Citation
Yue, M., Li, J., & Cheng, M. (2019). Two-step sparse boosting for high-dimensional longitudinal data with varying coefficients. Computational Statistics and Data Analysis, 131, 222-234. https://doi.org/10.1016/j.csda.2018.10.002