
Functional linear regression that’s interpretable. (English) Zbl 1171.62041

Summary: Regression models relating a scalar \(Y\) to a functional predictor \(X(t)\) are becoming increasingly common. Work in this area has concentrated on estimating a coefficient function, \(\beta (t)\), with \(Y\) related to \(X(t)\) through \(\int \beta (t)X(t) \,dt\). Regions where \(\beta (t)\neq 0\) correspond to places where there is a relationship between \(X(t)\) and \(Y\); conversely, points where \(\beta (t)=0\) indicate no relationship. Hence, for interpretation purposes, it is desirable for a regression procedure to be capable of producing estimates of \(\beta (t)\) that are exactly zero over regions with no apparent relationship and have simple structures over the remaining regions.
Unfortunately, most fitting procedures result in an estimate for \(\beta (t)\) that is rarely exactly zero and has unnatural wiggles, making the curves hard to interpret. We introduce a new approach which uses variable selection ideas, applied to various derivatives of \(\beta (t)\), to produce estimates that are interpretable, flexible, and accurate. We call our method “Functional Linear Regression That’s Interpretable” (FLiRTI) and demonstrate it on simulated and real-world data sets. In addition, non-asymptotic theoretical bounds on the estimation error are presented. The bounds provide strong theoretical motivation for our approach.
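The core idea in the summary — discretize the integral \(\int \beta (t)X(t)\,dt\) on a grid and use an \(\ell_1\)-type penalty so that the estimated coefficient function is exactly zero over inactive regions — can be sketched as follows. This is a minimal illustration applying a plain lasso to the grid values of \(\beta(t)\), not the authors' full FLiRTI procedure (which penalizes selected derivatives of \(\beta(t)\) via a Dantzig-selector-type formulation); the simulated curves, grid sizes, and tuning value below are invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Grid discretization: integral  ∫ beta(t) X_i(t) dt  ≈  sum_j beta_j X_ij dt
p, n = 100, 200
t = np.linspace(0.0, 1.0, p)
dt = t[1] - t[0]

# True coefficient: exactly zero outside (0.3, 0.6) -- the "no relationship" region
beta_true = np.where((t > 0.3) & (t < 0.6), np.sin((t - 0.3) * np.pi / 0.3), 0.0)

# Smooth random predictor curves X_i(t) (scaled random walks) and noisy responses
X = np.cumsum(rng.standard_normal((n, p)), axis=1) / np.sqrt(p)
y = X @ beta_true * dt + 0.05 * rng.standard_normal(n)


def lasso_cd(Z, y, lam, n_iter=500):
    """Plain coordinate-descent lasso: argmin_b 0.5 ||y - Z b||^2 + lam ||b||_1."""
    _, p_dim = Z.shape
    b = np.zeros(p_dim)
    col_sq = (Z ** 2).sum(axis=0)     # per-column squared norms
    r = y - Z @ b                     # current residual
    for _ in range(n_iter):
        for j in range(p_dim):
            r = r + Z[:, j] * b[j]    # remove coordinate j's contribution
            rho = Z[:, j] @ r
            # Soft-thresholding update: sets b[j] to exactly zero when |rho| <= lam
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r = r - Z[:, j] * b[j]    # restore residual
    return b


# The L1 penalty drives beta_hat to exactly zero over much of the grid,
# mimicking the interpretability goal (zero estimates where no relationship).
beta_hat = lasso_cd(X * dt, y, lam=0.02)
print((beta_hat == 0.0).mean())  # fraction of exactly-zero grid coefficients
```

Penalizing the grid values themselves yields zero regions but not smoothness; the paper's contribution is to place such sparsity penalties on derivatives of \(\beta(t)\) as well, so the nonzero regions also have simple (e.g., piecewise-linear) structure.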

MSC:

62J05 Linear regression; mixed models
62G08 Nonparametric regression and quantile regression
62J12 Generalized linear models (logistic models)
62J99 Linear inference, regression
65C60 Computational problems in statistics (MSC2010)
62P12 Applications of statistics to environmental and related topics

Software:

fda (R); Fahrmeir; PDCO

References:

[1] Candes, E. and Tao, T. (2005). Decoding by linear programming. IEEE Trans. Inform. Theory 51 4203-4215. · Zbl 1264.94121 · doi:10.1109/TIT.2005.858979
[2] Candes, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n (with discussion). Ann. Statist. 35 2313-2351. · Zbl 1139.62019 · doi:10.1214/009053606000001523
[3] Chen, S., Donoho, D. and Saunders, M. (1998). Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20 33-61. · Zbl 0919.94002 · doi:10.1137/S1064827596304010
[4] Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression (with discussion). Ann. Statist. 32 407-451. · Zbl 1091.62054 · doi:10.1214/009053604000000067
[5] Fahrmeir, L. and Tutz, G. (1994). Multivariate Statistical Modeling Based on Generalized Linear Models . Springer, New York. · Zbl 0809.62064
[6] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348-1360. · Zbl 1073.62547 · doi:10.1198/016214501753382273
[7] Fan, J. and Zhang, J. (2000). Two-step estimation of functional linear models with applications to longitudinal data. J. R. Stat. Soc. Ser. B Stat. Methodol. 62 303-322. · doi:10.1111/1467-9868.00233
[8] Faraway, J. (1997). Regression analysis for a functional response. Technometrics 39 254-261. · Zbl 0891.62027 · doi:10.2307/1271130
[9] Ferraty, F. and Vieu, P. (2002). The functional nonparametric model and application to spectrometric data. Comput. Statist. 17 545-564. · Zbl 1037.62032 · doi:10.1007/s001800200126
[10] Ferraty, F. and Vieu, P. (2003). Curves discrimination: a nonparametric functional approach. Comput. Statist. Data Anal. 44 161-173. · Zbl 1429.62241
[11] Hastie, T. and Mallows, C. (1993). Comment on “A statistical view of some chemometrics regression tools.” Technometrics 35 140-143. · Zbl 0775.62288 · doi:10.2307/1269656
[12] Hoover, D. R., Rice, J. A., Wu, C. O. and Yang, L. P. (1998). Nonparametric smoothing estimates of time-varying coefficient models with longitudinal data. Biometrika 85 809-822. · Zbl 0921.62045 · doi:10.1093/biomet/85.4.809
[13] James, G. M. (2002). Generalized linear models with functional predictors. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 411-432. · Zbl 1090.62070 · doi:10.1111/1467-9868.00342
[14] James, G. M. and Hastie, T. J. (2001). Functional linear discriminant analysis for irregularly sampled curves. J. R. Stat. Soc. Ser. B Stat. Methodol. 63 533-550. · Zbl 0989.62036 · doi:10.1111/1467-9868.00297
[15] James, G. M. and Radchenko, P. (2009). A generalized Dantzig selector with shrinkage tuning. Biometrika. · Zbl 1163.62054 · doi:10.1093/biomet/asp013
[16] James, G. M., Radchenko, P. and Lv, J. (2009). DASSO: Connections between the Dantzig selector and lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 71 127-142. · Zbl 1231.62129 · doi:10.1111/j.1467-9868.2008.00668.x
[17] James, G. M. and Silverman, B. W. (2005). Functional adaptive model estimation. J. Amer. Statist. Assoc. 100 565-576. · Zbl 1117.62364 · doi:10.1198/016214504000001556
[18] Liang, K. Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73 13-22. · Zbl 0595.62110 · doi:10.1093/biomet/73.1.13
[19] Lin, D. Y. and Ying, Z. (2001). Semiparametric and nonparametric regression analysis of longitudinal data. J. Amer. Statist. Assoc. 96 103-113. · Zbl 1015.62038 · doi:10.1198/016214501750333018
[20] Lu, Y. and Zhang, C. (2008). Spatially adaptive functional linear regression with functional smooth lasso.
[21] Müller, H.-G. and Stadtmüller, U. (2005). Generalized functional linear models. Ann. Statist. 33 774-805. · Zbl 1068.62048 · doi:10.1214/009053604000001156
[22] Radchenko, P. and James, G. M. (2008). Variable inclusion and shrinkage algorithms. J. Amer. Statist. Assoc. 103 1304-1315. · Zbl 1205.62100 · doi:10.1198/016214508000000481
[23] Ramsay, J. O. and Silverman, B. W. (2002). Applied Functional Data Analysis . Springer, New York. · Zbl 1011.62002 · doi:10.1007/b98886
[24] Ramsay, J. O. and Silverman, B. W. (2005). Functional Data Analysis , 2nd ed. Springer, New York. · Zbl 1079.62006
[25] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267-288. · Zbl 0850.62538
[26] Tibshirani, R., Saunders, M., Rosset, S. and Zhu, J. (2005). Sparsity and smoothness via the fused lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 91-108. · Zbl 1060.62049 · doi:10.1111/j.1467-9868.2005.00490.x
[27] Valdes-Sosa, P. A., Sanchez-Bornot, J. M., Lage-Castellanos, A., Vega-Hernandez, M., Bosch-Bayard, J., Melie-Garcia, L. and Canales-Rodriguez, E. (2005). Estimating brain functional connectivity with sparse multivariate autoregression. Philos. Trans. R. Soc. Ser. B 360 969-981.
[28] Wu, C. O., Chiang, C. T. and Hoover, D. R. (1998). Asymptotic confidence regions for kernel smoothing of a varying-coefficient model with longitudinal data. J. Amer. Statist. Assoc. 93 1388-1402. · Zbl 1064.62523 · doi:10.2307/2670054
[29] Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 301-320. · Zbl 1069.62054 · doi:10.1111/j.1467-9868.2005.00503.x