Regression and time series model selection. (English) Zbl 0907.62095

Singapore: World Scientific Publishing. xxii, 455 p. (1998).
The selection of an appropriate model from a large class of potential candidates plays a crucial role in regression analysis and in time series modelling. Important landmarks in this field are Akaike’s famous FPE and AIC criteria. In the last two decades many papers have been published that introduce new criteria and investigate the statistical properties of the proposed procedures. The theory has been developed within two main frameworks. In the first, it is assumed that the true model, of finite dimension, is included among the candidates, and the goal is to identify this true model correctly. Here a consistent criterion is preferred, that is, one which identifies the true model asymptotically with probability one. The second approach is based on the assumption that the true model (usually of infinite dimension) is not among the candidates. Then an asymptotically efficient criterion is used, which in large samples chooses the model with minimum mean squared error distribution.
The authors start with the univariate regression model and discuss the use of the signal-to-noise ratio as a descriptive statistic for evaluating model selection criteria. Some corrected variants and their asymptotic probabilities of overfitting are described. Then the univariate autoregressive model is analyzed in detail. The derivation of several popular criteria (AIC, AICc, AICu, FPE, FPEu, Cp, SIC, HQ, HQc) is sketched, and their small-sample properties as well as asymptotic results are derived. The next chapters are devoted to multivariate regression models and vector autoregressive models.
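To illustrate how criteria of this kind are used for autoregressive order selection, the following Python sketch fits AR(p) models by least squares and picks the order minimizing each criterion. It is not taken from the book. The formulas are commonly stated textbook forms (up to additive constants), and the book's own scalings, in particular for AR models, may differ. The simulated AR(2) series, the maximal order `P_MAX`, and the helper names `ar_design` and `criteria` are assumptions made only for this example.

```python
import numpy as np

P_MAX = 6  # largest candidate order (an assumption for this illustration)

def ar_design(y, p, p_max):
    """Lagged design matrix for AR(p), conditioning on the first p_max values
    so that every candidate order is fitted to the same responses."""
    n = len(y)
    X = np.column_stack([y[p_max - j : n - j] for j in range(1, p + 1)])
    return X, y[p_max:]

def criteria(rss, n, k):
    """Commonly stated forms, up to additive constants; k counts the
    regression coefficients and s2 is the ML variance estimate rss / n."""
    s2 = rss / n
    return {
        "AIC": n * np.log(s2) + 2 * (k + 1),
        "AICc": n * np.log(s2) + n * (n + k) / (n - k - 2),
        "SIC": n * np.log(s2) + k * np.log(n),
        "HQ": n * np.log(s2) + 2 * k * np.log(np.log(n)),
        "FPE": s2 * (n + k) / (n - k),
    }

# Simulate a zero-mean stationary AR(2) series (hypothetical test data).
rng = np.random.default_rng(0)
n_total, burn = 400, 100
e = rng.standard_normal(n_total + burn)
y = np.zeros(n_total + burn)
for t in range(2, n_total + burn):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + e[t]
y = y[burn:]  # drop the burn-in segment

rss_by_order = {}
scores = {}
for p in range(1, P_MAX + 1):
    X, target = ar_design(y, p, P_MAX)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    rss_by_order[p] = float(resid @ resid)
    scores[p] = criteria(rss_by_order[p], len(target), p)

# For each criterion, the selected order is the one with the smallest value.
selected = {name: min(scores, key=lambda p: scores[p][name]) for name in scores[1]}
print(selected)
```

Because all candidate orders share the same fit term and SIC penalizes each additional coefficient more heavily than AIC once \(\log n > 2\), the order chosen by SIC in this sketch can never exceed the order chosen by AIC.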
Other methods for choosing an appropriate model are based on cross-validation and the bootstrap. The cross-validation criterion CV(1) is asymptotically equivalent to the FPE and computationally much less demanding than bootstrapping, although the authors found that the number of bootstrap replications needed can be unexpectedly low. If the model does not satisfy the assumption of normality, robust techniques are considered. Simulations show that naive implementations of existing criteria still perform well. Some explicit results are derived for \(L_1\) regression, which corresponds to additive errors with a double exponential (Laplace) distribution. Further generalizations and extensions concern nonparametric regression and semiparametric models.
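The relation between CV(1) and the FPE mentioned above can be checked numerically in the linear regression case. The Python sketch below is an illustration only and is not taken from the book. It assumes the usual leave-one-out form CV(1) \(= n^{-1}\sum_i \{e_i/(1-h_{ii})\}^2\), with residuals \(e_i\) and leverages \(h_{ii}\), and the form FPE \(= \hat\sigma^2 (n+k)/(n-k)\) with \(\hat\sigma^2 = \mathrm{RSS}/n\). The simulated data and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical data: n observations, k regressors including an intercept.
rng = np.random.default_rng(1)
n, k = 200, 4
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ beta_true + rng.standard_normal(n)

# Ordinary least squares fit on the full sample.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
rss = float(resid @ resid)

# Leverages h_ii from the thin QR factorization: hat matrix H = Q Q^T.
Q, _ = np.linalg.qr(X)
h = np.sum(Q**2, axis=1)

# CV(1) via the leave-one-out shortcut, and FPE from the full-sample fit.
cv1 = float(np.mean((resid / (1.0 - h)) ** 2))
fpe = (rss / n) * (n + k) / (n - k)

# Brute-force leave-one-out: refit n times, predict the held-out point.
sq_err = []
for i in range(n):
    keep = np.arange(n) != i
    b_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    sq_err.append((y[i] - X[i] @ b_i) ** 2)
cv1_brute = float(np.mean(sq_err))

print(cv1, cv1_brute, fpe)
```

The shortcut and the brute-force loop give the same CV(1) up to rounding, and a single least squares fit suffices for either CV(1) or the FPE, whereas a bootstrap criterion requires one refit per replication.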
The last chapter contains results of simulations and statistical analysis of real data. To mention one of the interesting examples, all 16 criteria applied to the transformed Wolf yearly sunspot numbers \(y_t^*= \sqrt{y_t} - 6.298\) selected the AR(9) model, which agrees with earlier analyses.
The authors take into account many different aspects of model selection procedures and examine the performance of the proposed criteria. The presented material can serve as a reference book for specialists and also as an important resource of information for statisticians dealing with applications.
Reviewer: J.Anděl (Praha)

MSC:

62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
62-02 Research exposition (monographs, survey articles) pertaining to statistics
62J05 Linear regression; mixed models
62B10 Statistical aspects of information-theoretic topics
62G07 Density estimation