Spatially dependent multiple testing under model misspecification, with application to detection of anthropogenic influence on extreme climate events. (English) Zbl 1462.62707

Summary: The Weather Risk Attribution Forecast (WRAF) is a forecasting tool that uses output from global climate models to make simultaneous attribution statements about whether and how greenhouse gas emissions have contributed to extreme weather across the globe. However, in conducting a large number of simultaneous hypothesis tests, the WRAF is prone to identifying false “discoveries”. A common technique for addressing this multiple testing problem is to adjust the procedure in a way that controls the proportion of true null hypotheses that are incorrectly rejected, or the false discovery rate (FDR). Unfortunately, generic FDR procedures suffer from low power when the hypotheses are dependent, and techniques designed to account for dependence are sensitive to misspecification of the underlying statistical model. In this article, we develop a Bayesian decision-theoretical approach for dependent multiple testing and a nonparametric hierarchical statistical model that flexibly controls false discovery and is robust to model misspecification. We illustrate the robustness of our procedure to model error with a simulation study, using a framework that accounts for generic spatial dependence and allows the practitioner to flexibly specify the decision criteria. Finally, we apply our procedure to several seasonal forecasts and discuss implementation for the WRAF workflow. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
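For orientation, the "generic FDR procedure" that the summary contrasts against can be sketched with the Benjamini-Hochberg step-up rule of reference [10]. This is a minimal illustration only, not the Bayesian decision-theoretic procedure developed in the paper; the function name `benjamini_hochberg`, the level `q`, and the example p-values are assumptions made for this sketch, and the rule as written is the standard one for independent test statistics.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Illustrative Benjamini-Hochberg step-up rule (hypothetical helper).

    Returns a list of booleans, one per hypothesis, where True means
    the null hypothesis is rejected at nominal FDR level q.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= q * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            k = rank

    # Reject every hypothesis whose p-value is among the k smallest.
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject


# Made-up p-values for ten grid cells (not data from the paper).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042,
             0.060, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(example_p, q=0.05)
print(sum(rejected), "of", len(example_p), "hypotheses rejected")
```

Because the rule is step-up, a hypothesis can be rejected even when its own p-value exceeds its own threshold, provided a larger p-value further down the ordering meets its threshold. The rule uses only the marginal p-values and ignores any spatial structure among the tests, which is the gap the summary says dependence-aware procedures try to close.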

MSC:

62P12 Applications of statistics to environmental and related topics
62M30 Inference from spatial processes
62H15 Hypothesis testing in multivariate analysis

Software:

nimble; CAM3; GMRFLib; spBayes

References:

[1] Allen, M., Liability for Climate Change, Nature, 421, 891-892 (2003)
[2] Angélil, O.; Perkins-Kirkpatrick, S.; Alexander, L. V.; Stone, D.; Donat, M. G.; Wehner, M.; Shiogama, H.; Ciavarella, A.; Christidis, N., Comparing Regional Precipitation and Temperature Extremes in Climate Model and Reanalysis Products, Weather and Climate Extremes, 13, 35-43 (2016)
[3] Angélil, O.; Stone, D.; Wehner, M.; Paciorek, C. J.; Krishnan, H.; Collins, W., An Independent Assessment of Anthropogenic Attribution Statements for Recent Extreme Temperature and Rainfall Events, Journal of Climate, 30, 5-16 (2017)
[4] Angélil, O.; Stone, D. A.; Tadross, M.; Tummon, F.; Wehner, M.; Knutti, R., Attribution of Extreme Weather to Anthropogenic Greenhouse Gas Emissions: Sensitivity to Spatial and Temporal Scales, Geophysical Research Letters, 41, 2150-2155 (2014)
[5] Arellano-Valle, R. B.; Azzalini, A., The Centred Parametrization for the Multivariate Skew-Normal Distribution, Journal of Multivariate Analysis, 99, 1362-1382 (2008) · Zbl 1140.62040
[6] Arent, D. J.; Tol, R. S. J.; Faust, E.; Hella, J. P.; Kumar, S.; Strzepek, K. M.; Tóth, F. L.; Yan, D.; Abdulla, A.; Kheshgi, H., Key Economic Sectors and Services, Climate Change 2014: Impacts, Adaptation, and Vulnerability. Part A: Global and Sectoral Aspects. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, 659-708 (2014), Cambridge University Press
[7] Armagan, A.; Dunson, D. B.; Lee, J., Generalized Double Pareto Shrinkage, Statistica Sinica, 23, 119-143 (2013) · Zbl 1259.62061
[8] Azzalini, A.; Capitanio, A., Distributions Generated by Perturbation of Symmetry with Emphasis on a Multivariate Skew t-Distribution, Journal of the Royal Statistical Society, 65, 367-389 (2003) · Zbl 1065.62094
[9] Banerjee, S.; Carlin, B. P.; Gelfand, A. E., Hierarchical Modeling and Analysis for Spatial Data. Monographs on Statistics and Applied Probability (2004), London: Chapman & Hall · Zbl 1053.62105
[10] Benjamini, Y.; Hochberg, Y., Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing, Journal of the Royal Statistical Society, 57, 289-300 (1995) · Zbl 0809.62014
[11] Benjamini, Y.; Hochberg, Y., On the Adaptive Control of the False Discovery Rate in Multiple Testing with Independent Statistics, Journal of Educational and Behavioral Statistics, 25, 60-83 (2000)
[12] Benjamini, Y.; Yekutieli, D., The Control of the False Discovery Rate in Multiple Testing Under Dependency, The Annals of Statistics, 29, 1165-1188 (2001) · Zbl 1041.62061
[13] Besag, J.; York, J.; Mollié, A., Bayesian Image Restoration, with Two Applications in Spatial Statistics, Annals of the Institute of Statistical Mathematics, 43, 1-20 (1991) · Zbl 0760.62029
[14] Bickel, P. J.; Levina, E., Covariance Regularization by Thresholding, The Annals of Statistics, 36, 2577-2604 (2008) · Zbl 1196.62062
[15] Carvalho, C. M.; Polson, N. G.; Scott, J. G., The Horseshoe Estimator for Sparse Signals, Biometrika, 97, 465-480 (2010) · Zbl 1406.62021
[16] Cattell, R. B., The Scree Test for the Number of Factors, Multivariate Behavioral Research, 1, 245-276 (1966)
[17] Cressie, N. A. C., Statistics for Spatial Data (1991), New York: Wiley · Zbl 0799.62002
[18] Cressie, N.; Wikle, C. K., Statistics for Spatio-Temporal Data (Wiley Series in Probability and Statistics) (2011), New York: Wiley · Zbl 1273.62017
[19] Daniels, M. J.; Kass, R. E., Shrinkage Estimators for Covariance Matrices, Biometrics, 57, 1173-1184 (2001) · Zbl 1209.62132
[20] De Valpine, P.; Turek, D.; Paciorek, C. J.; Anderson-Bergman, C.; Lang, D. T.; Bodik, R., Programming with Models: Writing Statistical Algorithms for General Model Structures with Nimble, Journal of Computational and Graphical Statistics, 26, 403-413 (2017)
[21] Efron, B., Size, Power and False Discovery Rates, 1351-1377 (2007) · Zbl 1123.62008
[22] Efron, B.; Tibshirani, R.; Storey, J. D.; Tusher, V., Empirical Bayes Analysis of a Microarray Experiment, Journal of the American Statistical Association, 96, 1151-1160 (2001) · Zbl 1073.62511
[23] Farrington, C. P.; Manning, G., Test Statistics and Sample Size Formulae for Comparative Binomial Trials with Null Hypothesis of Non-Zero Risk Difference or Non-Unity Relative Risk, Statistics in Medicine, 9, 1447-1454 (1990)
[24] Fernández, C.; Steel, M. F., On Bayesian Modeling of Fat Tails and Skewness, Journal of the American Statistical Association, 93, 359-371 (1998) · Zbl 0910.62024
[25] Frühwirth-Schnatter, S.; Pyne, S., Bayesian Inference for Finite Mixtures of Univariate and Multivariate Skew-Normal and Skew-t distributions, Biostatistics, 11, 317-336 (2010) · Zbl 1437.62465
[26] Genovese, C.; Wasserman, L., Operating Characteristics and Extensions of the False Discovery Rate Procedure, Journal of the Royal Statistical Society, 64, 499-517 (2002) · Zbl 1090.62072
[27] Guan, Y.; Haran, M., A Computationally Efficient Projection-Based Approach for Spatial Generalized Linear Mixed Models (2018) · Zbl 07498984
[28] Guindani, M.; Müller, P.; Zhang, S., A Bayesian Discovery Procedure, Journal of the Royal Statistical Society, 71, 905-925 (2009) · Zbl 1411.62224
[29] Hansen, G.; Auffhammer, M.; Solow, A. R., On the Attribution of a Single Event to Climate Change, Journal of Climate, 27, 8297-8301 (2014)
[30] Hughes, J.; Haran, M., Dimension Reduction and Alleviation of Confounding for Spatial Generalized Linear Mixed Models, Journal of the Royal Statistical Society, 75, 139-159 (2013) · Zbl 07555442
[31] Jolliffe, I. T., Principal Component Analysis (2002), New York: Springer-Verlag · Zbl 1011.62064
[32] Josse, J.; Husson, F., Selecting the Number of Components in Principal Component Analysis using Cross-Validation Approximations, Computational Statistics & Data Analysis, 56, 1869-1879 (2012) · Zbl 1243.62082
[33] Junttila, V.; Kauranne, T.; Finley, A. O.; Bradford, J. B., Linear Models for Airborne-Laser-Scanning-Based Operational Forest Inventory With Small Field Sample Size and Highly Correlated Lidar Data, IEEE Transactions on Geoscience and Remote Sensing, 53, 5600-5612 (2015)
[34] Katzfuss, M.; Hammerling, D.; Smith, R., A Bayesian Hierarchical Model for Climate-Change Detection and Attribution, Geophysical Research Letters, 44, 5720-5728 (2017)
[35] Kelsall, J.; Wakefield, J., Modeling Spatial Variation in Disease Risk, Journal of the American Statistical Association, 97, 692-701 (2002) · Zbl 1073.62580
[36] Lawal, K. A.; Abatan, A. A.; Angélil, O.; Olaniyan, E.; Olusoji, V. H.; Oguntunde, P. G.; Lamptey, B.; Abiodun, B. J.; Shiogama, H.; Wehner, M. F.; Stone, D. A., The Late Onset of the 2015 Wet Season in Nigeria, Bulletin of the American Meteorological Society, 97, S63-S69 (2016)
[37] Lee, J.; Oh, H.-S., Bayesian Regression Based on Principal Components for High-Dimensional Data, Journal of Multivariate Analysis, 117, 175-192 (2013) · Zbl 1279.62064
[38] Leroux, B. G.; Lei, X.; Breslow, N., Estimation of Disease Rates in Small Areas: A New Mixed Model for Spatial Dependence (2000), New York, NY: Springer, pp. 179-191 · Zbl 0957.62095
[39] Livezey, R. E.; Chen, W. Y., Statistical Field Significance and its Determination by Monte Carlo Techniques, Monthly Weather Review, 111, 46-59 (1983)
[40] Minka, T., Automatic Choice of Dimensionality for PCA, Advances in Neural Information Processing Systems, 13, 598-604 (2001)
[41] Müller, P.; Parmigiani, G.; Rice, K., FDR and Bayesian Multiple Comparisons Rules (2006)
[42] Müller, P.; Parmigiani, G.; Robert, C.; Rousseau, J., Optimal Sample Size for Multiple Testing: The Case of Gene Expression Microarrays, Journal of the American Statistical Association, 99, 990-1001 (2004) · Zbl 1055.62127
[43] National Academies of Sciences, Engineering, and Medicine, Attribution of Extreme Weather Events in the Context of Climate Change (2016), Washington, DC: The National Academies Press
[44] Neale, R. B.; Chen, C.-C.; Gettelman, A.; Lauritzen, P. H.; Park, S.; Williamson, D. L.; Conley, A. J.; Garcia, R.; Kinnison, D.; Lamarque, J.-F.; Marsh, D.; Mills, M.; Smith, A. K.; Tilmes, S.; Vitt, F.; Morrison, H.; Cameron-Smith, P.; Collins, W. D.; Iacono, M. J.; Easter, R. C.; Ghan, S. J.; Liu, X.; Rasch, P. J.; Taylor, M. A., Description of the NCAR Community Atmosphere Model (CAM 5.0) (2010)
[45] Newton, M. A.; Noueiry, A.; Sarkar, D.; Ahlquist, P., Detecting Differential Gene Expression with a Semiparametric Hierarchical Mixture Method, Biostatistics, 5, 155-176 (2004) · Zbl 1096.62124
[46] Pacifico, M. P.; Genovese, C.; Verdinelli, I.; Wasserman, L., False Discovery Control for Random Fields, Journal of the American Statistical Association, 99, 1002-1014 (2004) · Zbl 1055.62105
[47] Pall, P.; Aina, T.; Stone, D. A.; Stott, P. A.; Nozawa, T.; Hilberts, A. G. J.; Lohmann, D.; Allen, M. R., Anthropogenic Greenhouse Gas Contribution to Flood Risk in England and Wales in Autumn 2000, Nature, 470, 382-385 (2011)
[48] Pascutto, C.; Wakefield, J.; Best, N.; Richardson, S.; Bernardinelli, L.; Staines, A.; Elliott, P., Statistical Issues in the Analysis of Disease Mapping Data, Statistics in Medicine, 19, 2493-2519 (2000)
[49] Polson, N. G.; Scott, J. G., Good, Great, or Lucky? Screening for Firms with Sustained Superior Performance using Heavy-Tailed Priors, The Annals of Applied Statistics, 6, 161-185 (2012) · Zbl 1235.91144
[50] Rue, H.; Held, L., Gaussian Markov Random Fields: Theory and Applications (Monographs on Statistics and Applied Probability, Vol. 104) (2005), London: Chapman & Hall · Zbl 1093.60003
[51] Shu, H.; Nan, B.; Koeppe, R., Multiple Testing for Neuroimaging via Hidden Markov Random Field, Biometrics, 71, 741-750 (2015) · Zbl 1419.62446
[52] Smith, K. R.; Woodward, A.; Campbell-Lendrum, D.; Chadee, D. D.; Honda, Y.; Liu, Q.; Olwoch, J. M.; Revich, B.; Sauerborn, R., Human Health: Impacts, Adaptation, and Co-Benefits, Climate Change 2014: Impacts, Adaptation, and Vulnerability. Part A: Global and Sectoral Aspects. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, 709-754 (2014), Cambridge University Press
[53] Stone, D. A.; Allen, M. R., The End-to-End Attribution Problem: From Emissions to Impacts, Climatic Change, 71, 303-318 (2005)
[54] Stone, D. A.; Risser, M. D.; Angélil, O. M.; Wehner, M. F.; Cholia, S.; Keen, N.; Krishnan, H.; O’Brien, T. A.; Collins, W. D., A Basis Set for Exploration of Sensitivity to Prescribed Ocean Conditions for Estimating Human Contributions to Extreme Weather in CAM5.1-1degree, Weather and Climate Extremes, 19, 10-19 (2018)
[55] Storey, J. D., The Positive False Discovery Rate: A Bayesian Interpretation and the q-Value, Annals of Statistics, 31, 2013-2035 (2003) · Zbl 1042.62026
[56] Stott, P. A.; Allen, M.; Christidis, N.; Dole, R. M.; Hoerling, M.; Huntingford, C.; Pall, P.; Perlwitz, J.; Stone, D., Attribution of Weather and Climate-Related Events (2013), Dordrecht, Netherlands: Springer, pp. 307-337
[57] Stott, P. A.; Stone, D. A.; Allen, M. R., Human Contribution to the European Heatwave of 2003, Nature, 432, 610-614 (2004)
[58] Sun, W.; Cai, T. T., Oracle and Adaptive Compound Decision Rules for False Discovery Rate Control, Journal of the American Statistical Association, 102, 901-912 (2007) · Zbl 1469.62318
[59] Sun, W.; Cai, T. T., Large-Scale Multiple Testing Under Dependence, Journal of the Royal Statistical Society, 71, 393-424 (2009) · Zbl 1248.62005
[60] Sun, W.; Reich, B. J.; Cai, T.; Guindani, M.; Schwartzman, A., False Discovery Control in Large-Scale Spatial Multiple Testing, Journal of the Royal Statistical Society, 77, 59-83 (2015) · Zbl 1414.62043
[61] Taddy, M., Multinomial Inverse Regression for Text Analysis, Journal of the American Statistical Association, 108, 755-770 (2013) · Zbl 06224965
[62] Tansey, W.; Athey, A.; Reinhart, A.; Scott, J. G., Multiscale Spatial Density Smoothing: An Application to Large-Scale Radiological Survey and Anomaly Detection, Journal of the American Statistical Association, 112, 1047-1063 (2017)
[63] Ventura, V.; Paciorek, C. J.; Risbey, J. S., Controlling the Proportion of Falsely Rejected Hypotheses when Conducting Multiple Tests with Climatological Data, Journal of Climate, 17, 4343-4356 (2004)
[64] Wang, L., Bayesian Principal Component Regression with Data-Driven Component Selection, Journal of Applied Statistics, 39, 1177-1189 (2012) · Zbl 1514.62105
[65] Wikle, C. K., Low-Rank Representations for Spatial Processes, Handbook of Spatial Statistics, 107-118 (2010), London: Chapman and Hall/CRC
[66] Wilks, D. S., The Stippling Shows Statistically Significant Grid Points: How Research Results are Routinely Overstated and Overinterpreted, and What to Do About It, Bulletin of the American Meteorological Society, 97, 2263-2273 (2016)
[67] Wold, S., Cross-Validatory Estimation of the Number of Components in Factor and Principal Components Models, Technometrics, 20, 397-405 (1978) · Zbl 0403.62032
[68] Zhang, Z.; Chan, K.; Kwok, J.; Yeung, D., Bayesian Inference on Principal Component Analysis Using Reversible Jump Markov Chain Monte Carlo, Proceedings of Nineteenth National Conference on Artificial Intelligence (AAAI'04), San Jose, CA, 372 (2004)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases, these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.