Efficient inference of generalized spatial fusion models with flexible specification. (English) Zbl 07851103

Summary: In spatial statistics, data are often collected at different spatial resolutions. It is then often of interest to (a) carry out multivariate analysis when variables are sampled at different locations, (b) model data collected at misaligned areas, or (c) unravel common latent factors by jointly modelling point and areal data. In this paper, we establish a linkage between the generalized spatial fusion model framework and the various change-of-support problems, and we outline how the framework can be adapted to these situations. Moreover, we propose an efficient fusion model implementation that exploits the advantages of the nearest neighbour Gaussian process and the Stan modelling language. Our simulation shows that the new implementation is several times more computationally efficient than the original implementation. We illustrate the performance gain in practice with a case study that models daily precipitation in Switzerland based on rain gauge and radar data.
{© 2019 The Authors. Stat Published by John Wiley & Sons Ltd.}

MSC:

62-XX Statistics

Software:

spBayes; R; Stan; BayesDA
