Geostatistical methods for disease mapping and visualisation using data from spatio-temporally referenced prevalence surveys. (English) Zbl 07763618

Summary: In this paper, we set out general principles and develop geostatistical methods for the analysis of data from spatio-temporally referenced prevalence surveys. Our objective is to provide a tutorial guide that can be used in order to identify parsimonious geostatistical models for prevalence mapping. A general variogram-based Monte Carlo procedure is proposed to check the validity of the modelling assumptions. We describe and contrast likelihood-based and Bayesian methods of inference, showing how to account for parameter uncertainty under each of the two paradigms. We also describe extensions of the standard model for disease prevalence that can be used when stationarity of the spatio-temporal covariance function is not supported by the data. We discuss how to define predictive targets and argue that exceedance probabilities provide one of the most effective ways to convey uncertainty in prevalence estimates. We describe statistical software for the visualisation of spatio-temporal predictive summaries of prevalence through interactive animations. Finally, we illustrate an application to historical malaria prevalence data from 1 334 surveys conducted in Senegal between 1905 and 2014.
{© 2018 The Authors. International Statistical Review © 2018 International Statistical Institute}

MSC:

62-XX Statistics

Software:

animation; PrevMap

References:

[1] ANSD (2015). Sénégal: Enquête Démographique et de Santé Continue (EDS‐Continue 2014). Rockville, Maryland, USA: Agence Nationale de la Statistique et de la Démographie and ICF International.
[2] Bennett, A., Kazembe, L., Mathanga, D., Kinyoki, D., Ali, D., Snow, R. & Noor, A.M. (2013). Mapping malaria transmission intensity in Malawi, 2000‐2010. Am. J. Trop. Med. Hyg., 89, 840-849.
[3] Bonat, W.H. & Ribeiro, P.J. (2016). Practical likelihood analysis for spatial generalized linear mixed models. Environmetrics, 27, 83-89. · Zbl 1525.62078
[4] Chipeta, M.G., Terlouw, D.J., Phiri, K.S. & Diggle, P.J. (2016). Adaptive geostatistical design and analysis for prevalence surveys. Spatial Stat., 15, 70-84.
[5] Christensen, O.F. (2004). Monte Carlo maximum likelihood in model‐based geostatistics. J. Comput. Graphical Stat., 13, 702-718.
[6] Clements, A., Lwambo, N., Blair, L., Nyandindi, U., Kaatano, G., Kinung’hi, S., Webster, J., Fenwick, A. & Brooker, S. (2006). Bayesian spatial analysis and disease mapping: Tools to enhance planning and implementation of a schistosomiasis control programme in Tanzania. Trop. Med. Int. Health, 11, 490-503.
[7] Diggle, P.J. & Giorgi, E. (2016). Model‐based geostatistics for prevalence mapping in low‐resource settings (with discussion). J. Am. Stat. Assoc. https://doi.org/10.1080/01621459.2015.1123158. · doi:10.1080/01621459.2015.1123158
[8] Diggle, P.J., Menezes, R. & Su, T. (2010). Geostatistical inference under preferential sampling. J. R. Stat. Soc., Ser. C, 59, 191-232.
[9] Diggle, P.J., Moyeed, R., Rowlingson, B. & Thomson, M. (2002). Childhood malaria in the Gambia: A case‐study in model‐based geostatistics. J. R. Stat. Soc., Ser. C, 51, 493-506. · Zbl 1111.62335
[10] Diggle, P.J. & Ribeiro, P.J. (2007). Model‐based Geostatistics. New York: Springer Science+Business Media. · Zbl 1132.86002
[11] Diggle, P.J., Tawn, J.A. & Moyeed, R.A. (1998). Model‐based geostatistics (with discussion). Appl. Stat., 47, 299-350. · Zbl 0904.62119
[12] Fletcher, R. (1987). Practical Methods of Optimization, 2nd ed. New York: John Wiley & Sons. · Zbl 0905.65002
[13] Fong, Y., Rue, H. & Wakefield, J. (2010). Bayesian inference for generalized linear mixed models. Biostatistics, 11, 397. · Zbl 1437.62460
[14] Gething, P.W., Elyazar, I.R.F., Moyes, C.L., Smith, D.L., Battle, K.E., Guerra, C.A., Patil, A.P., Tatem, A.J., Howes, R.E., Myers, M.F., George, D.B., Horby, P., Wertheim, H.F.L., Price, R.N., Meller, I., Baird, J.K. & Hay, S.I. (2012). A long neglected world malaria map: Plasmodium vivax endemicity in 2010. PLoS Negl. Trop. Dis., 6, e1814.
[15] Geyer, C.J. (1994). On the convergence of Monte Carlo maximum likelihood calculations. J. R. Stat. Soc., Ser. B, 56, 261-274. · Zbl 0784.62019
[16] Geyer, C.J. (1996). Estimation and optimization of functions. In Markov Chain Monte Carlo in Practice, Eds. W. Gilks, S. Richardson & D. Spiegelhalter, pp. 241-258. London: Chapman and Hall. · Zbl 0841.62044
[17] Geyer, C.J. (1999). Likelihood inference for spatial point processes. In Stochastic Geometry, Likelihood and Computation, Eds. O.E. Barndorff‐Nielsen, W.S. Kendall & M.N.M. van Lieshout, pp. 79-140. Boca Raton, FL: Chapman and Hall/CRC.
[18] Geyer, C.J. & Thompson, E.A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. J. R. Stat. Soc., Ser. B, 54, 657-699.
[19] Giorgi, E. & Diggle, P.J. (2017). PrevMap: An R package for prevalence mapping. J. Stat. Softw., 78, 1-29.
[20] Giorgi, E., Sesay, S.S.S., Terlouw, D.J. & Diggle, P.J. (2015). Combining data from multiple spatially referenced prevalence surveys using generalized linear geostatistical models. J. R. Stat. Soc., Ser. A, 178, 445-464.
[21] Gneiting, T. (2002). Nonseparable, stationary covariance functions for space‐time data. J. Am. Stat. Assoc., 97, 590-600. · Zbl 1073.62593
[22] Hansell, A.L., Beale, L.A., Ghosh, R.E., Fortunato, L., Fecht, D., Järup, L. & Elliott, P. (2014). The Environment and Health Atlas for England and Wales. Oxford: Oxford University Press.
[23] Hay, S.I., Guerra, C.A., Gething, P.W., Patil, A.P., Tatem, A.J., Noor, A.M., Kabaria, C.W., Manh, B.H., Elyazar, I.R.F., Brooker, S., Smith, D.L., Moyeed, R.A. & Snow, R.W. (2009). A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med., 6, e1000048.
[24] Hedt, B.L. & Pagano, M. (2011). Health indicators: Eliminating bias from convenience sampling estimator. Stat. Med., 30, 560-568.
[25] Higdon, D. (1998). A process‐convolution approach to modeling temperatures in the North Atlantic Ocean. Environ. Ecol. Stat., 5, 173-190.
[26] Higdon, D. (2002). Space and space‐time modeling using process convolutions. In Quantitative Methods for Current Environmental Issues, Eds. C.W. Anderson, V. Barnett, P.C. Chatwin & A.H. El‐Shaarawi, pp. 37-56. New York: Springer‐Verlag. · Zbl 1255.86016
[27] Hodges, J.S. & Reich, B.J. (2010). Adding spatially‐correlated errors can mess up the fixed effect you love. Am. Stat., 64, 325-334. · Zbl 1217.62095
[28] Holmes, C.C. & Held, L. (2006). Bayesian auxiliary variable models for binary and multinomial regression. Bayesian Anal., 1, 145-168. · Zbl 1331.62142
[29] Joe, H. (2008). Accuracy of Laplace approximation for discrete response mixed models. Comput. Stat. Data Anal., 52, 5066-5074. · Zbl 1452.62537
[30] Kabaghe, A.N., Chipeta, M.G., McCann, R.S., Phiri, K.S., van Vugt, M., Takken, W., Diggle, P. & Terlouw, A.D. (2017). Adaptive geostatistical sampling enables efficient identification of malaria hotspots in repeated cross‐sectional surveys in rural Malawi. PLOS ONE, 12, 1-14.
[31] Kleinschmidt, I., Pettifor, A., Morris, N., MacPhail, C. & Rees, H. (2007). Geographic distribution of human immunodeficiency virus in South Africa. Am. J. Trop. Med. Hyg., 77, 1163-1169.
[32] Kleinschmidt, I., Sharp, B.L., Clarke, G.P.Y., Curtis, B. & Fraser, C. (2001). Use of generalized linear mixed models in the spatial analysis of small‐area malaria incidence rates in Kwazulu Natal, South Africa. Am. J. Epidemiol., 153, 1213-1221.
[33] Lindgren, F., Rue, H. & Lindström, J. (2011). An explicit link between Gaussian fields and Gaussian Markov random fields: The stochastic partial differential equation approach. J. R. Stat. Soc. Ser. B, 73, 423-498. · Zbl 1274.62360
[34] López‐Abente, G., Ramis, R., Pollán, M., Aragonés, N., Pérez‐Gómez, B., Gómez‐Barroso, D., Carrasco, J.M., Lope, V., García‐Pérez, J., Boldo, E. & García‐Mendizábal, M.J. (2007). Atlas Municipal de Mortalidad por Cáncer en España 1989‐1998. Madrid: Instituto de Salud Carlos III.
[35] Lumley, T. & Scott, A. (2017). Fitting regression models to survey data. Stat. Sci., 32, 265-278. · Zbl 1381.62274
[36] Matérn, B. (1986). Spatial Variation, 2nd ed. Berlin: Springer. · Zbl 0608.62122
[37] Mercer, L.D., Wakefield, J., Pantazis, A., Lutambi, A.M., Masanja, H. & Clark, S. (2015). Spacetime smoothing of complex survey data: Small area estimation for child mortality. Ann. Appl. Stat., 9, 1889-1905. · Zbl 1397.62461
[38] Neal, R.M. (2011). MCMC using Hamiltonian dynamics. In Handbook of Markov Chain Monte Carlo, Eds. S. Brooks, A. Gelman, G. Jones & X.‐L. Meng, pp. 113-162, chap. 5. Chapman & Hall/CRC Press. · Zbl 1229.65018
[39] Noor, A.M., Kinyoki, D.K., Mundia, C.W., Kabaria, C.W., Mutua, J.W., Alegana, V.A., Fall, I.S. & Snow, R.W. (2014). The changing risk of Plasmodium falciparum malaria infection in Africa: 2000-10: A spatial and temporal analysis of transmission intensity. The Lancet, 383, 1739-1747.
[40] Paciorek, C.J. (2010). The importance of scale for spatial‐confounding bias and precision of spatial regression estimators. Stat. Sci., 25, 107-125. · Zbl 1328.62596
[41] Pati, D., Reich, B.J. & Dunson, D.B. (2011). Bayesian geostatistical modelling with informative sampling locations. Biometrika, 98, 35-48. · Zbl 1214.62029
[42] Pullan, R.L., Gething, P.W., Smith, J.L., Mwandawiro, C.S., Sturrock, H.J.W., Gitonga, C.W., Hay, S.I. & Brooker, S. (2011). Spatial modelling of soil‐transmitted helminth infections in Kenya: A disease control planning tool. PLoS Negl. Trop. Dis., 5, e958.
[43] Raso, G., Matthys, B., N’goran, E.K., Tanner, B., Vounatsou, P. & Utzinger, J. (2005). Spatial risk prediction and mapping of Schistosoma mansoni infections among schoolchildren living in western Côte d’Ivoire. Parasitology, 131, 97-108.
[44] Roberts, G.O. & Rosenthal, J.S. (1998). Optimal scaling of discrete approximations to Langevin diffusions. J. R. Stat. Soc.: Ser. B (Stat. Method.), 60, 255-268. · Zbl 0913.60060
[45] Rodrigues, A. & Diggle, P.J. (2010). A class of convolution‐based models for spatio‐temporal processes with non‐separable covariance structure. Scand. J. Stat., 37, 553-567. · Zbl 1349.62449
[46] RStudio Inc (2013). Easy web applications in R. Available at http://www.rstudio.com/shiny/. Accessed on 1 January 2017.
[47] Rue, H., Martino, S. & Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J. R. Stat. Soc., Ser. B, 71, 319-392. · Zbl 1248.62156
[48] Skinner, C. & Wakefield, J. (2017). Introduction to the design and analysis of complex survey data. Stat. Sci., 32(2), 165-175. · Zbl 1381.62031
[49] Snow, R., Amratia, P., Mundia, C., Alegana, V., Kirui, V., Kabaria, C. & Noor, A. (2015a). Assembling a geo‐coded repository of malaria infection prevalence survey data in Africa 1900-2014. In Tech. rep. INFORM Working Paper, developed with support from the Department of International Development and Wellcome Trust, UK, June 2015. Available at http://www.inform-malaria.org/wp-content/uploads/2015/07/Assembly-of-Parasite-Rate-Data-Version-1.pdf. Accessed on 1 January 2017.
[50] Snow, R.W., Kibuchi, E., Karuri, S.W., Sang, G., Gitonga, C.W., Mwandawiro, C., Bejon, P. & Noor, A.M. (2015b). Changing malaria prevalence on the Kenyan coast since 1974: Climate, drugs and vector control. PLoS ONE, 10, 1-14.
[51] Soares Magalhaes, R.J. & Clements, A.C.A. (2011). Mapping the risk of anaemia in preschool‐age children: The contribution of malnutrition, malaria, and helminth infections in West Africa. PLoS Med., 8, e1000438.
[52] Stein, M.L. (2005). Space: Time covariance functions. J. Am. Stat. Assoc., 100, 310-321. · Zbl 1117.62431
[53] Thomson, M.C., Connor, S.J., D’Alessandro, U., Rowlingson, B., Diggle, P., Cresswell, M. & Greenwood, B. (1999). Predicting malaria infection in Gambian children from satellite data and bed net use surveys: The importance of spatial correlation in the interpretation of results. Am. J. Trop. Med. Hyg., 61, 2-8.
[54] Weller, Z.D. & Hoeting, J.A. (2016). A review of nonparametric hypothesis tests of isotropy properties in spatial data. Stat. Sci., 31, 305-324. · Zbl 1442.62213
[55] Xie, Y. (2013). animation: An R package for creating animations and demonstrating statistical methods. J. Stat. Softw., 53, 1-27.
[56] Zhang, H. (2002). On estimation and prediction for spatial generalized linear mixed models. Biometrics, 58, 129-136. · Zbl 1209.62161
[57] Zouré, G.M.H., Noma, M., Tekle, H., Amazigo, U.V., Diggle, P.J., Giorgi, E. & Remme, J.H.F. (2014). The geographic distribution of onchocerciasis in the 20 participating countries of the African programme for onchocerciasis control: (2) Pre‐control endemicity levels and estimated number infected. Parasit. Vectors, 7, 326.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.