
Solving inverse problems in stochastic models using deep neural networks and adversarial training. (English) Zbl 1506.65017

Summary: Inverse problems associated with stochastic models constitute a significant portion of scientific and engineering applications. In such cases the unknown quantities are distributions. The applicability of traditional methods is limited by their demanding assumptions or prohibitive computational cost; for example, maximum likelihood methods require closed-form density functions, and Markov chain Monte Carlo needs a large number of simulations. We propose a new method that estimates the unknown distribution by matching the statistical properties of the observed and simulated random processes. We leverage the expressive power of neural networks to approximate the unknown distribution, and we use a discriminative neural network to compute the statistical discrepancy between the observed and simulated random processes. We demonstrate numerically that the proposed method can both estimate the model parameters and learn complicated unknown distributions.
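The matching idea in the summary can be illustrated with a deliberately small sketch. It is not the authors' method: the neural-network approximation of the unknown distribution is replaced by a single hypothetical scale parameter `sigma`, and the discriminative network is swapped for a fixed discrepancy (the energy distance between two samples). The forward model `simulate`, the candidate grid, and all sample sizes are assumptions made only for illustration.

```python
import random


def simulate(sigma, noise):
    """Hypothetical forward model: scale fixed base noise by the unknown sigma."""
    return [sigma * z for z in noise]


def mean_abs_diff(xs, ys):
    """Average of |x - y| over all pairs with x in xs and y in ys."""
    return sum(abs(x - y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy_distance(xs, ys):
    """Sample energy distance between two 1-D samples.

    Stands in for the learned discriminator: it is small when the two
    samples look alike and exactly zero when they are identical.
    """
    return (2.0 * mean_abs_diff(xs, ys)
            - mean_abs_diff(xs, xs)
            - mean_abs_diff(ys, ys))


def estimate_sigma(observed, candidates, n_sim=150, seed=1):
    """Pick the candidate whose simulated sample best matches the observations."""
    rng = random.Random(seed)
    # Reuse the same base noise for every candidate (common random numbers),
    # so the discrepancy varies smoothly with sigma.
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_sim)]
    scores = [(energy_distance(observed, simulate(s, noise)), s)
              for s in candidates]
    return min(scores)[1]


# Synthetic "observations" drawn with a known scale, for demonstration only.
data_rng = random.Random(0)
sigma_true = 2.0
observed = [sigma_true * data_rng.gauss(0.0, 1.0) for _ in range(150)]

candidates = [0.5 + 0.1 * k for k in range(36)]  # grid from 0.5 to 4.0
sigma_hat = estimate_sigma(observed, candidates)
print("estimated sigma:", sigma_hat)
```

A grid search keeps the sketch dependency-free. In the setting the summary describes, the discrepancy would instead be computed by a trained discriminative network, and the unknown distribution would be represented by a neural network rather than one scalar.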

MSC:

65C20 Probabilistic models, generic numerical methods in probability and statistics
62M45 Neural nets and related approaches to inference from stochastic processes
68T07 Artificial neural networks and deep learning
