
Optimally weighted loss functions for solving PDEs with neural networks. (English) Zbl 1518.65143

The main goal of this paper is to generalize the method introduced in [M. Raissi et al., “Physics informed deep learning. I: Data-driven solutions of nonlinear partial differential equations”, Preprint, arXiv:1711.10561] for boundary value problems by introducing new loss functionals. The authors start by introducing a loss functional of which the loss of [loc. cit.] can be seen as a Monte Carlo approximation in a particular scenario.
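Schematically, and in the reviewer's notation rather than the paper's, for a boundary value problem \(\mathcal{N}[u]=f\) in \(\Omega\) with \(\mathcal{B}[u]=g\) on \(\partial\Omega\), such a loss functional takes the form
\[
L(\hat u)=\int_{\Omega}\bigl(\mathcal{N}[\hat u](x)-f(x)\bigr)^{2}\,\mathrm{d}x+\int_{\partial\Omega}\bigl(\mathcal{B}[\hat u](x)-g(x)\bigr)^{2}\,\mathrm{d}s(x),
\]
and replacing the integrals by averages over uniformly sampled collocation points \(\{x_i\}_{i=1}^{N_\Omega}\subset\Omega\) and \(\{y_j\}_{j=1}^{N_{\partial\Omega}}\subset\partial\Omega\) recovers, up to the constant volume factors \(|\Omega|\) and \(|\partial\Omega|\), the mean-squared-residual loss of [loc. cit.]:
\[
\hat L(\hat u)=\frac{|\Omega|}{N_\Omega}\sum_{i=1}^{N_\Omega}\bigl(\mathcal{N}[\hat u](x_i)-f(x_i)\bigr)^{2}+\frac{|\partial\Omega|}{N_{\partial\Omega}}\sum_{j=1}^{N_{\partial\Omega}}\bigl(\mathcal{B}[\hat u](y_j)-g(y_j)\bigr)^{2}.
\]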
A bridge between the minimization of the new loss functional and the approximation of the solution of well-posed boundary value problems is established in Theorem 1. The authors present several remarks justifying the use of this result to guarantee success when neural networks are applied to minimize the corresponding Monte Carlo approximation.
In Theorem 2 it is shown that, for linear boundary value problems, the new loss functional is convex and consequently has no local minima other than global ones. It should be pointed out, however, that parameterizing the solution by a neural network turns the minimization of a convex functional into a non-convex optimization problem in the network parameters, and that Theorem 2 requires the linearity of the boundary value problem.
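To see why linearity matters (a remark by the reviewer, not a quotation from the paper): if \(\mathcal{N}\) and \(\mathcal{B}\) are linear, the residual maps \(\hat u\mapsto\mathcal{N}[\hat u]-f\) and \(\hat u\mapsto\mathcal{B}[\hat u]-g\) are affine, so the functional \(L\) above is a sum of squared \(L^2\)-norms of affine maps and therefore satisfies
\[
L\bigl(\alpha\hat u+(1-\alpha)\hat v\bigr)\le\alpha L(\hat u)+(1-\alpha)L(\hat v),\qquad\alpha\in[0,1],
\]
by the convexity of \(t\mapsto t^{2}\).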
Section 3 discusses a new choice of the loss functional, introducing a parameter referred to as the loss weight. The main question is how this parameter should be fixed, that is, what its optimal choice is with respect to a specific error measure. To answer this question, the authors introduce the concept of \(\epsilon\)-closeness, which can be seen as a relative error for the approximate solution. Assuming that the approximate solution is \(\epsilon\)-close, the authors construct upper bounds for the interior and boundary terms of the loss functional that depend on the true solution of the differential problem. These upper bounds are then used to fix the loss weight parameter. As the true solution is unknown, the authors replace it by the approximate solution, obtaining the normalized loss functional of Section 4.
To apply the results obtained in the paper, Section 5 discusses the network configuration, the computation of the loss functionals and the training algorithms; a schematic sketch of such a training loop is given below. In Section 6, the authors compare the proposed approach with the one introduced in [loc. cit.] on several differential problems: Laplace and Poisson equations with Dirichlet boundary conditions, and a convection-diffusion equation with Dirichlet boundary conditions on \([0,1]\).
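In the same schematic notation, the weighted functional reads
\[
L_{\lambda}(\hat u)=\lambda\int_{\Omega}\bigl(\mathcal{N}[\hat u]-f\bigr)^{2}\,\mathrm{d}x+(1-\lambda)\int_{\partial\Omega}\bigl(\mathcal{B}[\hat u]-g\bigr)^{2}\,\mathrm{d}s
\]
with loss weight \(\lambda\in(0,1)\); one possible formalization of \(\epsilon\)-closeness consistent with the review's description is the pointwise relative bound \(|\hat u(x)-u(x)|\le\epsilon\,|u(x)|\), although the paper's precise definition may differ. The following minimal PyTorch sketch, written by the reviewer for illustration and not taken from the paper, shows how such a weighted loss is minimized in practice for the Poisson problem \(-u''=\pi^{2}\sin(\pi x)\) on \((0,1)\) with homogeneous Dirichlet conditions; the architecture, sample sizes and the value of \(\lambda\) are placeholder choices, and the normalized loss of Section 4 is not reproduced here.

# Minimal weighted-loss PINN sketch (reviewer's illustration, not the authors' code).
# Problem: -u''(x) = pi^2 sin(pi x) on (0, 1), u(0) = u(1) = 0; exact u(x) = sin(pi x).
import torch

torch.manual_seed(0)

net = torch.nn.Sequential(           # small fully connected network u_theta: R -> R
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
lam = 0.9                            # loss weight lambda (placeholder value)

def weighted_loss():
    # Interior term: Monte Carlo sample of the interior integral, resampled each step.
    x = torch.rand(128, 1, requires_grad=True)
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    residual = -d2u - torch.pi**2 * torch.sin(torch.pi * x)
    interior = (residual**2).mean()
    # Boundary term: homogeneous Dirichlet data at x = 0 and x = 1.
    boundary = (net(torch.tensor([[0.0], [1.0]]))**2).mean()
    return lam * interior + (1.0 - lam) * boundary

# Adam phase; the paper's software list also mentions L-BFGS-B as an optimizer.
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(5000):
    opt.zero_grad()
    weighted_loss().backward()
    opt.step()

# Relative L2 error against the exact solution on a test grid.
xt = torch.linspace(0.0, 1.0, 201).unsqueeze(1)
with torch.no_grad():
    exact = torch.sin(torch.pi * xt)
    err = torch.linalg.norm(net(xt) - exact) / torch.linalg.norm(exact)
print(f"relative L2 error: {err.item():.2e}")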
The numerical experiments indicate that the approach introduced here attains, in general, higher accuracy than the method studied in [loc. cit.].

MSC:

65N99 Numerical methods for partial differential equations, boundary value problems
68T07 Artificial neural networks and deep learning
65N35 Spectral, collocation and related methods for boundary value problems involving PDEs
65K10 Numerical optimization and variational techniques
65C20 Probabilistic models, generic numerical methods in probability and statistics
65N75 Probabilistic methods, particle methods, etc. for boundary value problems involving PDEs
35J15 Second-order elliptic equations
76R50 Diffusion

Software:

LBFGS-B; DGM; Adam

References:

[1] Schmidhuber, J., Deep learning in neural networks: An overview (2014), arXiv preprint arXiv:1404.7828
[2] Owhadi, H., Bayesian numerical homogenization, Multiscale Model. Simul., 13, 3, 812-828 (2015) · Zbl 1322.35002
[3] Raissi, M.; Perdikaris, P.; Karniadakis, G. E., Machine learning of linear differential equations using Gaussian processes, J. Comput. Phys., 348, 683-693 (2017) · Zbl 1380.68339
[4] Owhadi, H.; Scovel, C.; Sullivan, T., Brittleness of Bayesian inference under finite information in a continuous world, Electron. J. Stat., 9, 1, 1-79 (2015) · Zbl 1305.62123
[5] Raissi, M.; Perdikaris, P.; Karniadakis, G. E., Numerical Gaussian processes for time-dependent and nonlinear partial differential equations, SIAM J. Sci. Comput., 40, 1, A172-A198 (2018) · Zbl 1386.65030
[6] Hornik, K.; Stinchcombe, M.; White, H., Multilayer feedforward networks are universal approximators, Neural Netw., 2, 5, 359-366 (1989) · Zbl 1383.92015
[7] Hornik, K.; Stinchcombe, M.; White, H., Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks, Neural Netw., 3, 5, 551-560 (1990)
[8] Lagaris, I. E.; Likas, A.; Fotiadis, D. I., Artificial neural networks for solving ordinary and partial differential equations (1997), arXiv preprint arXiv:physics/9705023
[9] Sirignano, J.; Spiliopoulos, K., DGM: A deep learning algorithm for solving partial differential equations, J. Comput. Phys., 375, 1339-1364 (2018) · Zbl 1416.65394
[10] Raissi, M.; Perdikaris, P.; Karniadakis, G. E., Physics informed deep learning (Part I): Data-driven solutions of nonlinear partial differential equations (2017), arXiv preprint arXiv:1711.10561
[11] Dockhorn, T., A discussion on solving partial differential equations using neural networks (2019), arXiv preprint arXiv:1904.07200
[12] He, J.; Li, L.; Xu, J.; Zheng, C., ReLU deep neural networks and linear finite elements (2018), arXiv preprint arXiv:1807.03973
[13] Han, J.; Jentzen, A.; E, W., Solving high-dimensional partial differential equations using deep learning, Proc. Natl. Acad. Sci., 115, 34, 8505-8510 (2018) · Zbl 1416.35137
[14] Chan-Wai-Nam, Q.; Mikael, J.; Warin, X., Machine learning for semi linear PDEs, J. Sci. Comput., 79, 3, 1667-1712 (2019) · Zbl 1433.68332
[15] Grohs, P.; Hornung, F.; Jentzen, A.; von Wurstemberger, P., A proof that artificial neural networks overcome the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations (2018), arXiv preprint arXiv:1809.02362
[16] Berner, J.; Grohs, P.; Jentzen, A., Analysis of the generalization error: Empirical risk minimization over deep artificial neural networks overcomes the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations, SIAM J. Math. Data Sci., 2, 3, 631-657 (2020) · Zbl 1480.60191
[17] Jentzen, A.; Salimova, D.; Welti, T., A proof that deep artificial neural networks overcome the curse of dimensionality in the numerical approximation of Kolmogorov partial differential equations with constant diffusion and nonlinear drift coefficients (2018), arXiv preprint arXiv:1809.07321
[18] Darbon, J.; Langlois, G. P.; Meng, T., Overcoming the curse of dimensionality for some Hamilton-Jacobi partial differential equations via neural network architectures, Res. Math. Sci., 7, 3, 1-50 (2020) · Zbl 1445.35119
[19] Wang, S.; Teng, Y.; Perdikaris, P., Understanding and mitigating gradient pathologies in physics-informed neural networks (2020), arXiv preprint arXiv:2001.04536
[20] Chakraborty, S., Transfer learning based multi-fidelity physics informed deep neural network, J. Comput. Phys., 426, Article 109942 pp. (2021) · Zbl 07510057
[21] Wesseling, P., Principles of Computational Fluid Dynamics (2000), Springer-Verlag: Springer-Verlag Berlin, Heidelberg · Zbl 0960.76002
[22] Hölder, O., Ueber einen Mittelwertsatz, (Nachrichten von der Königl. Gesellschaft der Wissenschaften und der Georg-Augusts-Universität zu Göttingen (1889)) · JFM 21.0260.07
[23] Draxler, F.; Veschgini, K.; Salmhofer, M.; Hamprecht, F. A., Essentially no barriers in neural network energy landscape (2018), arXiv preprint arXiv:1803.00885
[24] Bach, F., Breaking the curse of dimensionality with convex neural networks, J. Mach. Learn. Res., 18, 1, 629-681 (2017) · Zbl 1433.68390
[25] Glorot, X.; Bengio, Y., Understanding the difficulty of training deep feedforward neural networks, (Teh, Y. W.; Titterington, M., eds.), Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, vol. 9, 249-256 (2010), URL http://proceedings.mlr.press/v9/glorot10a.html
[26] Byrd, R. H.; Lu, P.; Nocedal, J.; Zhu, C., A limited memory algorithm for bound constrained optimization, SIAM J. Sci. Comput., 16, 5, 1190-1208 (1995) · Zbl 0836.65080
[27] Kingma, D. P.; Ba, J., Adam: A method for stochastic optimization (2014), arXiv preprint arXiv:1412.6980
[28] Xu, K., Deep learning for partial differential equations (PDEs) (2018)