
An ODE-based neural network with Bayesian optimization. (English) Zbl 1531.65086

Summary: An application of Bayesian optimization to an ordinary differential equation-based neural network is proposed. The loss function is treated as a black-box function of the coefficients, and Bayesian optimization is applied to obtain desirable parameter values. The proposed method drastically simplifies the implementation because adjoint method-based updating of the coefficients is not required. Numerical experiments demonstrate that the performance of the proposed method is comparable to that of existing methods.
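
The idea in the summary can be illustrated with a minimal sketch using only NumPy. It is not the authors' implementation: the scalar ODE dx/dt = tanh(w x + b), the explicit Euler integrator, the synthetic data, the RBF-kernel Gaussian process, the expected-improvement acquisition and all function names (`ode_forward`, `loss`, `gp_posterior`, `bayes_opt`) are assumptions chosen for illustration. The point it shows is that the coefficients (w, b) are tuned from loss evaluations alone, with no adjoint or gradient computation.

```python
import math
import numpy as np

def ode_forward(x0, theta, steps=20, T=1.0):
    """Flow map of the toy ODE dx/dt = tanh(w*x + b), explicit Euler (assumed model)."""
    w, b = theta
    x = np.array(x0, dtype=float)
    h = T / steps
    for _ in range(steps):
        x = x + h * np.tanh(w * x + b)
    return x

def loss(theta, x_train, y_train):
    """Mean squared error; the optimizer only ever sees this scalar value."""
    return float(np.mean((ode_forward(x_train, theta) - y_train) ** 2))

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two sets of points."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """Gaussian-process posterior mean and standard deviation at Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - (v**2).sum(0), 1e-12, None)
    return Ks.T @ alpha, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    """Expected improvement for minimization."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf

def bayes_opt(f, bounds, n_init=5, n_iter=20, n_cand=500, seed=0):
    """Minimize f over a box using a GP surrogate and random candidate search."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    X = rng.uniform(lo, hi, size=(n_init, len(lo)))
    y = np.array([f(x) for x in X])
    for _ in range(n_iter):
        ys = (y - y.mean()) / (y.std() + 1e-12)  # standardize observed losses
        cand = rng.uniform(lo, hi, size=(n_cand, len(lo)))
        mu, sigma = gp_posterior(X, ys, cand)
        x_next = cand[np.argmax(expected_improvement(mu, sigma, ys.min()))]
        X = np.vstack([X, x_next])
        y = np.append(y, f(x_next))
    return X[np.argmin(y)], float(y.min()), y

# Synthetic regression data generated from known coefficients (illustrative only).
x_train = np.linspace(-1.0, 1.0, 16)
y_train = ode_forward(x_train, (0.8, -0.3))
bounds = np.array([[-2.0, 2.0], [-2.0, 2.0]])

theta_best, loss_best, history = bayes_opt(
    lambda th: loss(th, x_train, y_train), bounds
)
print("best coefficients:", theta_best, "loss:", loss_best)
```

Each iteration costs one forward solve of the ODE, so the approach suits models with few coefficients; the surrogate and acquisition used here are common defaults and could be swapped for those of a Bayesian optimization library.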

MSC:

65K15 Numerical methods for variational inequalities and related problems
65L07 Numerical investigation of stability of solutions to ordinary differential equations
68T07 Artificial neural networks and deep learning

Software:

torchdiffeq
Full Text: DOI

References:

[1] W. E, A proposal on machine learning via dynamical systems, Commun. Math. Stat., 5 (2017), 1-11. · Zbl 1380.37154
[2] D. Ruiz-Balet and E. Zuazua, Neural ODE control for classification, approximation and transport, HAL (2023), hal-03997183.
[3] R. T. Q. Chen, Y. Rubanova, J. Bettencourt and D. Duvenaud, Neural ordinary differential equations, Adv. Neural Inf. Process. Syst., 31 (2018), 6572-6583.
[4] L. Ruthotto and E. Haber, Deep neural networks motivated by partial differential equations, J. Math. Imaging Vis., 62 (2020), 352-364. · Zbl 1434.68522
[5] H. Honda, On a partial differential equation based neural network, IEICE Commun. Express, 10 (2021), 137-143.
[6] .
[7] W. E, J. Han and Q. Li, A mean-field optimal control formulation of deep learning, Res. Math. Sci., 6 (2019), 10. · Zbl 1421.49021
[8] P. I. Frazier and J. Wang, Bayesian optimization for materials design, in: T. Lookman, F. J. Alexander and K. Rajan (eds), Information Science for Materials Discovery and Design, pp. 45-75, Springer International Publishing, Cham, 2015.
[9] .
[10] B. Shahriari et al., Taking the human out of the loop: a review of Bayesian optimization, Proc. of IEEE, 104 (2016), 148-175.
[11] R. Garnett et al., Bayesian optimization for sensor set selection, in: Proc. of ACM/IEEE International Conference on Information Processing in Sensor Networks, pp. 209-219, IEEE, 2010.
[12] .
[13] .
[14] E. Giné and R. Nickl, Mathematical Foundations of Infinite-Dimensional Statistical Models, Cambridge University Press, Cambridge, 2015.
[15] H. J. Kushner, A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise, J. Basic Eng., 86 (1964), 97-106.
[16] J. Mockus, Application of Bayesian approach to numerical methods of global and stochastic optimization, J. Global Optim., 4 (1994), 347-365. · Zbl 0801.90099
[17] P. L. Bartlett et al., Convexity, classification, and risk bounds, J. Am. Stat. Assoc., 101 (2006), 138-156. · Zbl 1118.62330
[18] M. E. Sander et al., Momentum residual neural networks, in: Proc. of the 38th International Conference on Machine Learning, pp. 1527-1554, 2021.
[19] GPyOpt, https://sheffieldml.github.io/GPyOpt/ (accessed 9 Aug. 2023).
[20] scikit-optimize, https://scikit-optimize.github.io/stable/ (accessed 9 Aug. 2023).
[21] scipy.integrate.odeint, https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.odeint.html (accessed 9 Aug. 2023).
[22] https://www.kaggle.com/code/prasadperera/the-boston-housing-dataset (accessed 9 Aug. 2023).
[23] https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_california_housing.html (accessed 9 Aug. 2023).
[24] .
[25] S. Takeno et al., Multi-fidelity Bayesian optimization with max-value entropy search and its parallelization, Proc. Mach. Learn. Res., 119 (2020), 9334-9345.
[26] R. Storn and K. Price, Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces, J. Global Optim., 11 (1997), 341-359. · Zbl 0888.90135
[27] .
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.