Robust Deep Learning Algorithms for System Identification

In this dissertation we develop mathematically principled deep learning algorithms for system identification. Our algorithms are completely data-driven, robust to noise, and can be used to solve a variety of real-world problems arising in areas such as economics, biology, and finance. Governing laws for dynamical systems x'(t) = f(t,x) have traditionally been derived from expert knowledge and first principles; in recent years, however, the large amount of available data has led to growing interest in data-driven approaches for automated model discovery. System identification techniques aim to approximate the right-hand side (RHS) f using only observed data from the system. In this work we investigate the problem of approximating unknown governing equations, i.e., a vector-valued RHS f, directly from observations of x(t) using a neural network N. We provide two main contributions.

The first contribution is the use of a Lipschitz-regularized neural network N to learn governing equations from data. In particular, we add a Lipschitz regularization term to the loss function to force the Lipschitz constant of the approximating network to be small. We show empirically that this regularization results in a smoother approximating function and better generalization properties compared with non-regularized models, especially in the presence of noise in the data. These results are in line with the theoretical work of Calder and Oberman on Lipschitz regularization and with the work of Bartlett on generalization, which shows that smaller network weights lead to better generalization properties.

The second contribution is a set of three data preprocessing techniques: one based on splines, one based on Kalman filters, and one based on a family of neural networks. The target data used to train the network N are approximations of the velocity vector x'(t), which act as a prior for the unknown function f. The quality of the target data strongly influences the quality of the approximation of f: if the target data are not representative of the derivative, the network will not provide an accurate approximation of f. Data preprocessing and the generation of accurate target data are therefore fundamental components of our model. In the examples, we show that our techniques yield models that avoid overfitting and are robust to noise. We also show that, because the neural network preprocessing is based on a weak notion of solution obtained through integration, it can be used to reconstruct differential equations with a non-smooth RHS. Moreover, because we use a family of neural networks, the reconstruction remains accurate even for large amounts of noise in the data (up to 10%). In contrast, other existing methods in the literature can accurately reconstruct only equations with smooth solutions and lower noise levels.
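The Python sketch below illustrates the two ingredients described in the abstract: spline-based generation of velocity targets x'(t) (one of the three preprocessing options) and a training loss with an added Lipschitz penalty. The pairwise estimate of the Lipschitz constant, the network architecture, and the regularization weight lam are illustrative assumptions, not the dissertation's exact formulation.

    import numpy as np
    import torch
    import torch.nn as nn
    from scipy.interpolate import CubicSpline

    def spline_velocity_targets(t, x):
        # Fit a cubic spline to each state component and evaluate its first
        # derivative at the sample times; these values serve as targets for x'(t).
        return np.stack([CubicSpline(t, x[:, i])(t, 1) for i in range(x.shape[1])],
                        axis=1)

    class RHSNet(nn.Module):
        # Small fully connected network approximating the right-hand side f(t, x).
        def __init__(self, dim, width=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(dim + 1, width), nn.Tanh(),
                nn.Linear(width, width), nn.Tanh(),
                nn.Linear(width, dim),
            )

        def forward(self, t, x):
            # t has shape (n, 1), x has shape (n, dim).
            return self.net(torch.cat([t, x], dim=1))

    def empirical_lipschitz(model, t, x):
        # Crude estimate of the Lipschitz constant of N on the training points:
        # max over pairs of ||N(z_i) - N(z_j)|| / ||z_i - z_j||, with z = (t, x).
        z = torch.cat([t, x], dim=1)
        out = model(t, x)
        dist_in = torch.cdist(z, z) + 1e9 * torch.eye(len(z))  # mask the diagonal
        dist_out = torch.cdist(out, out)
        return (dist_out / dist_in).max()

    def lipschitz_regularized_loss(model, t, x, v_target, lam=1e-3):
        # Data-fit term against the velocity targets plus the Lipschitz penalty;
        # lam is an illustrative regularization weight.
        pred = model(t, x)
        return (nn.functional.mse_loss(pred, v_target)
                + lam * empirical_lipschitz(model, t, x))

In practice, such a loss would be minimized with a standard optimizer (e.g. Adam) over batches of (t_i, x_i, x'_i) triples produced by one of the preprocessing techniques.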

Creator
Contributors
Degree
Unit
Publisher
Identifier
  • etd-63251
Keyword
Advisor
Orcid
Committee
Defense date
Year
  • 2022
Date created
  • 2022-04-23
Resource type
Rights statement
Last modified
  • 2023-10-09

Permanent link to this page: https://digital.wpi.edu/show/pv63g3374