Data-Driven Model Reduction, Scientific Frontiers, and Applications

Abhinav Narasingam, Artie McFerrin Department of Chemical Engineering
Data-Driven Identification of Interpretable Reduced-Order Models Using Sparse Regression


  • Abhinav Narasingam
  • Joseph Sang-Il Kwon


Reduced-order models (ROMs) are computationally inexpensive mathematical representations of a high-fidelity dynamical model that preserve its essential behaviors and dominant effects, offering the potential for near real-time analysis. The field of reduced-order modeling is large, and new techniques are being developed rapidly. Most model reduction methods can be classified as either subspace-based or projection-based. Subspace-based methods, including Subspace-based State Space System Identification (N4SID) [1] and Multivariable Output Error State Space (MOESP) [2], are attractive because they provide accurate state-space models for multivariable linear systems directly from input-output data. They have also been applied to nonlinear systems, with limited success, provided the system is properly excited. However, the states of the identified system carry no physical meaning, and a large model order is needed to fully resolve the nonlinear dynamics. Projection-based (or spectral) methods decompose a vector field into a set of modes and project the dynamics onto a low-dimensional space in which the solution lies. These methods assume a priori knowledge of the state evolution, with the system dynamics given by a set of partial or ordinary differential equations. Proper Orthogonal Decomposition [3], Eigensystem Realization [4], and Dynamic Mode Decomposition [5] are some of the widely used projection-based model reduction methods.
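As an illustration of the mode decomposition and projection step described above, the following is a minimal sketch of Proper Orthogonal Decomposition computed from a snapshot matrix with the singular value decomposition. The function name pod_basis, the energy threshold, and the synthetic two-mode field are assumptions made for this example and are not taken from the work summarized here.

```python
import numpy as np

def pod_basis(snapshots, energy=0.99):
    """Return the leading POD modes that capture `energy` of the snapshot variance.

    snapshots: array of shape (n_space, n_time), one column per time instant.
    """
    mean = snapshots.mean(axis=1, keepdims=True)            # temporal mean field
    U, s, _ = np.linalg.svd(snapshots - mean, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)              # captured energy fraction
    r = int(np.searchsorted(cumulative, energy) + 1)         # smallest r reaching `energy`
    return U[:, :r], mean

# Synthetic field built from two spatial structures (illustrative data only).
x = np.linspace(0.0, 1.0, 200)
t = np.linspace(0.0, 2.0 * np.pi, 80)
X = (np.outer(np.sin(np.pi * x), np.cos(t))
     + 0.5 * np.outer(np.sin(2.0 * np.pi * x), np.sin(2.0 * t)))

modes, mean = pod_basis(X, energy=0.999)
coeffs = modes.T @ (X - mean)        # project snapshots onto the low-dimensional space
X_rec = mean + modes @ coeffs        # reconstruct from the reduced coordinates
rel_err = np.linalg.norm(X - X_rec) / np.linalg.norm(X)
```

In a full projection-based ROM the governing equations would also be projected onto these modes; the sketch stops at the data compression step.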

A recent breakthrough in nonlinear model identification has been triggered by the use of regression techniques to discover the governing equations of dynamical systems by balancing model complexity with accuracy [6,7]. In this work, we approach the reduced-order modeling problem from the perspective of compressive sampling and sparse regression. We rely solely on time-series data collected at a fixed number of spatial locations to identify parsimonious and physically interpretable ROMs. The fundamental assumption is that only a few relevant terms dictate the system dynamics. This assumption is reasonable and holds for many physical systems, provided an appropriate set of basis functions is selected. Within this context, we perform nonlinear sparse regression on a large library of candidate functions to determine the fewest terms that most accurately represent the data. Because the candidate functions are built from the original system states and inputs, the resulting ROMs are expressed in those variables, and the identified dynamical system is interpretable. Advances in regression techniques offer several attractive features, such as avoiding overfitting, selecting sparse models, and sampling strategies for handling large amounts of data, all of which are readily applicable here. The method is not limited to linear model identification, since nonlinear terms can be included in the candidate library. Additionally, any a priori information about the system can be easily incorporated into the algorithm. We demonstrate the performance of the proposed regression-based model reduction method by designing a closed-loop operation of a hydraulic fracturing process, which is characterized by a system of highly coupled nonlinear PDEs with a time-dependent spatial domain [8].
In addition to achieving the desired model approximation accuracy, the proposed algorithm, by introducing delayed inputs into the model, readily captures the time-delay structure that primarily characterizes a fracturing operation. We also compare the performance of the models identified using the proposed method and the subspace-based MOESP method. The results show that the proposed method is competitive with MOESP in terms of both ROM accuracy and closed-loop performance.
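The idea of regressing onto a candidate library that includes delayed inputs can be sketched as follows. This example uses sequentially thresholded least squares, the solver described in [7]; the abstract does not state which sparse solver the authors used, so this choice is an assumption. The function name stlsq, the threshold, the toy first-order system with an input delay of 0.2, and the five-term library are all hypothetical and chosen only for illustration.

```python
import numpy as np

def stlsq(Theta, dXdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: find sparse Xi with Theta @ Xi ~ dXdt."""
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0                                  # prune weak candidate terms
        for k in range(dXdt.shape[1]):
            big = ~small[:, k]
            if big.any():                                # refit on the surviving terms
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], dXdt[:, k], rcond=None)[0]
    return Xi

# Toy system (assumed for illustration): dx/dt = -2 x + 3 u(t - delay)
dt, delay = 0.01, 0.2
t = np.arange(0.0, 10.0, dt)
u = lambda s: np.sin(s) + 0.5 * np.sin(3.1 * s)          # excitation signal

x = np.zeros_like(t)
x[0] = 1.0
for k in range(len(t) - 1):                              # forward-Euler simulation
    x[k + 1] = x[k] + dt * (-2.0 * x[k] + 3.0 * u(t[k] - delay))

# Derivative taken from the known model here; with measured data it would
# have to be estimated numerically from the time series.
dxdt = -2.0 * x + 3.0 * u(t - delay)

# Candidate library: constant, x, x^2, current input, delayed input.
Theta = np.column_stack([np.ones_like(t), x, x**2, u(t), u(t - delay)])
Xi = stlsq(Theta, dxdt[:, None], threshold=0.1)
```

The nonzero rows of Xi indicate which library terms were retained. In this toy case only the x column and the delayed-input column should survive, so the identified model exposes the delay structure directly in terms of the original state and input.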


  1. P. Van Overschee and B. De Moor, N4SID: Subspace Algorithms for the Identification of Combined Deterministic-Stochastic Systems, Automatica 30 (), 75–93.
  2. P. Van Overschee and B. De Moor, Subspace Identification for Linear Systems: Theory – Implementation – Applications, New York: Springer ().
  3. P. Holmes, J.L. Lumley, and G. Berkooz, Turbulence, Coherent Structures, Dynamical Systems and Symmetry, New York: Cambridge University Press ().
  4. J.N. Juang and R.S. Pappa, An Eigensystem Realization Algorithm for Modal Parameter Identification and Model Reduction, Journal of Guidance, Control, and Dynamics 8:5 (), 620–627.
  5. P.J. Schmid, Dynamic Mode Decomposition of Numerical and Experimental Data, J. Fluid Mech. 656 (), 5–28.
  6. J. Bongard and H. Lipson, Automated Reverse Engineering of Nonlinear Dynamical Systems, Proceedings of the National Academy of Sciences 104:24 (), 9943–9948.
  7. S.L. Brunton, J.L. Proctor, and J.N. Kutz, Discovering Governing Equations from Data by Sparse Identification of Nonlinear Dynamical Systems, Proceedings of the National Academy of Sciences 113:15 (), 3932–3937.
  8. M.J. Economides and K.G. Nolte, Reservoir Stimulation, Chichester: Wiley ().