### THRUST 2: AI FOR PHYSICS-INFORMED MODELS

Our goal is to learn physically interpretable models of dynamical systems from offline and/or online streaming data. Physics-informed learning is of growing importance for scientific and engineering problems. "Physics-informed" refers to constraining the learning process with physical and/or engineering principles. For instance, conservation of mass, momentum, or energy can be imposed during learning. In the parlance of ML, such imposed constraints are referred to as regularizers. Thus, physics-informed learning focuses on adding regularization to the learning process to impose or enforce physical priors.

There are four major stages in machine learning: 1) determining a high-level task or objective, 2) collecting and curating the training data, 3) identifying the model architecture and parameterization, and 4) choosing an optimization strategy to determine the parameters of the model from the data. Known physics (e.g., invariances, symmetries, conservation laws, and constraints) may be incorporated in each of these stages. For example, rotational invariance is often incorporated by augmenting the training data with rotated copies, and translational invariance is often captured using convolutional neural network architectures. In kernel-based techniques, such as Gaussian process regression and support vector machines, symmetries can be imposed by means of rotation-invariant, translation-invariant, and symmetric covariance kernels. Further physics and prior knowledge may be incorporated as additional loss functions or constraints in the optimization problem.
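To make the idea of a physics prior acting as a regularizer concrete, the following is a minimal sketch rather than a method used by the institute. It assumes a toy two-species exchange system whose total mass is conserved, so the columns of the true dynamics matrix sum to zero. The function name `fit_linear_dynamics`, the rate `k`, the noise level, and the penalty weight `lam` are all illustrative assumptions. The fit minimizes the data misfit plus `lam` times the squared column sums of the learned matrix.

```python
import numpy as np

def fit_linear_dynamics(X, dX, lam=0.0):
    """Fit dX ~ A @ X by minimizing ||A X - dX||_F^2 + lam * ||1^T A||^2.

    The second term is a soft mass-conservation prior (hypothetical example):
    it penalizes nonzero column sums of A. Setting lam=0 gives ordinary
    least squares.
    """
    n = X.shape[0]
    G = X @ X.T            # (n, n) Gram matrix of the states
    B = dX @ X.T           # (n, n) cross term between derivatives and states
    I = np.eye(n)
    J = np.ones((n, n))    # J = 1 1^T, so J @ A repeats the column sums of A
    # Stationarity condition: A G + lam * J A = B.
    # Vectorize column-major using vec(P A Q) = (Q^T kron P) vec(A).
    K = np.kron(G.T, I) + lam * np.kron(I, J)
    a = np.linalg.solve(K, B.reshape(-1, order="F"))
    return a.reshape((n, n), order="F")

# Synthetic data from an assumed exchange system: dx/dt = A_true x.
rng = np.random.default_rng(0)
k = 0.5
A_true = np.array([[-k, k], [k, -k]])   # columns sum to zero (mass conserved)
X = rng.normal(size=(2, 50))            # sampled states
dX = A_true @ X + 0.1 * rng.normal(size=(2, 50))  # noisy derivatives

A_plain = fit_linear_dynamics(X, dX, lam=0.0)     # no physics prior
A_phys = fit_linear_dynamics(X, dX, lam=100.0)    # with conservation penalty

# Mass defect: norm of the column sums (zero for an exactly conservative model).
mass_defect_plain = np.linalg.norm(A_plain.sum(axis=0))
mass_defect_phys = np.linalg.norm(A_phys.sum(axis=0))
print("mass defect without prior:", mass_defect_plain)
print("mass defect with prior:   ", mass_defect_phys)
```

Because the penalty is soft, the learned model only approximately conserves mass; raising `lam` tightens the constraint. A hard constraint would instead restrict the parameterization of `A` itself, which corresponds to building the physics into the model architecture (stage 3) rather than the optimization (stage 4).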

**FACULTY THRUST LEADS**

**MISSION STATEMENT**

Our mission is to leverage data and AI to automatically discover the underlying mechanisms and mathematical representations that best explain observations across a variety of evolving/changing natural systems. We aim to responsibly use these mathematical formulations to gain insights, make predictions, and discover governing equations. We also aim to improve machine learning processes for time-series data and to understand their limits.

**GOALS & MILESTONES**

1st year:

• AI Institute Inaugural Workshop: presentations, posters, and networking

• Quarterly institute-wide meetings to share research, form collaborations and discuss goals

• Recruit graduate students and postdocs for AI Institute projects

• Produce a diverse range of publications, software, and preprints

2nd year:

• AI common task framework workshop 2023: Curate data sets, both synthetic and experimental, for developing and testing learning algorithms

• “Modeling modelability”: Develop principles for understanding which datasets can be modeled and which can’t

• Combine statistical techniques with low-rank representations for systems with control

• Understand the difference between statistical models and physics-based models: using fluid flows as a case study, determine how machine learning for physical systems differs from machine learning for images, text, and speech

• Generate datasets from molecular dynamics (MD) simulations for coarse graining, and add new features to the open-source MD package JAX-MD (https://github.com/jax-md/jax-md)

• Develop computationally cheaper yet accurate reduced-order models of complex transient fluid flows (turbulent, non-Newtonian, multiphase) from simulation-generated data

3rd year:

• Validate algorithms on test data sets using expert domain knowledge, including their weaknesses, consistency, and failure modes

• Identify important problems that are lacking in data resources, and then develop open-source datasets and simulations for use in modeling these problems

• Develop strong baseline models for the problems outlined in year 2. Attempt to innovate and improve upon them by experimenting with different and new methods of modeling.

• Develop high-fidelity physics simulations for system identification methods (broad)

• Build a terramechanics simulation environment based on finite element modeling for planetary surface robots with contact-rich dynamics (specific)

• Perform system identification of a terramechanics model that captures rigid-body state evolution (specific)

• Apply coarse-graining methods to the datasets generated from MD, and use the coarse-grained features to accelerate MD simulations

• Map the space of datasets: Develop foundations for comparing datasets by understanding the similarity of the underlying generating mechanisms

5th year:

• Apply algorithms across new domains to gain fundamental understanding of myriad systems

• Automate the matching between modeling algorithms, representations, and datasets

• Re-analyze the similarities and differences between statistical models and physics-based models

• Summarize the lessons learned about improving upon the modeling baselines developed in year 3. Analyze the implications of these results from both a theoretical (mathematical) perspective and an empirical (practitioner-oriented) perspective.

• Develop open-source uncertainty modeling toolkits for experimental data validation

• Connect MD simulations with CFD (i.e., connect features between JAX-MD and the open-source JAX-CFD: https://github.com/google/jax-cfd) for better modeling and design of biological matter

• Real-time system identification of terramechanics in field tests
