Introduction to Scientific Machine Learning (SciML)

This exercise is part of the NumSim course, where you have already learned how to design numerical solvers, discretize PDEs, and analyze stability and accuracy. In this assignment, we extend these ideas and explore a modern and rapidly growing field: Scientific Machine Learning (SciML).

What Is SciML?

Scientific Machine Learning combines traditional scientific computing (e.g., solving differential equations, fluid dynamics, optimization) with modern machine learning tools (deep learning, neural networks, representation learning).

The goal is not to replace numerical solvers. Instead, SciML aims to complement classical methods by:

accelerating simulations,
approximating complex solution operators,
learning unknown physical parameters,
providing surrogate models for repeated queries,
and enabling simulations where traditional solvers are too expensive.

Why Is SciML Relevant?

Modern engineering and scientific applications often require solving the same PDE many times:

parameter studies (varying Reynolds number, viscosity, geometry),
inverse problems,
real-time control,
uncertainty quantification,
optimization loops.

Running a full CFD solver thousands of times can be prohibitively expensive. Neural networks can learn to approximate the mapping:

(initial condition, parameters) → (solution field)

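Once trained, such a model evaluates a new scenario with a single forward pass, often in milliseconds, which makes it attractive for real-time or large-scale tasks. The cost moves to data generation and training, which are paid once up front.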

What You Will Learn in This Exercise

In this assignment, you will develop a neural surrogate model for the 2D lid-driven cavity flow problem.

You will:

Generate simulation data by varying the Reynolds number.
Prepare the dataset by extracting fields, constructing input channels, and applying normalization.
Design and implement a convolutional neural network (CNN) that predicts steady-state velocity fields from given inputs.
Train and evaluate the model using a proper train/validation/test split, visualization, and error metrics.
Test generalization on out-of-distribution (OOD) cases: new Reynolds numbers, larger geometries, and changed boundary conditions.
Submit your results to a small Kaggle competition and compare performance with other groups.
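
The data preparation and evaluation steps listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the layout prescribed by the assignment: the grid size, the two input channels (a constant Reynolds-number map and a lid mask), the 14/3/3 split, and the helper names `build_inputs`, `normalize`, and `relative_l2_error` are all hypothetical, and random arrays stand in for real solver output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed layout: one steady-state sample per Reynolds number on an n x n grid.
n_samples, n = 20, 32
reynolds = np.linspace(100.0, 1000.0, n_samples)
# Placeholder for solver output: 2 channels (u, v) per sample. Not real data.
targets = rng.standard_normal((n_samples, 2, n, n))

def build_inputs(reynolds, n):
    """Stack two hypothetical input channels: constant Re map and lid mask."""
    count = len(reynolds)
    re_channel = np.broadcast_to(reynolds[:, None, None], (count, n, n))
    lid_mask = np.zeros((n, n))
    lid_mask[-1, :] = 1.0  # assumption: the last row is the moving lid
    lid_channel = np.broadcast_to(lid_mask, (count, n, n))
    return np.stack([re_channel, lid_channel], axis=1)  # (samples, 2, n, n)

inputs = build_inputs(reynolds, n)

# Split by sample, so that no simulation appears in two subsets.
perm = rng.permutation(n_samples)
train_idx, val_idx, test_idx = np.split(perm, [14, 17])

# Per-channel statistics come from the training set only, to avoid leakage.
mean = inputs[train_idx].mean(axis=(0, 2, 3), keepdims=True)
std = inputs[train_idx].std(axis=(0, 2, 3), keepdims=True)
std = np.where(std < 1e-12, 1.0, std)  # guard against constant channels

def normalize(x):
    """Apply the training-set normalization to any subset."""
    return (x - mean) / std

def relative_l2_error(pred, true):
    """Mean over samples of ||pred - true||_2 / ||true||_2."""
    diff = (pred - true).reshape(len(true), -1)
    ref = true.reshape(len(true), -1)
    return float(np.mean(np.linalg.norm(diff, axis=1) / np.linalg.norm(ref, axis=1)))

x_train, x_val, x_test = (normalize(inputs[idx]) for idx in (train_idx, val_idx, test_idx))
print(x_train.shape, x_val.shape, x_test.shape)
```

Computing the normalization statistics on the training subset alone keeps validation and test data unseen, which matters most when the test cases are deliberately out of distribution.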

Why a CNN?

You will use a convolutional neural network (CNN) because:

the inputs and outputs are 2D fields,
convolutions naturally encode local interactions (similar to PDE stencils),
CNNs are computationally efficient on grids,
they are simple enough for a first ML-based solver.

This exercise focuses on understanding foundations, not on using extremely large or complex architectures.
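
To make the points above concrete, here is a minimal sketch of a two-layer fully convolutional network written in plain NumPy. It is not the architecture expected in the assignment, and in practice you would use a deep learning framework; the names `conv2d_same` and `tiny_cnn`, the channel counts, and the random (untrained) weights are assumptions chosen for illustration. The sketch shows that the same small set of weights maps 2D input fields to 2D output fields on grids of different sizes.

```python
import numpy as np

def conv2d_same(x, weight, bias):
    """3x3 convolution with zero padding, so the output keeps the grid size.

    x: (c_in, H, W), weight: (c_out, c_in, 3, 3), bias: (c_out,)
    Implemented as cross-correlation, as is common in deep learning libraries.
    """
    c_out = weight.shape[0]
    h, w = x.shape[1:]
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))  # zero-pad both spatial axes
    out = np.zeros((c_out, h, w))
    for di in range(3):
        for dj in range(3):
            patch = xp[:, di:di + h, dj:dj + w]  # shifted copy of the input
            out += np.einsum("oc,chw->ohw", weight[:, :, di, dj], patch)
    return out + bias[:, None, None]

def tiny_cnn(x, params):
    """Two conv layers with a ReLU in between: (c_in, H, W) -> (c_out, H, W)."""
    (w1, b1), (w2, b2) = params
    hidden = np.maximum(conv2d_same(x, w1, b1), 0.0)
    return conv2d_same(hidden, w2, b2)

rng = np.random.default_rng(0)
c_in, c_hidden, c_out = 2, 8, 2  # assumed: 2 input channels, (u, v) as output
params = [
    (0.1 * rng.standard_normal((c_hidden, c_in, 3, 3)), np.zeros(c_hidden)),
    (0.1 * rng.standard_normal((c_out, c_hidden, 3, 3)), np.zeros(c_out)),
]
n_params = sum(w.size + b.size for w, b in params)

# The same weights apply to any grid size; only the cost grows with H * W.
small = tiny_cnn(rng.standard_normal((c_in, 32, 32)), params)
large = tiny_cnn(rng.standard_normal((c_in, 64, 64)), params)
print(n_params, small.shape, large.shape)
```

The parameter count depends on the channel counts and the kernel size, not on the grid resolution. That is one reason a CNN can at least be evaluated on larger geometries, although whether its predictions remain accurate there is a separate question.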

How Does This Connect to Numerical Simulation?

Convolutional neural networks can be loosely viewed as learned discretization schemes.

In classical numerical simulation, you design rules like:

u[i, j] ← function of neighboring points

A CNN does something similar: each convolutional filter computes an update based on local neighbors.
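
As a small illustration of this analogy, the five-point finite-difference Laplacian can be written as a single 3x3 filter applied to a grid. The sketch below uses plain NumPy; the function name `apply_stencil` and the test field u = x^2 + y^2 (whose Laplacian is exactly 4) are choices made for this example.

```python
import numpy as np

def apply_stencil(u, kernel):
    """Apply a 3x3 stencil at every interior point of u (no padding)."""
    h, w = u.shape
    out = np.zeros((h - 2, w - 2))
    for di in range(3):
        for dj in range(3):
            # Shifted view of u, weighted by the matching stencil entry.
            out += kernel[di, dj] * u[di:di + h - 2, dj:dj + w - 2]
    return out

# Five-point Laplacian stencil for unit grid spacing.
laplace_kernel = np.array([[0.0, 1.0, 0.0],
                           [1.0, -4.0, 1.0],
                           [0.0, 1.0, 0.0]])

n = 33
dx = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
u = X**2 + Y**2  # analytic Laplacian is 4 everywhere

# Convolution-style evaluation of the stencil.
lap = apply_stencil(u, laplace_kernel) / dx**2

# Classical finite-difference update written out by hand.
lap_fd = (u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
          - 4.0 * u[1:-1, 1:-1]) / dx**2

print(np.allclose(lap, lap_fd), np.allclose(lap, 4.0))
```

In this sketch the filter weights are written down by hand from the finite-difference formula.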

However:

in numerical solvers, the stencil is derived from the governing equations and the chosen discretization,
in a CNN, the stencil is learned from data.

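Stacking many such learned filters with nonlinearities in between allows the network to approximate the PDE solution operator directly, although only as well as the training data permits.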
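
The idea of a stencil learned from data can be shown in a deliberately simplified setting: a single linear 3x3 filter, fitted by least squares instead of gradient descent, on synthetic noise-free data. None of this is the training procedure of the assignment. The helper `extract_patches`, the random training fields, and the choice of the five-point Laplacian as the hidden "true" operator are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# The operator that generated the data; the fit below does not use it directly.
true_kernel = np.array([[0.0, 1.0, 0.0],
                        [1.0, -4.0, 1.0],
                        [0.0, 1.0, 0.0]])

def extract_patches(u):
    """Return every 3x3 neighborhood of u as one row of a (points, 9) matrix."""
    h, w = u.shape
    cols = [u[di:di + h - 2, dj:dj + w - 2].ravel()
            for di in range(3) for dj in range(3)]
    return np.stack(cols, axis=1)

# Training data: random input fields and the operator's output at interior points.
fields = rng.standard_normal((10, 16, 16))
A = np.concatenate([extract_patches(u) for u in fields])  # (10 * 14 * 14, 9)
b = A @ true_kernel.ravel()                               # observed outputs

# Fit the 9 filter weights so that A @ weights matches the observations.
weights, *_ = np.linalg.lstsq(A, b, rcond=None)
learned_kernel = weights.reshape(3, 3)

print(np.round(learned_kernel, 6))
```

With noise-free data and a linear operator, the fit recovers the stencil up to rounding error. A real CNN surrogate differs in every one of these respects: it has many filters and nonlinearities, is trained iteratively, and is only as reliable as its training data.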

Learning Outcomes

By the end of this exercise, you should understand:

how to connect simulations with machine learning workflows,
how to build and train neural networks for PDE-related tasks,
differences between fitting data and learning physics,
strengths and limitations of ML solvers,
basic SciML terminology and methodology.