APPM 4720/5720 "Open Topics in Applied Mathematics: Scientific Machine Learning (SciML)"
University of Colorado Boulder, Fall 2024. Instructor Stephen Becker, with assistance from Nic Rummel, and course prep help from Cooper Simpson
For a detailed list of class policies see the policies document.
Machine learning, and in particular deep learning, has been successfully applied across almost every field imaginable. As a result, there are many perspectives one might take when learning and presenting material from this vast subject. This course takes the perspective of an applied mathematician and will cover topics specifically relevant to this background. Many other perspectives exist; see Related Courses for a list specific to CU.
We will call our perspective Scientific Machine Learning, or SciML for short. SciML is a relatively new field and term, so there is no widely agreed-upon definition. We will define it as follows:
Scientific Machine Learning, often abbreviated SciML, combines classical aspects of computational science with techniques in machine learning, with a focus on solving problems in the scientific domain.
Often this involves fusing topics traditionally considered part of applied math, such as partial differential equations (PDEs), with deep learning.
Examples of topics this course does not cover include natural language processing (NLP), large language models (LLMs), robotics, and computer vision. We consider these to be in the domain of other departments.
After the course, students will be ready to perform research in SciML (in industry or academia), and will understand subtleties and pitfalls in classical, data-driven, and hybrid methods. They will understand the pros and cons of various approaches, and be familiar with at least one recent approach in each of several domains of SciML.
With these learning goals in mind, the course is designed primarily for PhD students. Undergraduates are welcome to take the course but should be aware that it is designed for researchers.
- Students will understand standard ML concepts such as bias, variance, expressiveness, learnability, training, validation, over-fitting, and generalization. They will learn best practices such as validation and cross-validation, industry-standard packages and libraries (PyTorch or JAX in Python, or Lux in Julia), and other coding tools (debugging, profiling, logging and visualization, git). They will also understand the necessary concepts from probability.
- Students will have a basic knowledge of optimization and of training neural networks, and will be able to train their own neural network
- Students will understand basic HPC concepts, including the memory hierarchy, and how forward- and reverse-mode automatic differentiation work, as well as their limitations
- Students will know basic linear algebra and numerical analysis for solving differential equations; in particular, they will be able to solve differential equations numerically using classical methods (not necessarily "from scratch", but at least knowing how to use appropriate libraries) and will understand basic physical modeling
- Finally, students will be aware of recent SciML approaches, know when SciML can be useful and what its shortcomings are, and be able to implement an existing approach
- To this end, students will produce a final report involving a professional writeup and coding
- A theme for the course is "scientific debugging" (or the related ideas of Verification and Validation / Uncertainty Quantification), which includes all the usual levels of code debugging but also higher-level concepts applicable to the entire scientific workflow. In research there is no answer in the back of the book, so how do we know when we get it right? How do we know a piece of code is correct? How do we know a discretization is accurate? We will use tools like validation, refinement studies, etc.; a short refinement-study sketch follows this list
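As a taste of this theme, below is a minimal refinement-study sketch (illustrative only, not course-provided code): it solves y' = -y, y(0) = 1, with forward Euler and checks that the observed convergence order matches the expected first-order rate as the step size is halved.

```python
# Minimal refinement study (illustrative sketch): forward Euler for y' = -y, y(0) = 1.
# If the implementation is correct, the error at T = 1 should shrink by roughly
# a factor of 2 each time the step size h is halved (first-order convergence).
import numpy as np

def forward_euler(h, T=1.0):
    n = int(round(T / h))
    y = 1.0
    for _ in range(n):
        y += h * (-y)          # y_{k+1} = y_k + h * f(y_k), with f(y) = -y
    return y

exact = np.exp(-1.0)
hs = [0.1 / 2**k for k in range(6)]
errors = [abs(forward_euler(h) - exact) for h in hs]

# Observed order from successive error ratios; should be close to 1
orders = np.log2(np.array(errors[:-1]) / np.array(errors[1:]))
print("errors:", errors)
print("observed convergence orders:", orders)
```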
We currently plan to focus on Python and the PyTorch library; however, we may require graduate students to do some assignments in two separate libraries or languages (e.g., Python and Julia) in order to see the similarities and differences.
In the future, we may switch to either Python with the JAX library, or to Julia with the Lux or Flux library.
Some demonstrations may be done in Julia in order to illustrate concepts.
The topics below are not listed in the order they will be covered in class.
- NumPy and basic scientific computing (floating point representations...)
- Frameworks
- PyTorch, Lightning, etc.
- Remote computing
- git and GitHub
- managing libraries and virtual environments
- IDEs
- debuggers, profilers, loggers
- Memory models
- GPU, TPU and SIMD
- Automatic Differentiation (AD); see the dual-number sketch after this topic list
- Forward vs backward mode
- Mathematical formulation
- Implementation
- Pros/cons of each, e.g., memory issues
- Mixed mode
- Limitations
- Complicated scenarios
- Loops, inverses, and recurrent networks
- Differentiation of iterative solvers
- Differentiable code
- Forward vs backward mode
- Types of solutions; stationary points; convexity
- Stochastic Gradient Descent (SGD) and variants; see the training-loop sketch after this topic list
- Basic linear algebra; solving linear regression efficiently and accurately (see the least-squares sketch after this topic list)
- Roundoff error
- Solving ODEs "by hand" and finite differences
- Approaches to solving PDEs and integral equations (and calling libraries to solve them)
- Learning problem setup
- Supervised vs Unsupervised vs Reinforcement
- Goals; generalization; regularization; overfitting
- Simple models
- Activations
- Feed forward/Dense layers
- Convolutions
- Universal approximation theorem
- Going beyond MLPs
- Recurrent layers
- Residual connections
- Autoencoders / U-Nets
- Transformers
- Normalization
- Data
- Layers
- Misc
- Parameter initialization
- Train/Test/Validation
- Cross-validation
- Tricks (dropout, gradient clipping)
- Ablation studies
- Feature selection/engineering. Positional encodings
- Vanishing gradients, mode collapse
- Gaussian Processes
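To connect the automatic differentiation topics above to code, here is a minimal, illustrative sketch of forward-mode AD using dual numbers (the Dual class and the test function f are made up for illustration; PyTorch and JAX implement this far more generally and efficiently):

```python
# Forward-mode AD via dual numbers: each value carries its derivative ("tangent") along.
# Illustrative sketch only; a real AD system handles many more operations and control flow.
import math

class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot   # value and derivative w.r.t. the seeded input

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def sin(x):
    # chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def f(x):
    return sin(x) * x + x          # f(x) = x sin(x) + x

x = Dual(2.0, 1.0)                 # seed the tangent with 1 to get df/dx
y = f(x)
print("AD value/derivative:", y.val, y.dot)
print("analytic derivative:", math.sin(2.0) + 2.0 * math.cos(2.0) + 1.0)
```

Reverse mode, by contrast, records the computation and propagates sensitivities backwards through it, which is what loss.backward() does in PyTorch.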
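Relating to the linear-algebra and roundoff items, the following sketch (with synthetic data, purely for illustration) compares solving an ill-conditioned least-squares problem via the normal equations versus NumPy's SVD-based lstsq; forming A^T A squares the condition number and typically loses far more digits:

```python
# Least squares two ways on an ill-conditioned problem (illustrative, synthetic data).
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
A = np.vander(t, 12, increasing=True)         # monomial (Vandermonde) basis: badly conditioned
x_true = rng.standard_normal(A.shape[1])
b = A @ x_true

x_ne = np.linalg.solve(A.T @ A, A.T @ b)      # normal equations: cond(A^T A) = cond(A)^2
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)  # SVD-based, backward stable

print("cond(A)               :", np.linalg.cond(A))
print("normal-equations error:", np.linalg.norm(x_ne - x_true) / np.linalg.norm(x_true))
print("lstsq error           :", np.linalg.norm(x_ls - x_true) / np.linalg.norm(x_true))
```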
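Finally, a minimal sketch of the training workflow (mini-batch SGD plus a held-out validation set to monitor over-fitting), assuming synthetic 1D regression data and an arbitrary small MLP:

```python
# Minimal PyTorch training loop (illustrative sketch; data and architecture are made up).
import math
import torch
from torch import nn

torch.manual_seed(0)
x = torch.rand(1000, 1)
y = torch.sin(2 * math.pi * x) + 0.1 * torch.randn_like(x)   # noisy y = sin(2*pi*x)

n_train = 800                                                # simple train/validation split
x_train, y_train, x_val, y_val = x[:n_train], y[:n_train], x[n_train:], y[n_train:]

model = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
loss_fn = nn.MSELoss()

for epoch in range(200):
    perm = torch.randperm(n_train)                           # shuffle for mini-batch SGD
    for i in range(0, n_train, 32):
        idx = perm[i:i + 32]
        opt.zero_grad()
        loss_fn(model(x_train[idx]), y_train[idx]).backward()
        opt.step()
    if epoch % 50 == 0:
        with torch.no_grad():
            print(f"epoch {epoch:3d}   val MSE {loss_fn(model(x_val), y_val).item():.4f}")
```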
There are countless applications of deep learning, but we will focus on those that are most relevant to practitioners in applied math.
- Infinite dimensional mappings
- Unstructured and Multi-Resolution Data
- Graph convolution
- Fourier convolution
- Quadrature convolution
- Implicit neural representation
- Invariance, equivariance, and enforcing physical laws
- Forward problems
- Closure modeling
- Surrogate and reduced-order modeling (ROM)
- Inverse problems
- Uncertainty Quantification
These will typically follow specific papers; the following are just examples of what we might cover.
- Implicit neural representation (ex: SIRENs)
- Neural differential equations (Neural DEs)
- Physics-informed neural nets (PINNs); see the minimal sketch after this list
- relation to FOSLS; drawbacks, comparisons to FEM, limitations
- Neural Operators
- Deep operator networks (DeepONets)
- Fourier neural operator; see the spectral-convolution sketch after this list
- Wavelet and Laplace operator
- Solving forward problems
- (Deterministic and Stochastic) Generative Models for ROM and UQ
- Diffusion models
- Autoencoders (AE), Variational AE (VAE) and Generative Adversarial Networks (GAN)
- Examples: Machine learning techniques to construct patched analog ensembles for data assimilation (by CU authors Yang and Grooms), Bi-fidelity Variational Auto-encoder for Uncertainty Quantification (by CU authors Cheng et al.)
- Other applications
- Inverse problems and exploiting a known forward operator, such as unrolling approaches like this MRI neural net
- Closure modeling, Invariant data-driven subgrid stress modeling in the strain-rate eigenframe for large eddy simulation (by CU authors Prakash et al.)
- Time series and RNN, Density Estimation for Entry Guidance Problems using Deep Learning (by CU authors Rataczak et al.)
- Compression of scientific data
- Bayesian Neural Networks
- Deep learning for protein structure prediction (AlphaFold)
- Reinforcement learning (RL) for discovering matrix-multiplication algorithms (AlphaTensor, built on AlphaZero)
- Universal instability theorem and implications
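To make the PINN item above concrete, here is a minimal, illustrative sketch (not taken from any particular paper; the network size and training settings are arbitrary). It penalizes the residual of u''(x) = -pi^2 sin(pi x) on (0, 1) with u(0) = u(1) = 0 at random collocation points, using automatic differentiation for the derivatives; the exact solution is u(x) = sin(pi x).

```python
# Minimal PINN sketch (illustrative): learn u(x) satisfying u'' = -pi^2 sin(pi x), u(0)=u(1)=0.
import math
import torch
from torch import nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
                    nn.Linear(32, 32), nn.Tanh(),
                    nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def pde_residual(x):
    x = x.requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]    # u'(x)
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]  # u''(x)
    return d2u + math.pi**2 * torch.sin(math.pi * x)

x_bc = torch.tensor([[0.0], [1.0]])                               # boundary points
for step in range(2000):
    x_col = torch.rand(128, 1)                                    # interior collocation points
    loss = pde_residual(x_col).pow(2).mean() + net(x_bc).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    x_test = torch.linspace(0, 1, 101).reshape(-1, 1)
    err = (net(x_test) - torch.sin(math.pi * x_test)).abs().max()
print("max error vs exact solution sin(pi x):", err.item())
```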
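Similarly, a minimal sketch of the spectral-convolution building block that Fourier neural operators use, under simplifying assumptions (1D, a fixed number of retained Fourier modes, and no pointwise bypass path or nonlinearity):

```python
# Illustrative 1D spectral convolution: learned complex weights on the lowest Fourier modes.
import torch
from torch import nn

class SpectralConv1d(nn.Module):
    def __init__(self, channels, modes):
        super().__init__()
        self.modes = modes
        self.weight = nn.Parameter(
            (1.0 / channels) * torch.randn(channels, channels, modes, dtype=torch.cfloat))

    def forward(self, x):                       # x: (batch, channels, grid)
        x_hat = torch.fft.rfft(x)               # (batch, channels, grid//2 + 1)
        out_hat = torch.zeros_like(x_hat)
        # mix channels mode-by-mode on the retained low frequencies only
        out_hat[:, :, :self.modes] = torch.einsum(
            "bim,iom->bom", x_hat[:, :, :self.modes], self.weight)
        return torch.fft.irfft(out_hat, n=x.size(-1))

layer = SpectralConv1d(channels=8, modes=12)
y = layer(torch.randn(4, 8, 64))                # output has the same shape as the input
print(y.shape)
```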
As this is a recent field, there is no standard textbook to follow. We'll use supplementary material (often from the web) as needed, such as the MIT and NVIDIA courses mentioned below.
This list is not exhaustive and mostly focuses on CSCI. Departments such as ASEN, LING, INFO, and others also run classes covering relevant topics.
- CSCI 5922: Neural Networks and Deep Learning
- CSCI (4/5)622: Machine Learning
- CSCI 3832/5832: Natural Language Processing
- CSCI (4/5)722: Computer Vision
- CSCI (4/5)576: High-Performance Scientific Computing
- CSCI 5822: Probabilistic Modeling of Human and Machine Intelligence
A few courses from outside CU:
- MIT Parallel Computing and Scientific Machine Learning (SciML): Methods and Applications, which uses Julia
- NVIDIA Deep Learning for Science and Engineering, co-developed by Karniadakis (of PINN and DeepONet fame), which uses PyTorch
- Brittany Erickson's MATH 607 Seminar on Physics-Informed Deep Learning at the University of Oregon, which focused heavily on PINNs