Lecture 6: Constrained Optimization and Polynomial Reproduction for Learning Finite Difference Stencils

Topics: Constrained optimization, polynomial reproduction, moving least squares, Schur complement method

0. Overview

This lecture establishes the mathematical foundations for learning finite difference (FD) stencils from data while enforcing physical and mathematical constraints. We introduce constrained optimization as a central tool for ensuring that learned numerical methods satisfy desired properties such as conservation laws, consistency, and stability.

The lecture connects three critical ideas: (1) equality-constrained quadratic programming using Lagrange multipliers and the Schur complement, (2) strategies for training FD stencils from noisy data (penalty methods, reduced-space optimization, and Lagrange multiplier methods), and (3) polynomial reproduction as a systematic framework for constructing high-quality stencils that exactly replicate polynomial solutions. The polynomial reproduction framework builds on moving least squares (MLS) and provides a dual optimization perspective: we seek minimal-norm stencil weights that satisfy reproduction constraints.

This material bridges classical numerical analysis (consistency, stability, convergence) with modern machine learning (data-driven stencil discovery, constrained neural networks). Understanding these techniques is essential for designing physics-informed neural networks and learning numerical schemes that respect fundamental physical principles like energy conservation and translation invariance.

1. Constrained Optimization Framework

1.1 General Constrained Optimization Problem

1.2 Equality Constrained Quadratic Programming

Note: For nonlinear problems, replace $A$ with the Hessian of the Lagrangian and $B$ with the Jacobian of the constraints, so that each Newton step solves a system of the same form.

1.3 Schur Complement Solution

Step 2: Extract the constraint $Bx = d$ from the second row of (KKT) and substitute:

Important: We will use this Schur complement formula repeatedly as a tool to enforce properties we desire as equality constraints in stencil learning.
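As a minimal sketch of the Schur complement solve, the NumPy snippet below assumes the problem is written as $\min_x \frac{1}{2} x^\top A x - c^\top x$ subject to $Bx = d$, with $A$ symmetric positive definite and the sign convention $Ax + B^\top \lambda = c$ for the first KKT row. The vector `c`, the function name `solve_eq_qp`, and the random test data are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

def solve_eq_qp(A, c, B, d):
    """Minimize 0.5 x^T A x - c^T x subject to B x = d (A assumed SPD)."""
    Ainv_c = np.linalg.solve(A, c)        # A^{-1} c
    Ainv_Bt = np.linalg.solve(A, B.T)     # A^{-1} B^T
    S = B @ Ainv_Bt                       # Schur complement S = B A^{-1} B^T
    lam = np.linalg.solve(S, B @ Ainv_c - d)
    x = Ainv_c - Ainv_Bt @ lam            # x = A^{-1} (c - B^T lam)
    return x, lam

# Hypothetical test problem: 5 unknowns, 2 equality constraints.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5.0 * np.eye(5)             # symmetric positive definite
c = rng.standard_normal(5)
B = rng.standard_normal((2, 5))
d = rng.standard_normal(2)

x, lam = solve_eq_qp(A, c, B, d)

# Cross-check against a direct solve of the full KKT system.
K = np.block([[A, B.T], [B, np.zeros((2, 2))]])
x_kkt = np.linalg.solve(K, np.concatenate([c, d]))
print(np.allclose(np.concatenate([x, lam]), x_kkt))
```

The Schur route only ever factors $A$ and the small matrix $S$, which is why it is attractive when the number of constraints is much smaller than the number of unknowns.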

2. Training Strategies for Data-Driven Stencil Learning

2.1 A Note on Force Matching vs. Solution Fitting

In homework problems, we often match derivatives (also called force matching in molecular dynamics):

2.2 Three Approaches to Constrained Fitting

Method 1: Penalty Method (Worst!)
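To illustrate why the penalty weight is hard to choose, the sketch below fits a hypothetical 5-point stencil to synthetic noisy second-derivative data and adds the consistency constraints $Bw = d$ as a quadratic penalty. The data, the stencil, and the name `mu` for the penalty weight (used here to avoid a clash with the Lagrange multiplier $\lambda$) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
offsets = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # hypothetical stencil, h = 1

# Synthetic training data (an assumption): samples of sin(k*x + phase) on the
# stencil, with noisy second derivatives at the center point as targets.
k = rng.uniform(0.2, 1.0, size=200)
phase = rng.uniform(0.0, 2.0 * np.pi, size=200)
X = np.sin(k[:, None] * offsets[None, :] + phase[:, None])
y = -(k**2) * np.sin(phase) + 0.05 * rng.standard_normal(200)

# Consistency constraints B w = d: applied to 1, x, x^2 the stencil returns 0, 0, 2.
B = np.vstack([offsets**0, offsets, offsets**2])
d = np.array([0.0, 0.0, 2.0])

def penalty_fit(mu):
    """Minimize ||X w - y||^2 + mu * ||B w - d||^2 via the normal equations."""
    lhs = X.T @ X + mu * (B.T @ B)
    rhs = X.T @ y + mu * (B.T @ d)
    return np.linalg.solve(lhs, rhs)

mus = [1.0, 1e2, 1e4]
violations = [np.linalg.norm(B @ penalty_fit(mu) - d) for mu in mus]
for mu, v in zip(mus, violations):
    print(f"mu = {mu:8.0e}   ||B w - d|| = {v:.3e}")
```

The constraint violation shrinks as `mu` grows but never reaches zero for finite `mu`, so the learned stencil is only approximately consistent, and very large `mu` makes the normal equations increasingly ill-conditioned.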

Method 2: Reduced Space Constrained Optimization

Substitute the residual constraint back into $\mathcal{L}_R$ to optimize directly over the space of FD solutions.

Challenge: For nonlinear/implicit schemes, training must backpropagate through a linear solve or a Newton/iterative nonlinear solve.

Method 3: Lagrange Multipliers (Recommended)

Advantage: Backpropagation does not need to pass through a forward and adjoint solve, which makes this approach the best fit for nonlinear/implicit schemes.

2.3 Mini-Batching

3. Polynomial Reproduction and Least Squares

3.1 Scattered Data and Fill Distance

Consider points $\bar{X} = \{x_1, \ldots, x_N\} \subseteq \Omega \subseteq \mathbb{R}^d$.

Interpretation: Radius of the biggest ball in $\Omega$ without any data points in it.
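As a rough numerical sketch, assuming the standard definition $h_{\bar{X},\Omega} = \sup_{x \in \Omega} \min_j \|x - x_j\|_2$, the fill distance can be approximated by replacing the supremum with a maximum over a dense set of candidate points. The function name `fill_distance` and the 1D example are illustrative assumptions.

```python
import numpy as np

def fill_distance(points, candidates):
    """Approximate sup_{x in Omega} min_j ||x - x_j||_2 over sampled candidates.

    points:     (N, d) array of data sites x_j
    candidates: (M, d) array of dense samples of Omega
    """
    diff = candidates[:, None, :] - points[None, :, :]   # (M, N, d)
    dists = np.linalg.norm(diff, axis=2)                 # (M, N)
    return dists.min(axis=1).max()   # distance to nearest site, worst case

# Omega = [0, 1] with three data sites; the largest gap midpoint is 0.25 away.
points = np.array([[0.0], [0.5], [1.0]])
candidates = np.linspace(0.0, 1.0, 1001)[:, None]
h = fill_distance(points, candidates)
print(h)
```

The sampled maximum can only underestimate the true supremum, so the candidate set should be much finer than the data spacing.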

3.2 Moving Least Squares (MLS)

where $\Pi_m(\mathbb{R}^d)$ denotes polynomials of degree $\leq m$ in $\mathbb{R}^d$.

Simple Example: $m = 0$ (Constant Approximation)

Solution:

Step 1: Take derivative with respect to $c(x)$ and set to zero:

$$0 = \sum_{i=1}^N (f(x_i) - c(x)) \Phi_\delta(x - x_i)$$

Step 2: Solve for $c(x)$:

$$c(x) = \frac{\sum_i \Phi_\delta(x - x_i) f(x_i)}{\sum_i \Phi_\delta(x - x_i)}$$

This recovers Shepard's method, which is also the Nadaraya-Watson kernel regression estimator: a normalized, kernel-weighted average of the data values.
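The formula for $c(x)$ can be sketched in a few lines of NumPy. The Gaussian choice for $\Phi_\delta$, the function name `shepard`, and the sample data are assumptions made for illustration; any positive weight function would do.

```python
import numpy as np

def shepard(x_eval, x_data, f_data, delta):
    """Evaluate c(x) = sum_i Phi(x - x_i) f(x_i) / sum_i Phi(x - x_i) in 1D."""
    # Assumed weight: Gaussian of width delta.
    w = np.exp(-((x_eval[:, None] - x_data[None, :]) / delta) ** 2)
    return (w @ f_data) / w.sum(axis=1)

x_data = np.linspace(0.0, 1.0, 11)
f_data = np.sin(2.0 * np.pi * x_data)
x_eval = np.linspace(0.0, 1.0, 50)

approx = shepard(x_eval, x_data, f_data, delta=0.1)
const = shepard(x_eval, x_data, np.full_like(x_data, 3.0), delta=0.1)
```

Because the weights are positive and normalized, constants are reproduced exactly and the approximation always stays between the smallest and largest data value.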

4. Primal and Dual Formulations of MLS

Remark: We can thus search for a minimal norm stencil that has reproduction properties.

4.1 Proof: Setup and Notation

4.2 Primal Problem

Solution:

Step 1: Expand the objective:

$$\mathcal{L} = \frac{1}{2} c^\top P W P^\top c - c^\top P W u + \text{(terms independent of } c\text{)}$$

Step 2: Recall calculus identities:

$$\frac{\partial}{\partial x} \frac{1}{2} x^\top A x = \frac{1}{2}(A + A^\top)x, \quad \frac{\partial}{\partial x} y^\top x = y$$

Step 3: Take gradient and set to zero:

$$\frac{\partial \mathcal{L}}{\partial c} = PWP^\top c - PWu = 0 \quad \Rightarrow \quad c = (PWP^\top)^{-1} PWu$$

Step 4: The MLS approximation is:

$$\begin{aligned} S_{f,\bar{X}}(x) &= c \cdot P(x) \\ &= P(x)^\top (PWP^\top)^{-1} PWu \end{aligned}$$
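The primal solve can be sketched as follows, assuming a 1D monomial basis so that $P_{kj} = x_j^k$, a diagonal $W$ with entries $\Phi_\delta(x - x_j)$, and a Gaussian weight. The function name `mls_primal` and the test data are hypothetical.

```python
import numpy as np

def mls_primal(x, x_data, u, m, delta):
    """Evaluate P(x)^T (P W P^T)^{-1} P W u at a single point x (1D)."""
    P = np.vstack([x_data**k for k in range(m + 1)])   # (m+1, N), rows are p_k(x_j)
    w = np.exp(-((x - x_data) / delta) ** 2)           # diagonal of W (assumed Gaussian)
    PW = P * w                                         # same as P @ diag(w)
    c = np.linalg.solve(PW @ P.T, PW @ u)              # normal equations for c
    p_x = np.array([x**k for k in range(m + 1)])       # P(x)
    return p_x @ c

rng = np.random.default_rng(0)
x_data = np.sort(rng.uniform(0.0, 1.0, size=20))
quadratic = lambda t: 1.0 + 2.0 * t - 3.0 * t**2
u = quadratic(x_data)

value = mls_primal(0.37, x_data, u, m=2, delta=0.3)
print(value, quadratic(0.37))
```

With $m = 2$ the weighted fit recovers a quadratic exactly, which is the polynomial reproduction property in its primal form.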

4.3 Dual Problem

Interpretation: Find the stencil weights $a(x)$ of minimal weighted norm that reproduce all polynomials up to degree $m$.

Applying the Schur complement formula:

The KKT system is:

$$\begin{pmatrix} W^{-1} & P^\top \\ P & 0 \end{pmatrix} \begin{pmatrix} a(x) \\ \lambda \end{pmatrix} = \begin{pmatrix} 0 \\ P(x) \end{pmatrix}$$

Step 1: Identify Schur complement:

$$S = P W P^\top$$

Step 2: Solve for $\lambda$:

$$\lambda = -S^{-1} P(x) = -(PWP^\top)^{-1} P(x)$$

Step 3: Solve for $a(x)$:

$$\begin{aligned} a(x) &= -W P^\top \lambda \\ &= W P^\top S^{-1} P(x) \end{aligned}$$

Step 4: The MLS approximation is:

$$\begin{aligned} S_{f,\bar{X}}(x) &= u^\top a(x) \\ &= u^\top W P^\top S^{-1} P(x) \\ &= P(x)^\top (PWP^\top)^{-1} PWu \end{aligned}$$

This matches the primal formulation! ✓
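The equivalence can also be checked numerically. The sketch below assumes a 1D monomial basis and a Gaussian weight (both illustrative choices), builds the stencil $a(x) = W P^\top S^{-1} P(x)$, and compares $u^\top a(x)$ with the primal value.

```python
import numpy as np

rng = np.random.default_rng(0)
x_data = np.sort(rng.uniform(0.0, 1.0, size=20))
u = np.sin(3.0 * x_data)                 # arbitrary (non-polynomial) data
x, m, delta = 0.37, 2, 0.3

P = np.vstack([x_data**k for k in range(m + 1)])   # (m+1, N)
p_x = np.array([x**k for k in range(m + 1)])       # P(x)
w = np.exp(-((x - x_data) / delta) ** 2)           # diagonal of W

S = (P * w) @ P.T                                  # Schur complement P W P^T

# Dual: stencil weights, then apply them to the data.
a = w * (P.T @ np.linalg.solve(S, p_x))            # a = W P^T S^{-1} P(x)
dual_value = u @ a

# Primal: fit coefficients, then evaluate the polynomial at x.
c = np.linalg.solve(S, (P * w) @ u)
primal_value = p_x @ c

print(dual_value, primal_value)
print(np.allclose(P @ a, p_x))   # reproduction constraint P a = P(x)
```

The stencil `a` depends only on the point locations and the weight function, not on the data `u`, so it can be computed once and reused for any function sampled at the same points.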

5. Extension to Differential Operators
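A minimal sketch of one common extension, assuming the reproduction constraint $Pa = P(x)$ is replaced by $Pa = (\mathcal{D}P)(x)$ for a linear differential operator $\mathcal{D}$, so that the stencil becomes $a(x) = W P^\top S^{-1} (\mathcal{D}P)(x)$. The example takes $\mathcal{D} = d^2/dx^2$ in 1D with a basis centered at $x$; the function name, Gaussian weight, and grid are illustrative assumptions.

```python
import numpy as np

def second_derivative_stencil(x, x_data, m, delta):
    """a = W P^T (P W P^T)^{-1} b, with b_k = d^2/dy^2 (y - x)^k at y = x (needs m >= 2)."""
    P = np.vstack([(x_data - x) ** k for k in range(m + 1)])   # centered monomials
    w = np.exp(-((x - x_data) / delta) ** 2)                   # assumed Gaussian weight
    S = (P * w) @ P.T
    b = np.zeros(m + 1)
    b[2] = 2.0            # only (y - x)^2 has a nonzero second derivative at y = x
    return w * (P.T @ np.linalg.solve(S, b))

h = 0.1
# Three points and m = 2: the constraints determine the stencil uniquely.
a3 = second_derivative_stencil(0.0, np.array([-h, 0.0, h]), m=2, delta=2 * h)

# Five points and m = 2: the weighted norm selects one of many admissible stencils.
x5 = h * np.arange(-2, 3)
a5 = second_derivative_stencil(0.0, x5, m=2, delta=2 * h)
print(a3 * h**2)
print(a5 * h**2)
```

On three uniformly spaced points the result is the classical $[1, -2, 1]/h^2$ stencil; with more points than constraints the minimization picks a particular stencil from the reproduction set.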

Question: Is the polynomial reproduction set non-empty? (i.e., do solutions always exist?)

6. Existence Theorem for Polynomial Reproduction

Theorem (Wendland, "Scattered Data Approximation," Theorem 4.7):

Suppose $\Omega \subseteq \mathbb{R}^d$ is compact and satisfies an interior cone condition with angle $\theta \in (0, \pi/2)$ and radius $r$. Then there exist stencil weights $a_j^*(x)$ for any $x \in \Omega$ such that:

Polynomial reproduction: $$ \sum_j a_j^*(x) p(x_j) = p(x) \quad \forall p \in \Pi_m(\mathbb{R}^d) $$
Bounded stability: $$ \sum_j |a_j^*(x)| \leq \tilde{c}_1 $$
Compact support (locality): $$ a_j^*(x) = 0 \quad \text{if} \quad \|x - x_j\|_2 > \tilde{c}_2 h_{\bar{X}, \Omega} $$

where $\tilde{c}_1$ and $\tilde{c}_2$ are independent of $h_{\bar{X}, \Omega}$ and can be explicitly derived.
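The three properties can be observed numerically on a stand-in example. The sketch below does not construct the theorem's weights $a_j^*$; it assumes an MLS stencil with a compactly supported weight (Wendland's $C^2$ function $(1-r)_+^4(4r+1)$) in 1D, and the support radius `delta` plays the role of $\tilde{c}_2 h_{\bar{X},\Omega}$. All names and parameter values are illustrative.

```python
import numpy as np

def wendland_c2(r):
    """Compactly supported weight: (1 - r)^4 (4 r + 1) for r < 1, else 0."""
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

def local_stencil(x, x_data, m, delta):
    """MLS stencil a = W P^T (P W P^T)^{-1} P(x) with a centered, scaled basis."""
    w = wendland_c2(np.abs(x - x_data) / delta)
    P = np.vstack([((x_data - x) / delta) ** k for k in range(m + 1)])
    S = (P * w) @ P.T
    e0 = np.zeros(m + 1)
    e0[0] = 1.0                     # P(x) in the centered basis is (1, 0, ..., 0)
    return w * (P.T @ np.linalg.solve(S, e0))

x_data = np.linspace(0.0, 1.0, 41)
x, m, delta = 0.43, 2, 0.15
a = local_stencil(x, x_data, m, delta)

# Property 1: polynomial reproduction for 1, x, x^2.
reproduction_error = max(abs(a @ x_data**k - x**k) for k in range(m + 1))
# Property 2: the stability quantity sum_j |a_j(x)|.
lebesgue_sum = np.abs(a).sum()
# Property 3: weights vanish outside the support radius.
outside = np.abs(x - x_data) > delta
print(reproduction_error, lebesgue_sum, np.count_nonzero(a[outside]))
```

Since constants are reproduced, $\sum_j a_j(x) = 1$ and the stability sum is always at least 1; the theorem's content is that it also stays bounded above as the points are refined.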

Summary

This lecture established four key ideas for learning constrained numerical methods:

(1) Constrained optimization via Lagrange multipliers and the Schur complement: a systematic approach to enforcing equality constraints (conservation laws, consistency conditions) in learned stencils.
(2) Three training strategies for data-driven stencil learning:
- Penalty methods (avoid due to the difficulty of choosing the penalty weight $\lambda$)
- Reduced-space optimization (efficient but requires differentiating through solvers)
- Lagrange multipliers (most flexible, decouples optimization from the PDE solve)
(3) Polynomial reproduction framework: connects moving least squares, primal-dual optimization, and stencil design.
(4) Wendland's theorem: ensures that polynomial reproduction stencils exist, are stable, and have compact support under mild geometric conditions.

Key Takeaway: Constrained optimization provides the mathematical machinery to design physics-informed learning algorithms that respect fundamental properties like conservation laws and polynomial consistency. The primal-dual formulation reveals that seeking minimal-norm stencils with polynomial reproduction is equivalent to moving least squares approximation, unifying classical numerical analysis with modern data-driven methods.

Next steps: Apply these principles to construct energy-conserving time integrators using Hamiltonian mechanics and symplectic structure (Lecture 7).