Reference: Gustafsson et al., Chapters 1-2, Mirzaee et al., "On Generalized Moving Least Squares and Diffuse Derivatives"
Topics: Artificial Viscosity, Lax-Friedrichs Method, Lax-Wendroff Method, Implicit Schemes, Crank-Nicolson Method, Polynomial Reproduction, Accuracy Theorems for Finite Difference Stencils
In Lecture 4, we proved the Lax Equivalence Theorem: for well-posed problems with finite initial conditions, consistency + stability ⇒ convergence. This landmark result tells us that if a finite difference scheme approximates the PDE correctly (consistency) and doesn't amplify errors uncontrollably (stability), then the numerical solution will converge to the true PDE solution as the grid is refined. However, we also discovered a troubling fact: the simple explicit Euler method with centered differences for the transport equation \(\partial_t u = \partial_x u\) is unconditionally unstable, meaning \(|\hat{Q}| > 1\) for all finite ratios \(\lambda = k/h\) (where \(k\) is the time step and \(h\) is the spatial grid spacing). This scheme is consistent but unstable, and therefore it cannot converge—the amplification factor grows without bound, rendering the method useless in practice.
This raises a critical question: How do we stabilize finite difference schemes? In this lecture, we explore two fundamental approaches to achieving stability:
Understanding these stabilization strategies is essential not just for classical numerical analysis, but also for modern scientific machine learning. When we later design learned discretizations from data, we'll need to incorporate stabilization mechanisms—either through architectural choices that mimic implicit schemes or through learned regularization that acts like artificial viscosity. The goal is to reverse-engineer models that are not only accurate but also provably stable under iteration.
After establishing how to stabilize schemes, we turn to a more general question: How do we guarantee accuracy for arbitrary finite difference stencils? The second half of this lecture develops a powerful theorem based on polynomial reproduction: if a stencil exactly reproduces derivatives of polynomials up to degree \(m\), and if the stencil weights are appropriately bounded, then the stencil approximates the derivative of smooth functions with error \(O(h^{m+1-|\alpha|})\), where \(|\alpha|\) is the order of the derivative. This result, which connects to generalized moving least squares and meshfree methods, provides the theoretical foundation for designing high-order accurate discretizations.
Let's briefly recall the key results from Lecture 4 that motivate today's discussion.
Formally, if:
Then:
$$\lim_{k,h \to 0} \sup_{0 \leq t_n \leq T} \|U(x, t_n) - \text{Int}_N(u(x, t_n))\| = 0$$where \(\text{Int}_N\) is the trigonometric interpolation operator projecting the continuous solution onto the grid.
Key insight: For finite initial conditions, stability plus consistency implies convergence.
Consider the transport equation \(\partial_t u = \partial_x u\) with the explicit Euler + centered difference scheme:
$$\frac{v_j^{n+1} - v_j^n}{k} = D_0 v_j^n = \frac{v_{j+1}^n - v_{j-1}^n}{2h}$$Fourier mode analysis: For a single mode \(e^{i\omega x}\), the amplification factor is:
$$\hat{Q} = 1 + i\lambda \sin(\omega h)$$where \(\lambda = k/h\) is the CFL number (Courant-Friedrichs-Lewy ratio).
Stability analysis:
$$|\hat{Q}| = |1 + i\lambda \sin(\omega h)| = \sqrt{1 + \lambda^2 \sin^2(\omega h)} \geq 1$$For any finite \(\lambda\), we have \(|\hat{Q}| > 1\) for most modes, leading to unbounded growth. This scheme is:
This brings us to the central question: Can we modify the scheme to achieve stability while preserving consistency?
One approach to stabilize explicit schemes is to add a diffusion term that damps high-frequency oscillations. This technique, called artificial viscosity, deliberately introduces non-physical dissipation to control numerical instability.
Consider modifying the transport equation by adding a diffusive term:
$$\partial_t u = \partial_x u + \sigma h \partial_{xx} u$$where \(\sigma\) is a stabilization parameter and \(h\) is the grid spacing.
Key observations:
The discrete scheme becomes:
$$\frac{v_j^{n+1} - v_j^n}{k} = D_0 v_j^n + \sigma h D_+ D_- v_j^n$$where:
Step 1: Compute the amplification factor.
Recall the action of finite difference operators on Fourier modes:
$$\begin{aligned} h D_0 e^{i\omega x} &= i \sin(\omega h) e^{i\omega x} \\ h^2 D_+ D_- e^{i\omega x} &= -4 \sin^2\left(\frac{\omega h}{2}\right) e^{i\omega x} \end{aligned}$$Let \(\lambda = k/h\) (CFL number) and \(\xi = \omega h\) (scaled wavenumber). The amplification factor is:
$$\hat{Q} = 1 + i\lambda \sin\xi - 4\sigma\lambda \sin^2\left(\frac{\xi}{2}\right)$$Step 2: Compute \(|\hat{Q}|^2\).
Using \(|a + ib|^2 = a^2 + b^2\):
$$|\hat{Q}|^2 = \left(1 - 4\sigma\lambda \sin^2\frac{\xi}{2}\right)^2 + \lambda^2 \sin^2\xi$$Step 3: Expand and simplify using \(\sin^2\xi = 4\sin^2(\xi/2)\cos^2(\xi/2)\):
$$|\hat{Q}|^2 = 1 - \left(8\sigma\lambda - 4\lambda^2\right) \sin^2\frac{\xi}{2} + 4\lambda^2\left(4\sigma^2 - 1\right) \sin^4\frac{\xi}{2}$$For stability, we require \(|\hat{Q}|^2 \leq 1\) for all \(\xi\). This can be achieved in two ways.
Require both coefficients to be non-negative:
$$\begin{aligned} 8\sigma\lambda - 4\lambda^2 &\geq 0 \quad \Rightarrow \quad \sigma \geq \frac{\lambda}{2} \\ 4\lambda^2(4\sigma^2 - 1) &\leq 0 \quad \Rightarrow \quad \sigma \leq \frac{1}{2} \end{aligned}$$Conclusion: Choose \(\frac{\lambda}{2} \leq \sigma \leq \frac{1}{2}\) with \(0 < \lambda \leq 1\).
This gives the Lax-Friedrichs method at \(\sigma = \frac{1}{2\lambda}\).
Consider the worst-case scenario where \(\sin^2(\xi/2) = 1\) (i.e., \(\xi = \pi\), the Nyquist frequency). This analysis leads to:
$$\sigma \leq \frac{1}{2\lambda}$$Conclusion: For \(\sigma \geq \frac{1}{2}\), stability requires \(\lambda \leq \frac{1}{2\sigma}\).
This gives the Lax-Wendroff method at \(\sigma = \frac{\lambda}{2}\) (satisfying \(2\sigma\lambda = \lambda^2 \leq 1\) when \(\lambda \leq 1\)).
From the stability analysis above, we obtain two classical stable schemes for the transport equation.
Choice: \(\sigma = \frac{h}{2k} = \frac{1}{2\lambda}\)
Scheme:
$$v_j^{n+1} = \frac{1}{2}\left(v_{j+1}^n + v_{j-1}^n\right) - \frac{\lambda}{2}\left(v_{j+1}^n - v_{j-1}^n\right)$$where \(\lambda = k/h\).
Interpretation:
Choice: \(\sigma = \frac{k}{2h} = \frac{\lambda}{2}\)
Scheme:
$$v_j^{n+1} = v_j^n + \frac{k}{2h}(v_{j+1}^n - v_{j-1}^n) + \frac{k^2}{2h^3}(v_{j+1}^n - 2v_j^n + v_{j-1}^n)$$Interpretation:
Derivation motivation (Taylor series):
Starting from \(u_t = u_x\), we have \(u_{tt} = u_{xt} = u_{xx}\). A Taylor expansion in time gives:
$$u(t + k) = u(t) + k u_t + \frac{k^2}{2} u_{tt} = u(t) + k u_x + \frac{k^2}{2} u_{xx}$$The Lax-Wendroff method discretizes this expression exactly!
Key Takeaways:
Important note: These methods add non-physical terms to achieve stability. While practical, this approach has limitations—excessive dissipation can corrupt solutions. An alternative is to use implicit schemes, which achieve stability through algebraic structure rather than added viscosity.
Instead of adding artificial viscosity, we can achieve stability by evaluating spatial derivatives at the future time level \(n+1\) rather than the current level \(n\). This leads to implicit schemes, which require solving a linear system at each time step.
Scheme:
$$\frac{v_j^{n+1} - v_j^n}{k} = D_0 v_j^{n+1}$$Rearranging in operator form:
$$(I - k D_0) v^{n+1} = v^n$$Fourier analysis:
For a mode \(e^{i\omega x}\):
$$\hat{Q} = [1 - i\lambda \sin\xi]^{-1}$$Stability:
$$|\hat{Q}| = \frac{1}{|1 - i\lambda \sin\xi|} = \frac{1}{\sqrt{1 + \lambda^2 \sin^2\xi}} \leq 1$$for all \(\lambda, \xi\).
Conclusion: The implicit Euler scheme is unconditionally stable for any CFL number \(\lambda = k/h\).
Tradeoff:
A particularly elegant implicit scheme is the Crank-Nicolson method, which uses a trapezoidal rule in time by averaging the spatial derivative at times \(n\) and \(n+1\).
Scheme:
$$\frac{v_j^{n+1} - v_j^n}{k} = \frac{1}{2} D_0 v_j^{n+1} + \frac{1}{2} D_0 v_j^n$$Rearranging:
$$\left(I - \frac{k}{2} D_0\right) v_j^{n+1} = \left(I + \frac{k}{2} D_0\right) v_j^n$$Fourier analysis:
$$\hat{Q} = \frac{1 + \frac{i\lambda}{2}\sin\xi}{1 - \frac{i\lambda}{2}\sin\xi}$$Stability:
$$|\hat{Q}| = \left|\frac{1 + \frac{i\lambda}{2}\sin\xi}{1 - \frac{i\lambda}{2}\sin\xi}\right| = \frac{\sqrt{1 + (\lambda\sin\xi/2)^2}}{\sqrt{1 + (\lambda\sin\xi/2)^2}} = 1$$Remarkable property: \(|\hat{Q}| = 1\) exactly for all modes!
Advantages:
Disadvantages:
| Method | Type | Stability | Accuracy | Cost per step |
|---|---|---|---|---|
| Explicit Euler + Centered | Explicit | Unstable | 1st order time | Low (explicit) |
| Lax-Friedrichs | Explicit | \(\lambda \leq 1\) | 1st order time | Low (explicit) |
| Lax-Wendroff | Explicit | \(\lambda \leq 1\) | 2nd order time | Low (explicit) |
| Implicit Euler | Implicit | Unconditional | 1st order time | High (linear solve) |
| Crank-Nicolson | Implicit | Unconditional | 2nd order time | High (linear solve) |
Practical considerations:
Now that we understand how to stabilize finite difference schemes, we turn to a more general question: How do we guarantee the accuracy of arbitrary finite difference stencils? The answer lies in a beautiful theorem connecting polynomial reproduction to convergence rates.
This theory is particularly important for modern methods like meshfree methods, generalized finite differences, and learned discretizations, where stencils may be irregular or adapted to data. The key reference is the paper by Mirzaee et al., "On Generalized Moving Least Squares and Diffuse Derivatives".
Consider a generic finite difference stencil approximating a differential operator \(D^\alpha u\), where \(\alpha\) is a multi-index denoting the order of derivatives.
Multi-index notation:
Examples:
Generic stencil: A discrete approximation \(D_h^\alpha\) at grid point \(x_i\) is given by:
$$D_h^\alpha u_i = \sum_j S_{ji}^\alpha u_j$$where \(S_{ji}^\alpha\) are stencil weights and the sum is over neighboring grid points \(x_j\).
Key questions:
The following theorem provides a complete answer, establishing that polynomial reproduction is both necessary and sufficient for optimal accuracy.
Suppose the stencil \(D_h^\alpha\) satisfies:
1. Polynomial reproduction of order \(m\):
$$\sum_j S_{ji}^\alpha p_j = D^\alpha p(x_i) \quad \text{for all polynomials } p \in \mathcal{P}_m$$where \(\mathcal{P}_m\) is the space of polynomials of degree at most \(m\).
2. Bounded stencil weights:
$$\sum_j |S_{ji}^\alpha| \leq C h^{-|\alpha|}$$where \(C\) is independent of \(h\).
Then for smooth functions \(u \in C^{m+1}\):
$$\|D_h^\alpha u - D^\alpha u\|_{L^\infty} \leq C h^{m+1-|\alpha|} |u|_{C^{m+1}}$$where \(|u|_{C^{m+1}} = \max_{|\beta| = m+1} \|\partial^\beta u\|_{L^\infty}\) is the \((m+1)\)-th derivative seminorm.
Interpretation:
The proof is a beautiful application of polynomial approximation theory, combining the triangle inequality with Taylor series estimates.
Step 1: Use the triangle inequality to separate exact and approximate derivatives.
Let \(p \in \mathcal{P}_m\) be an arbitrary polynomial of degree at most \(m\). For any function \(u\):
$$|D_h^\alpha u - D^\alpha u| \leq |D^\alpha u - D^\alpha p| + |D^\alpha p - D_h^\alpha p| + |D_h^\alpha p - D_h^\alpha u|$$Step 2: Use polynomial reproduction to eliminate the middle term.
By assumption (1), the stencil exactly reproduces derivatives of \(p\):
$$D_h^\alpha p(x_i) = \sum_j S_{ji}^\alpha p_j = D^\alpha p(x_i)$$Therefore:
$$|D_h^\alpha u - D^\alpha u| \leq |D^\alpha u - D^\alpha p| + |D_h^\alpha p - D_h^\alpha u|$$Step 3: Apply the stencil to the difference \(u - p\).
$$|D_h^\alpha u - D_h^\alpha p| = \left|\sum_j S_{ji}^\alpha (u_j - p_j)\right| \leq \sum_j |S_{ji}^\alpha| |u_j - p_j|$$Taking the \(L^\infty\) norm:
$$\|D_h^\alpha u - D_h^\alpha p\|_{L^\infty} \leq \|u - p\|_{L^\infty} \sum_j |S_{ji}^\alpha|$$Step 4: Use the weight bound assumption (2).
$$\|D_h^\alpha u - D_h^\alpha p\|_{L^\infty} \leq C h^{-|\alpha|} \|u - p\|_{L^\infty}$$Step 5: Combine terms.
$$\|D_h^\alpha u - D^\alpha u\|_{L^\infty} \leq \|D^\alpha u - D^\alpha p\|_{L^\infty} + C h^{-|\alpha|} \|u - p\|_{L^\infty}$$Step 6: Choose \(p\) as the Taylor polynomial of \(u\) at \(x_i\).
By Taylor's theorem, the optimal polynomial approximation of degree \(m\) gives:
$$\begin{aligned} \|u - p\|_{L^\infty} &\leq C_1 h^{m+1} |u|_{C^{m+1}} \\ \|D^\alpha u - D^\alpha p\|_{L^\infty} &\leq C_2 h^{m+1-|\alpha|} |u|_{C^{m+1}} \end{aligned}$$Step 7: Substitute into the bound.
$$\|D_h^\alpha u - D^\alpha u\|_{L^\infty} \leq (C_2 + C C_1) h^{m+1-|\alpha|} |u|_{C^{m+1}}$$Conclusion:
$$\boxed{\|D_h^\alpha u - D^\alpha u\|_{L^\infty} \leq C h^{m+1-|\alpha|} |u|_{C^{m+1}}}$$Critical observations:
This lecture developed two fundamental tools for analyzing and designing finite difference schemes:
1. Numerical stabilization techniques:
2. Polynomial reproduction theorem:
Main Topics Covered:
Key Takeaway: Stability can be achieved either by adding dissipation (artificial viscosity) or by implicit formulations (solving linear systems). For learning problems, we must design architectures that either incorporate stabilization mechanisms or learn them from data while preserving convergence guarantees. The polynomial reproduction theorem provides a powerful tool for ensuring accuracy of both conventional and learned discretizations on arbitrary grids.
Looking ahead: In the next lectures, we'll explore how to apply these stability and accuracy principles to design physics-informed neural networks and learned discretizations that provably converge to PDE solutions.