Lecture 12: Introduction to Finite Element Methods

Topics Covered: Transition from Finite Difference Methods (FDM) to Finite Element Methods (FEM), weak formulation and Galerkin methods, function spaces: $L^2$, $H^1$, $H_0^1$, basis functions and shape functions, assembly of stiffness matrices, Gaussian quadrature, connection to energy minimization, Galerkin orthogonality and optimal error estimates, handling Neumann boundary conditions

0. Overview

This lecture marks a pivotal transition in our course: we move from finite difference methods (FDM) to finite element methods (FEM). While FDM has served us well for periodic problems on regular grids, its limitations become apparent when dealing with complex geometries, non-periodic boundary conditions, and mesh refinement. FEM provides the flexibility needed for real-world engineering problems while maintaining the mathematical rigor we've developed.

Before making this transition, we pause to reflect on what FDM has given us: proofs of stability and conservation properties, convergence guarantees via the Lax equivalence theorem and polynomial reproduction, stable integration of Hamiltonian systems, and a framework for physics learning as nonlinear perturbations of well-understood problems. In engineering, the verification and validation (V&V) paradigm is fundamental: verification ensures code correctly implements the mathematical theory (math ⇒ good code), while validation ensures the code predicts experimental data (good code ⇔ good model).

The Galerkin approach at the heart of FEM takes a fundamentally different perspective. Instead of requiring solutions to be twice differentiable (as FDM does), we formulate problems in a weak sense that only requires first derivatives. This isn't a limitation—"weak" doesn't mean "bad"—but rather enables a broader class of solutions and more flexible discretizations. We'll see that FEM naturally connects to energy minimization (the Ritz method) and provides optimal error estimates through the beautiful property of Galerkin orthogonality.

1. Reflection on Finite Difference Methods

1.1 What We Can Do with FDM for ML

1.2 Verification and Validation in Engineering

In engineering, "V&V" (Verification & Validation) is the bedrock of trust in simulations:

1.3 What We Cannot Do with FDM

2. Historical Context

2.1 FEM History: Perspective Beyond the ML Hype

2.2 ML History for Comparison

Note: FEM has 60+ years of mathematical rigor and engineering validation—don't fall for the hype that ML invented scientific computing!

3. The Poisson Problem and Weak Formulation

3.1 Strong Form (Classical Formulation)

Problem with finite differences: Assumes too much regularity in the solution:

Note: "Weak" ≠ "bad" or "not strong"—actually enables a more general class of solutions.

3.2 The "Galerkin Cookbook" (Error Measurement Approach)

Step 1: Take an arbitrary test function $v$ with $v(0) = v(1) = 0$

Step 2: Multiply $(P)$ by $v$ and integrate:

$$-\int_0^1 u'' v \, dx = \int_0^1 f v \, dx$$

Step 3: Integrate by parts to get the least restrictive derivatives on $u$ and $v$ possible:

$$\int_0^1 u' v' \, dx - \underbrace{(u'(1) v(1) - u'(0) v(0))}_{= 0 \text{ by Dirichlet BC}} = \int_0^1 f v \, dx$$

3.3 Weak Formulation

(G) Find $u \in V$ such that for any $v \in V$:

$$\boxed{\int_0^1 u' v' \, dx = \int_0^1 f v \, dx}$$

Remarks:

There is a symmetry similar to what we've seen with $D_+, D_-$ in FDM
How do we choose the function space $V$?

4. Function Spaces for FEM

4.1 Choosing the Right Function Space

4.2 Useful Function Spaces

$L^2$ space (square-integrable functions):

$$L^2([0,1]) = \left\{ f : \int_0^1 f^2 \, dx < \infty \right\}$$

$H^1$ space (Sobolev space, first derivative in $L^2$):

$$H^1([0,1]) = \left\{ f \in L^2 : f' \in L^2 \right\}$$

$H_0^1$ space (functions in $H^1$ with zero boundary values):

$$H_0^1([0,1]) = \left\{ f \in H^1 : f(0) = f(1) = 0 \right\}$$

4.3 Well-Posedness Analysis

Boundary terms: $H_0^1 \Rightarrow$ Don't worry about boundary terms during integration by parts.

Important Note: This choice is specific to Dirichlet BCs. Different PDEs require different function spaces.

5. Making It "Finite": The Galerkin Approximation

5.1 Finite-Dimensional Subspace

Choose $V_h \subseteq V$ with $\dim(V_h) = N$ (i.e., pick a basis with $N$ shape functions).

5.2 Many Choices for Basis Functions

5.3 Piecewise Linear Basis Functions

Type	Example	Method Name
Discontinuous	Piecewise constant	Finite Volume
Continuous polynomial	Piecewise linear/quadratic	(Bubnov-)Galerkin
Trigonometric / Orthogonal	Fourier, Chebyshev, Legendre	Spectral Element
Non-polynomial	Gaussians, RBFs	Meshfree

Define "hat functions":

$$\phi_i(x) = \begin{cases} \frac{x - x_{i-1}}{x_i - x_{i-1}} & x \in e_i \\ 1 - \frac{x - x_i}{x_{i+1} - x_i} & x \in e_{i+1} \\ 0 & \text{otherwise} \end{cases}$$

Key property: $\phi_i(x_j) = \delta_{ij}$ (Kronecker delta)

This means the coefficients $\hat{u}_i$ are simply the nodal values: $\hat{u}_i = u(x_i)$.

6. Assembly of the Linear System

6.1 Discrete Weak Form

Equivalently: Find $\hat{u} \in \mathbb{R}^{N_{el}}$ such that $\forall \hat{v} \in \mathbb{R}^{N_{el}}$:

7. Uniqueness and Properties of the Stiffness Matrix

7.1 Theorem: FEM Discretization of (G) is Unique

Proof: Suppose $u_1, u_2$ both solve $(G)$:

$$\begin{aligned} \int_0^1 u_1' v' \, dx &= \int_0^1 f v \, dx \\ \int_0^1 u_2' v' \, dx &= \int_0^1 f v \, dx \end{aligned}$$

Subtract:

$$\int_0^1 (u_1 - u_2)' v' \, dx = 0 \quad \text{for any } v \in V_h$$

Pick $v = u_1 - u_2$:

$$\int_0^1 [(u_1 - u_2)']^2 \, dx = 0$$ $$\Rightarrow (u_1 - u_2)' = 0 \quad \text{pointwise}$$ $$\Rightarrow u_1 - u_2 = \text{constant}$$

Since $V_h \subseteq H_0^1$, we have $u_1 - u_2 = 0$ on the boundary.

$$\Rightarrow u_1 - u_2 = 0 \text{ everywhere} \quad \checkmark$$

7.2 Theorem: $S$ is Symmetric Positive Definite

Proof:

$$\begin{aligned} \hat{y}^\top S \hat{y} &= \sum_{i,j} \hat{y}_i \int_0^1 \phi_i' \phi_j' \, dx \, \hat{y}_j \\ &= \int_0^1 \left( \sum_i \hat{y}_i \phi_i' \right)^2 dx \\ &= \int_0^1 (y')^2 \, dx \geq 0 \end{aligned}$$

with equality only if $y = 0$.

Therefore: $S$ is SPD (symmetric positive definite), which guarantees unique solutions. $\quad \checkmark$

8. Sparse Matrix Structure

8.1 Sparsity Pattern

is only non-zero if $\operatorname{supp}(\phi_i) \cap \operatorname{supp}(\phi_j) \neq \emptyset$

Since $\phi_i$ has support only on $[x_{i-1}, x_{i+1}]$, most entries of $S$ are zero!

9. Gaussian Quadrature

9.1 Quadrature Rules

Definition: A quadrature rule is a set of $N_q$ points and weights:

$w_i$ = quadrature weights
$x_i$ = quadrature points

such that:

$$\int_{-1}^1 f(x) \, dx = \sum_{i=1}^{N_q} w_i f(x_i)$$

that is exact for some class of functions.

9.2 Gauss-Legendre Quadrature

9.3 Change of Variables

$N_q$	$x_i$	$w_i$
1	$0$	$2$
2	$\pm 1/\sqrt{3}$	$1, 1$
3	$0, \pm\sqrt{3/5}$	$8/9, 5/9, 5/9$

$u$-substitution:

$$u = 2\left(\frac{x - a}{b - a}\right) - 1$$ $$\begin{aligned} \int_a^b f(x) \, dx &= \int_{-1}^1 f\left(\frac{b-a}{2}u + \frac{a+b}{2}\right) \frac{b-a}{2} \, du \\ &= \frac{b-a}{2} \sum_{i=1}^{N_q} w_i f\left(\frac{b-a}{2}x_i + \frac{a+b}{2}\right) \end{aligned}$$

10. Connection to Energy Minimization

10.1 The Ritz Method

10.2 Equivalence to Galerkin Method

To solve, use the first variation:

Step 1: Compute variation:

$$(\delta_v F(v), \delta v) = (v', \delta v') - (f, \delta v)$$

Step 2: Set to zero for any $\delta v$:

$$\boxed{(v', \delta v') = (f, \delta v) \quad \text{for any } \delta v}$$

This is exactly the Galerkin formulation (G)!

Remark: Not all PDEs can be written as an energy minimization. But when they can, Galerkin = Ritz.

11. Galerkin Orthogonality and Optimal Error

11.1 Galerkin Orthogonality

11.2 Theorem: Optimal Error in $H^1$ Norm

11.3 Proof

Let $v \in V_h$ and $w = u_h - v \in V_h$.

Step 1: Expand norm:

$$\|(u - u_h)'\|^2 = ((u - u_h)', (u - u_h)')$$

Step 2: Add and subtract $(u - v)'$:

$$= ((u - u_h)', (u - v)') + \underbrace{((u - u_h)', w')}_{= 0 \text{ by Galerkin orthogonality}}$$

Step 3: Simplify:

$$= ((u - u_h)', (u - v)')$$

Step 4: Apply Cauchy-Schwarz:

$$\leq \|(u - u_h)'\| \cdot \|(u - v)'\|$$

Step 5: Divide both sides by $\|(u - u_h)'\|$:

$$\|(u - u_h)'\| \leq \|(u - v)'\| \quad \checkmark$$

This holds for ANY $v \in V_h$, so the FEM solution minimizes the error!

12. Handling Neumann Boundary Conditions

12.1 Problem Statement

12.2 Weak Form with Neumann BC

If $u'(1) = g$ (Neumann condition), substitute it in:

$$\boxed{\int_0^1 u' v' \, dx = \int_0^1 f v \, dx + g \cdot v(1)}$$

Key observation: Neumann BCs appear naturally in the weak formulation as boundary terms. We don't need to enforce them explicitly—they emerge from integration by parts!

Modified linear system:

$$S \hat{u} = b + g \cdot e_N$$

where $e_N$ is the vector with 1 in the last position (corresponding to $x = 1$) and 0 elsewhere.

Summary

This lecture covered:

Transition from FDM to FEM: motivation and historical context
Weak (Galerkin) formulation: reducing regularity requirements
Function spaces $L^2$, $H^1$, $H_0^1$ for well-posed weak problems
Finite-dimensional approximation with basis functions (shape functions)
Stiffness matrix $S$ and load vector $b$ assembly
Uniqueness and SPD property of the stiffness matrix
Sparsity structure for efficient computation
Gaussian quadrature for numerical integration
Equivalence between Galerkin method and energy minimization (Ritz method)
Galerkin orthogonality and optimal error estimates
Natural handling of Neumann boundary conditions

Key Takeaway: The finite element method provides a principled, flexible framework for solving PDEs on complex domains with various boundary conditions. By working in weak form and choosing appropriate function spaces, we obtain optimal approximations with provable error bounds. The Galerkin orthogonality property ensures that our discrete solution is the best possible projection of the true solution onto our finite-dimensional space. Moreover, the natural emergence of Neumann boundary conditions from the weak formulation, combined with the connection to energy minimization, reveals the deep geometric structure underlying FEM—a structure we'll exploit when building learnable PDE solvers.