1. From ODEs to PDEs
An ordinary differential equation is a rule about a function of one independent variable — usually time. Newton’s second law, written as $m\,\ddot{x}(t) = F(x, \dot{x}, t)$, sets up a story in which the universe has a single dial called time and the position $x$ obeys a rule that says what it does next.
A partial differential equation governs a function of several independent variables. The temperature $u(x, t)$ in a metal rod depends on both where you are on the rod and what time it is. The displacement $u(x, t)$ of a vibrating guitar string does too. The electric potential $u(x, y, z)$ in a room depends on three spatial coordinates and, in steady state, not on time at all. The rule connecting these quantities to their rates of change isn’t a single ODE; it’s a relationship among partial derivatives.
A partial differential equation (PDE) is an equation involving an unknown function $u$ of two or more independent variables together with one or more of its partial derivatives. The goal is to find $u$ as a function: not a single number, but a whole landscape over the variables on which it depends.
Three things make the PDE world structurally different from the ODE world:
- Solution spaces are bigger. An ODE often has solutions parameterized by a few constants. A PDE typically has solutions parameterized by entire functions — the initial temperature profile, the shape of a plucked string, the charge density in space.
- Boundaries become decisive. What happens on the edge of the domain is part of the problem. Without boundary data, a PDE has too many solutions to be useful.
- Closed-form solutions are rare. Outside a small zoo of model equations on simple geometries, you reach for series, integral representations, or numerics.
An ODE asks: given how this quantity is changing in time, what is the trajectory? A PDE asks: given how this quantity is changing in every direction at every point, what is the field?
2. Anatomy: order and linearity
Before classifying anything fancier, you read off two structural features of a PDE.
Order
The order of a PDE is the highest order of partial derivative that appears. The transport equation $u_t + c\, u_x = 0$ is first-order. The heat equation $u_t = \alpha u_{xx}$ is second-order, since $u_{xx}$ is a second derivative. The biharmonic equation $\Delta^2 u = 0$ is fourth-order.
Second-order is the workhorse: nearly every classical PDE you meet in introductory courses lives here, and the classification scheme of the next section is specifically a story about second-order linear PDEs.
Linearity
A PDE is linear if it can be written as $L[u] = f$ where the operator $L$ treats $u$ and its derivatives as a linear combination — no products of $u$ with itself, no $\sin(u)$, no $(u_x)^2$, no $u\, u_x$. It is homogeneous if $f = 0$, otherwise inhomogeneous.
Linearity is the property that buys you superposition: if $u_1$ and $u_2$ both solve the homogeneous equation, so does any combination $c_1 u_1 + c_2 u_2$. This is the entire reason series solutions — like the Fourier series we’ll arrive at — even make sense. Lose linearity and superposition collapses; you can no longer build solutions out of building blocks.
Linear:
- $u_t = \alpha u_{xx}$ (heat)
- $u_{tt} = c^2 u_{xx}$ (wave)
- $u_{xx} + u_{yy} = 0$ (Laplace)
- $u_t + c\, u_x = 0$ (transport)
Nonlinear:
- $u_t + u\, u_x = 0$ (Burgers: product of $u$ and $u_x$)
- $u_t = u_{xx} + u^2$ (squared $u$)
- Navier–Stokes (the velocity field is dotted into its own gradient)
- Einstein field equations (curvature depends on the metric nonlinearly)
Before doing anything else with a PDE, check linearity. Linear means “there is a chance separation of variables, Fourier methods, or Green’s functions will work directly.” Nonlinear means “those tools either fail or only apply after a clever transformation, and the qualitative behaviour can be wild.”
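The superposition claim can be checked numerically. The sketch below is only an illustration: the sample solutions, the value of `ALPHA`, and the finite-difference helpers `d1` and `d2` are choices made here, not part of the text. It plugs a combination of two heat-equation solutions into the heat equation, and the sum of two Burgers solutions into Burgers' equation.

```python
import math

ALPHA = 0.3  # assumed diffusivity, chosen only for illustration

def d1(f, z, h=1e-5):
    """Central-difference estimate of f'(z)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)

def d2(f, z, h=1e-3):
    """Central-difference estimate of f''(z)."""
    return (f(z + h) - 2.0 * f(z) + f(z - h)) / (h * h)

def heat_residual(u, x, t):
    """u_t - ALPHA * u_xx; close to zero when u solves the heat equation."""
    return d1(lambda s: u(x, s), t) - ALPHA * d2(lambda s: u(s, t), x)

def burgers_residual(u, x, t):
    """u_t + u * u_x; close to zero when u solves Burgers' equation."""
    return d1(lambda s: u(x, s), t) + u(x, t) * d1(lambda s: u(s, t), x)

# Two exact solutions of the heat equation and a linear combination of them.
def h1(x, t): return math.sin(x) * math.exp(-ALPHA * t)
def h2(x, t): return math.sin(2.0 * x) * math.exp(-4.0 * ALPHA * t)
def h_combo(x, t): return 3.0 * h1(x, t) - 2.0 * h2(x, t)

# Two exact solutions of Burgers' equation and their sum.
def b1(x, t): return x / (1.0 + t)
def b2(x, t): return 1.0
def b_sum(x, t): return b1(x, t) + b2(x, t)

x0, t0 = 0.7, 0.5
print("heat, combination:", heat_residual(h_combo, x0, t0))
print("Burgers, b1      :", burgers_residual(b1, x0, t0))
print("Burgers, b2      :", burgers_residual(b2, x0, t0))
print("Burgers, b1 + b2 :", burgers_residual(b_sum, x0, t0))
```

The heat residual stays at the level of discretization error, while the Burgers residual of the sum equals $1/(1 + t)$: each piece is a solution, their sum is not.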
3. Classification of second-order PDEs
Consider the most general second-order linear PDE in two variables:
$$ A\, u_{xx} + B\, u_{xy} + C\, u_{yy} + \text{(lower-order terms)} = f. $$
The coefficients $A$, $B$, $C$ may depend on $x, y$, but not on $u$. The character of the equation at a point is determined by a single number, the discriminant $B^2 - 4AC$, in exact analogy with conic sections: the sign of $B^2 - 4AC$ tells you whether the curve $Ax^2 + Bxy + Cy^2 = \text{const}$ is an ellipse, a parabola, or a hyperbola. The names of the PDE classes come from that analogy. When the coefficients vary with $x, y$, the sign (and hence the class) can change from one region to another.
| Discriminant | Class | Prototype | Physical character |
|---|---|---|---|
| $B^2 - 4AC < 0$ | Elliptic | Laplace $u_{xx} + u_{yy} = 0$ | Equilibrium / steady state. No preferred direction; the value at any point feels the boundary everywhere. |
| $B^2 - 4AC = 0$ | Parabolic | Heat $u_t - \alpha u_{xx} = 0$ | Diffusion. One-way arrow of time; profiles smooth out and irreversibly approach equilibrium. |
| $B^2 - 4AC > 0$ | Hyperbolic | Wave $u_{tt} - c^2 u_{xx} = 0$ | Propagation. Disturbances travel at finite speed along characteristic curves; behaviour is time-reversible. |
The remarkable fact is that this purely algebraic test on three coefficients tells you, without ever solving the equation, what kind of physics it can model. Elliptic equations describe settled situations; parabolic equations describe spreading; hyperbolic equations describe signaling.
For Laplace $u_{xx} + u_{yy} = 0$: $A = 1, B = 0, C = 1$, so $B^2 - 4AC = -4 < 0$ — elliptic. For heat with variables $(x, t)$: $A u_{xx} + B u_{xt} + C u_{tt}$ has $A = -\alpha, B = 0, C = 0$ (the $u_t$ is lower-order), so $B^2 - 4AC = 0$ — parabolic. For wave $u_{tt} - c^2 u_{xx} = 0$: $A = -c^2, B = 0, C = 1$, so $B^2 - 4AC = 4c^2 > 0$ — hyperbolic.
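The discriminant test is mechanical enough to write as a few lines of code. This is a minimal sketch: the function name `classify`, the tolerance `tol`, and the sample values of `alpha` and `c` are assumptions made for illustration. It expects the coefficients of $u_{xx}$, $u_{xy}$, $u_{yy}$ (or their $(x, t)$ counterparts) at a single point, and it does not guard against the degenerate case where all three are zero.

```python
def classify(A: float, B: float, C: float, tol: float = 1e-12) -> str:
    """Classify A u_xx + B u_xy + C u_yy + (lower order) = f by the sign of B^2 - 4AC."""
    disc = B * B - 4.0 * A * C
    if abs(disc) <= tol:
        return "parabolic"
    return "hyperbolic" if disc > 0 else "elliptic"

alpha, c = 0.5, 2.0  # illustrative diffusivity and wave speed

print("Laplace:", classify(1.0, 0.0, 1.0))
print("heat   :", classify(-alpha, 0.0, 0.0))   # u_t is lower-order and ignored
print("wave   :", classify(-c ** 2, 0.0, 1.0))
```

For variable coefficients the same function can be called point by point, since the class may differ between regions.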
4. The three canonical PDEs
Almost everything you do in a first PDE course revolves around three model equations. Each is the simplest representative of its class and the one whose solution methods generalize.
The heat equation (parabolic)
$$ u_t = \alpha\, u_{xx} $$
$u(x, t)$ is the temperature at position $x$ and time $t$; $\alpha > 0$ is the thermal diffusivity. The equation says the temperature at a point increases when its spatial profile is concave up there, and decreases when it’s concave down. That’s exactly what heat moving from hot regions to cold regions looks like locally.
The same equation appears, with different names, wherever a quantity diffuses: probability density (Fokker–Planck), pollutant concentration, a Brownian particle’s distribution. In every guise the behaviour is the same: local averaging. Sharp features blur out, oscillations decay, and every initial condition eventually relaxes to the equilibrium its boundary conditions allow (the mean of the initial profile when no heat can escape, zero when the ends are held at zero).
The wave equation (hyperbolic)
$$ u_{tt} = c^2\, u_{xx} $$
$u(x, t)$ is the displacement of a string, the pressure deviation in a sound wave, or one component of an electromagnetic field; $c$ is the wave speed. d’Alembert’s solution shows that on an infinite line every solution is a sum of two travelling waves,
$$ u(x, t) = F(x - ct) + G(x + ct), $$
one moving right and one moving left. Disturbances propagate at speed $c$ and only at speed $c$: nothing instantaneous, nothing dissipative. The wave equation conserves the energy of the motion; the heat equation is dissipative and damps every oscillation away.
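A quick numerical check of the travelling-wave form: the sketch below picks two arbitrary smooth profiles `F` and `G` and a wave speed `c` (all three are illustrative choices, not from the text) and evaluates the residual $u_{tt} - c^2 u_{xx}$ with central differences.

```python
import math

c = 2.0  # assumed wave speed

def F(s):
    """Right-moving profile (arbitrary smooth choice)."""
    return math.exp(-s * s)

def G(s):
    """Left-moving profile (arbitrary smooth choice)."""
    return math.sin(s) / (1.0 + s * s)

def u(x, t):
    return F(x - c * t) + G(x + c * t)

def second_diff(f, z, h=1e-3):
    """Central-difference estimate of f''(z)."""
    return (f(z + h) - 2.0 * f(z) + f(z - h)) / (h * h)

def wave_residual(x, t):
    """u_tt - c^2 u_xx, which should vanish up to discretization error."""
    u_tt = second_diff(lambda s: u(x, s), t)
    u_xx = second_diff(lambda s: u(s, t), x)
    return u_tt - c * c * u_xx

for x, t in [(0.3, 0.7), (-1.2, 0.1), (2.0, 1.5)]:
    print(f"x={x:5.2f}, t={t:4.2f}, residual={wave_residual(x, t):.2e}")
```

Any twice-differentiable `F` and `G` should give a residual at the level of the finite-difference error, which is the point of d’Alembert’s formula.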
The Laplace equation (elliptic)
$$ \Delta u = u_{xx} + u_{yy} = 0 $$
(Or with more spatial dimensions, $\Delta u = u_{xx} + u_{yy} + u_{zz} = 0$.) There is no time. $u$ is a steady-state field: temperature in a room that’s reached thermal equilibrium, electrostatic potential in a charge-free region, velocity potential of an incompressible inviscid flow. The defining property is the mean value property: $u$ at the centre of any small ball equals its average over the boundary of that ball. Equilibrium = local averaging at every point.
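The mean value property is easy to test in two dimensions. The sketch below is illustrative only: the sample functions, the centre, and the radius are arbitrary choices. It averages a function over a circle and compares the result with the value at the centre, once for a harmonic function and once for a function that is not harmonic.

```python
import math

def circle_average(u, x0, y0, r, n=360):
    """Average of u over n equally spaced points on the circle of radius r about (x0, y0)."""
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        total += u(x0 + r * math.cos(theta), y0 + r * math.sin(theta))
    return total / n

def harmonic(x, y):
    """e^x cos(y) satisfies u_xx + u_yy = 0."""
    return math.exp(x) * math.cos(y)

def not_harmonic(x, y):
    """x^2 + y^2 has u_xx + u_yy = 4, so it is not harmonic."""
    return x * x + y * y

x0, y0, r = 0.2, -0.1, 0.5
gap_harmonic = circle_average(harmonic, x0, y0, r) - harmonic(x0, y0)
gap_other = circle_average(not_harmonic, x0, y0, r) - not_harmonic(x0, y0)
print("harmonic     : average minus centre =", gap_harmonic)
print("not harmonic : average minus centre =", gap_other)
```

For the harmonic function the gap is at rounding level; for $x^2 + y^2$ the circle average exceeds the centre value by $r^2$.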
One way to see all three at once: imagine running the heat equation forward in time. As $t \to \infty$, $u_t \to 0$, and what remains is $\alpha u_{xx} = 0$ — the steady-state Laplace equation. Laplace is the asymptotic limit of heat. And while heat damps oscillations away, the wave equation lets them propagate forever.
Start the heat equation from a single localized bump and the bump spreads and flattens. Start $u_{tt} = c^2 u_{xx}$ on the line from the same bump, initially at rest, and it doesn’t smear: it splits into two identical bumps, each half the original height, one travelling left and one travelling right at speed $c$, each retaining its shape forever. That’s the hyperbolic signature: propagation without dissipation.
5. Boundary and initial conditions
Write down the heat equation $u_t = \alpha u_{xx}$ and ask “what is $u$?” The honest answer is: you haven’t asked a question yet. There are infinitely many functions satisfying the PDE; you’ve described a rule, not a problem.
To pin down a unique solution you need side data:
- Initial conditions say what $u$ looks like at $t = 0$. For the heat equation: one condition $u(x, 0) = f(x)$. For the wave equation, which is second-order in $t$: two conditions, $u(x, 0) = f(x)$ and $u_t(x, 0) = g(x)$ (position and velocity).
- Boundary conditions say what $u$ is doing on the edges of the spatial domain — the ends of the rod, the rim of the membrane, the walls of the room.
The choice of how you constrain $u$ on the boundary comes in three standard flavours.
Dirichlet, Neumann, Robin
| Type | Form on boundary | Physical meaning |
|---|---|---|
| Dirichlet | $u = g$ specified | The value of $u$ is fixed on the boundary — ice baths at the ends of the rod ($u = 0$), the string clamped at the ends, the potential held at a known voltage. |
| Neumann | $\partial_n u = g$ specified | The flux is fixed — an insulated rod end ($u_x = 0$, no heat flows), a free string end, a wall with prescribed electric field component. |
| Robin | $a\, u + b\, \partial_n u = g$ (constants $a$, $b$) | A mix: heat lost to the surroundings at a rate proportional to the temperature difference (Newton’s law of cooling); a partially absorbing wall. |
(Here $\partial_n u$ is the derivative of $u$ in the direction normal to the boundary, pointing outward.)
The amount of data each class of PDE needs is dictated by its character:
- Elliptic equations (Laplace, Poisson) need boundary conditions on the entire boundary of the spatial region. No initial condition — there’s no time.
- Parabolic equations (heat) need one initial condition plus boundary conditions on the spatial edges for all $t > 0$.
- Hyperbolic equations (wave) need two initial conditions (position and velocity) plus boundary conditions.
A problem is well-posed (in Hadamard’s sense) when (1) a solution exists, (2) it is unique, and (3) it depends continuously on the data — a tiny perturbation of the initial or boundary values produces only a tiny change in the solution. Match the right kind and amount of data to the PDE class and the problem is well-posed; mismatch and the problem can fail any of the three criteria silently.
6. Separation of variables
There is a small set of techniques that anchors a first course in PDEs — separation of variables, Fourier series and transforms, characteristics, Green’s functions, energy methods, similarity solutions. Of these, separation of variables is the one to learn first. It dissolves a PDE into a pair of ODEs and lets you reuse everything you know from the ODE world.
The leap of faith is to guess that $u(x, t)$ factors,
$$ u(x, t) = X(x)\, T(t), $$
and check whether this guess can satisfy the equation. Substitute into a linear PDE and the spatial and temporal pieces try to pull apart. If after dividing by $X T$ you can collect every $x$-dependent thing on one side and every $t$-dependent thing on the other, you’re in business: both sides must equal the same constant (because a function of $x$ alone can only equal a function of $t$ alone if both are constant). That constant, called the separation constant and traditionally written $-\lambda$, couples the two ODEs.
You then solve the ODE for $X$ subject to the spatial boundary conditions. This is an eigenvalue problem: only certain values of $\lambda$ admit nonzero solutions, and those values are the eigenvalues, with corresponding eigenfunctions $X_n(x)$. Each eigenvalue feeds into the ODE for $T$, giving a $T_n(t)$. The pair $X_n(x) T_n(t)$ is one solution — a mode. Because the PDE is linear, you can superpose modes, and the initial condition is satisfied by choosing the coefficients in that superposition. That last step is what calls in Fourier series.
Separation of variables works cleanly when the PDE is linear and homogeneous, the spatial domain is a simple shape (interval, rectangle, disk, sphere), and the boundary conditions are also linear and homogeneous on a coordinate-aligned boundary. When those align, the eigenvalue problem is self-adjoint and the eigenfunctions form a complete orthogonal basis — which is exactly why the final Fourier sum can represent any reasonable initial condition.
7. Walkthrough: heat on $[0, L]$ with Dirichlet ends
Here is the canonical run, end-to-end. The setup:
$$ \begin{aligned} u_t &= \alpha\, u_{xx}, && 0 < x < L,\; t > 0, \\ u(0, t) &= 0, \quad u(L, t) = 0, && t > 0, \\ u(x, 0) &= f(x), && 0 \le x \le L. \end{aligned} $$
Physically: a rod of length $L$, both ends held at zero temperature, an arbitrary initial heat distribution $f(x)$. What does $u(x, t)$ do?
Step 1. Assume a separated form
Try $u(x, t) = X(x)\, T(t)$. Substituting into the PDE:
$$ X(x)\, T'(t) = \alpha\, X''(x)\, T(t). $$
Divide both sides by $\alpha\, X(x)\, T(t)$ (assume neither factor is identically zero):
$$ \frac{T'(t)}{\alpha\, T(t)} = \frac{X''(x)}{X(x)}. $$
Step 2. Separate
The left side depends only on $t$, the right only on $x$. The only way two such functions can be equal is if both are the same constant. Call it $-\lambda$ (the minus sign is convention, chosen so the eigenvalues come out positive in this problem):
$$ \frac{T'(t)}{\alpha\, T(t)} = \frac{X''(x)}{X(x)} = -\lambda. $$
This splits the PDE into two ODEs:
$$ \begin{aligned} X''(x) + \lambda\, X(x) &= 0, \\ T'(t) + \alpha\, \lambda\, T(t) &= 0. \end{aligned} $$
Step 3. Solve the spatial eigenvalue problem
The boundary conditions $u(0, t) = 0$ and $u(L, t) = 0$ translate, since $T(t)$ isn’t identically zero, into $X(0) = 0$ and $X(L) = 0$. So we want nonzero solutions of
$$ X'' + \lambda X = 0, \quad X(0) = X(L) = 0. $$
For $\lambda \le 0$ the only solution satisfying both endpoint conditions is $X \equiv 0$, which is uninteresting. For $\lambda > 0$, write $\lambda = k^2$ with $k > 0$ and the general solution is $X(x) = A \cos(kx) + B \sin(kx)$. The condition $X(0) = 0$ kills $A$, leaving $X(x) = B \sin(kx)$. The condition $X(L) = 0$ then forces $\sin(kL) = 0$, i.e. $kL = n\pi$ for some positive integer $n$. So the eigenvalues and eigenfunctions are
$$ \lambda_n = \left(\frac{n\pi}{L}\right)^2, \qquad X_n(x) = \sin\!\left(\frac{n\pi x}{L}\right), \quad n = 1, 2, 3, \ldots $$
These are the modes the boundary conditions admit: the standing-wave shapes that vanish at both ends.
Step 4. Solve the temporal ODE
For each $\lambda_n$, the $T$ equation $T' + \alpha \lambda_n T = 0$ is a first-order linear ODE with solution
$$ T_n(t) = C_n\, e^{-\alpha \lambda_n t} = C_n\, e^{-\alpha (n\pi/L)^2 t}. $$
Higher modes (larger $n$) decay faster, because $\lambda_n$ scales like $n^2$. That’s the heat equation’s most physical fingerprint: fine spatial features die quickly; broad ones linger.
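To put numbers on the $n^2$ scaling, here is a small sketch that tabulates the decay rate $\alpha \lambda_n$ and the corresponding e-folding time for a few modes. The values of `alpha` and `L` are assumptions picked only for illustration.

```python
import math

alpha = 1.0e-4   # assumed thermal diffusivity in m^2/s (illustrative)
L = 1.0          # assumed rod length in m (illustrative)

def decay_rate(n: int) -> float:
    """Rate alpha * lambda_n appearing in T_n(t) = C_n * exp(-alpha * lambda_n * t)."""
    return alpha * (n * math.pi / L) ** 2

for n in (1, 2, 4, 8):
    tau = 1.0 / decay_rate(n)   # time for mode n to shrink by a factor of e
    print(f"n={n}: rate={decay_rate(n):.3e} 1/s, e-folding time={tau:.1f} s")
```

Each doubling of $n$ multiplies the rate by four and divides the e-folding time by four.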
Step 5. Superpose
Each pair $X_n T_n$ is a solution of the PDE that satisfies the boundary conditions. By linearity, any sum is too:
$$ u(x, t) = \sum_{n=1}^{\infty} B_n \sin\!\left(\frac{n\pi x}{L}\right) e^{-\alpha (n\pi/L)^2 t}. $$
(The constants $C_n$ from step 4 have been absorbed into $B_n$.)
Step 6. Match the initial condition
Setting $t = 0$ in the series gives
$$ u(x, 0) = \sum_{n=1}^{\infty} B_n \sin\!\left(\frac{n\pi x}{L}\right) = f(x). $$
We need to choose the coefficients $B_n$ so this sum reproduces $f(x)$ on $[0, L]$. This is exactly the problem of expanding $f$ in a Fourier sine series. The answer, which we’ll just quote here, is
$$ B_n = \frac{2}{L} \int_0^{L} f(x) \sin\!\left(\frac{n\pi x}{L}\right) dx. $$
And that’s the full solution. Plug $f$ in, compute the integrals, plug the $B_n$ back into the sum.
Read the formula structurally. Each mode $n$ contributes a spatial shape $\sin(n\pi x/L)$ whose amplitude decays in time like $e^{-\alpha (n\pi/L)^2 t}$. Doubling $n$ quadruples the decay rate. So no matter how jagged $f(x)$ starts, after a short time only the lowest few modes survive in any meaningful amount, and after a long time only the slowest one matters — the whole rod relaxes into the gentlest standing-wave shape that fits between the boundary conditions, and from there to zero.
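The recipe from steps 1 to 6 can be sketched in a few lines. This is an illustration under stated assumptions, not a production solver: the integrals for $B_n$ are approximated with a midpoint rule, the series is truncated at `n_modes` terms, and the initial profile $f(x) = x(L - x)$ together with the values of `L` and `alpha` are examples chosen here.

```python
import math

def sine_coefficients(f, L, n_modes, n_quad=2000):
    """Approximate B_n = (2/L) * integral_0^L f(x) sin(n pi x / L) dx by the midpoint rule."""
    h = L / n_quad
    xs = [(j + 0.5) * h for j in range(n_quad)]
    return [
        2.0 / L * h * sum(f(x) * math.sin(n * math.pi * x / L) for x in xs)
        for n in range(1, n_modes + 1)
    ]

def heat_series(x, t, coeffs, L, alpha):
    """Truncated series sum_n B_n sin(n pi x / L) exp(-alpha (n pi / L)^2 t)."""
    return sum(
        B * math.sin(n * math.pi * x / L) * math.exp(-alpha * (n * math.pi / L) ** 2 * t)
        for n, B in enumerate(coeffs, start=1)
    )

L, alpha = 1.0, 0.1                 # illustrative values

def f(x):
    """Sample initial profile that vanishes at both ends."""
    return x * (L - x)

B = sine_coefficients(f, L, n_modes=25)
for t in (0.0, 0.5, 2.0):
    print(f"t={t}: u(L/2, t) = {heat_series(0.5 * L, t, B, L, alpha):.6f}")
```

For this particular $f$ the integrals can also be done by hand, giving $B_n = 8L^2/(n\pi)^3$ for odd $n$ and zero for even $n$, which is a useful check on the quadrature.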
8. The Fourier connection
Notice what just happened in step 6. To finish the heat problem, we had to represent an arbitrary function $f(x)$ on $[0, L]$ as an infinite linear combination of the eigenfunctions $\sin(n\pi x/L)$. The claim that this can always be done — and the formula for the coefficients — is the content of Fourier series.
The right way to see why this works is that the eigenfunctions $\{\sin(n\pi x/L)\}_{n=1}^{\infty}$ are orthogonal with respect to the natural inner product on $[0, L]$:
$$ \int_0^{L} \sin\!\left(\frac{m\pi x}{L}\right) \sin\!\left(\frac{n\pi x}{L}\right) dx = \begin{cases} 0 & m \neq n, \\ L/2 & m = n. \end{cases} $$
So multiplying the series for $f$ by $\sin(m\pi x/L)$ and integrating extracts $B_m$ alone, since every other term integrates to zero. The factor $2/L$ in the coefficient formula is just $1$ divided by that diagonal value $L/2$.
This is not a coincidence and not a trick. The eigenvalue problem $X'' + \lambda X = 0$ with these boundary conditions is self-adjoint, and a general theorem (Sturm–Liouville theory) guarantees that the eigenfunctions of a self-adjoint problem form a complete orthogonal basis for reasonable functions on the domain. Different PDE / geometry / boundary combinations produce different bases — Fourier cosine series, full Fourier series, Bessel function expansions, Legendre polynomials, spherical harmonics — but the structure is always the same.
Separation of variables on a linear PDE in a simple geometry doesn’t just use Fourier series — it’s where they come from.
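The orthogonality relation for the sine eigenfunctions can be confirmed numerically. The helper `sine_inner` below is a name introduced for this sketch; it approximates the inner product with a midpoint rule, and the interval length `L = 2.0` is an arbitrary example.

```python
import math

def sine_inner(m, n, L=1.0, n_quad=2000):
    """Midpoint-rule estimate of integral_0^L sin(m pi x / L) sin(n pi x / L) dx."""
    h = L / n_quad
    total = 0.0
    for j in range(n_quad):
        x = (j + 0.5) * h
        total += math.sin(m * math.pi * x / L) * math.sin(n * math.pi * x / L)
    return total * h

L = 2.0  # illustrative interval length
for m in range(1, 4):
    row = [sine_inner(m, n, L) for n in range(1, 4)]
    print("  ".join(f"{value: .6f}" for value in row))
```

The printed matrix should be close to $L/2$ on the diagonal and close to zero elsewhere.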
9. Common pitfalls
Writing “solve the heat equation” on its own isn’t a complete question. The PDE has infinitely many solutions; only specifying boundary and initial data narrows the set to one.
Choosing $+\lambda$ instead of $-\lambda$ doesn’t change the math — you’ll get the same answer in the end — but it flips which sign of $\lambda$ gives bounded solutions. The convention $-\lambda$ for the heat / Laplace eigenvalue problems is chosen so the physically relevant eigenvalues come out positive. If your $\lambda$’s are coming out negative, suspect a sign convention, not a contradiction.
The elliptic / parabolic / hyperbolic test is a statement about second-order linear PDEs. For nonlinear equations the classification is at best local (it can change region to region) and at worst not meaningful at all. Don’t apply the discriminant to Navier–Stokes and expect a global label.
For the canonical equations on simple geometries with friendly boundary conditions, you can get a series solution. Move to a less symmetric domain, a nonlinearity, or non-constant coefficients and closed-form solutions vanish. “Solving a PDE” in practice often means: prove a solution exists, prove it’s unique, characterize its qualitative behaviour, and compute it numerically. Don’t treat the absence of a formula as the absence of a solution.
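As one example of what "compute it numerically" can look like, here is a minimal explicit finite-difference sketch (forward in time, centred in space) for the Dirichlet heat problem. The scheme is a common textbook choice rather than anything prescribed by the text, and the function name `ftcs_heat`, the grid size, and the parameter values are assumptions. It is compared against the exact single-mode solution.

```python
import math

def ftcs_heat(f, L, alpha, t_end, nx=50):
    """Explicit scheme for u_t = alpha u_xx on [0, L] with u = 0 at both ends.

    Returns the grid values of u at time t_end for initial profile f.
    """
    dx = L / nx
    dt = 0.4 * dx * dx / alpha          # explicit scheme needs alpha*dt/dx^2 <= 1/2
    steps = math.ceil(t_end / dt)
    dt = t_end / steps                  # shrink dt slightly to land exactly on t_end
    r = alpha * dt / (dx * dx)
    u = [f(i * dx) for i in range(nx + 1)]
    u[0] = u[-1] = 0.0                  # Dirichlet ends
    for _ in range(steps):
        new = u[:]
        for i in range(1, nx):
            new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
        u = new
    return u

L, alpha, t_end, nx = 1.0, 0.1, 0.5, 50   # illustrative values
numeric = ftcs_heat(lambda x: math.sin(math.pi * x / L), L, alpha, t_end, nx)
decay = math.exp(-alpha * (math.pi / L) ** 2 * t_end)
exact = [math.sin(math.pi * i / nx) * decay for i in range(nx + 1)]
max_err = max(abs(a - b) for a, b in zip(numeric, exact))
print(f"max |numeric - exact| = {max_err:.2e}")
```

The time step is tied to the square of the grid spacing; taking it larger than the stated bound makes this particular scheme unstable, which is one reason implicit schemes are often preferred in practice.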
10. Worked examples
Try each one before reading the solution.
Example 1 · Classify $u_{xx} - 4 u_{xy} + 4 u_{yy} + u_x = 0$.
Read off $A = 1$, $B = -4$, $C = 4$ (the $u_x$ is lower-order and doesn’t enter the classification). The discriminant is
$$ B^2 - 4AC = 16 - 16 = 0. $$
So the equation is parabolic. (A coordinate change can in fact transform it into the heat equation up to lower-order terms.)
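One way to make the coordinate change concrete is sketched below. The new variables $\xi$ and $\eta$ are a choice made here for illustration; $\xi$ is constant along the characteristic lines $2x + y = \text{const}$, and other choices of $\eta$ would work as well.

$$ \begin{aligned} u_{xx} - 4u_{xy} + 4u_{yy} &= (\partial_x - 2\,\partial_y)^2 u, \\ \xi = 2x + y,\quad \eta = x \;&\Longrightarrow\; \partial_x = 2\,\partial_\xi + \partial_\eta,\quad \partial_y = \partial_\xi,\quad \partial_x - 2\,\partial_y = \partial_\eta, \\ u_{xx} - 4u_{xy} + 4u_{yy} + u_x = 0 \;&\Longrightarrow\; u_{\eta\eta} + 2\,u_\xi + u_\eta = 0. \end{aligned} $$

Only one second derivative survives, $u_{\eta\eta}$, accompanied by first-order terms. That is the structure of the heat equation, with $\xi$ (up to sign and scaling) in the role of the time-like variable.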
Example 2 · Classify $u_{tt} - u_{xx} - u_{yy} = 0$ in the variables $(x, y, t)$.
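Group the second derivatives. In two spatial variables plus time the second-order part is $u_{tt} - (u_{xx} + u_{yy}) = u_{tt} - \Delta_{\text{space}} u$. With three independent variables the two-variable discriminant no longer applies; the type is read off from the signs of the second-derivative coefficients instead. Here they are $+1$ for $u_{tt}$ and $-1$ for each of $u_{xx}$ and $u_{yy}$, with no mixed terms: one sign differs from all the others, which is the hyperbolic pattern. This is the 2-D wave equation. Disturbances propagate at unit speed in all spatial directions; you need initial position $u(x, y, 0)$ and initial velocity $u_t(x, y, 0)$ plus boundary conditions on the spatial domain.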
Example 3 · Heat equation on $[0, L]$ with initial profile $f(x) = \sin(\pi x / L)$.
The initial condition is itself a single eigenfunction — the $n = 1$ mode — so the Fourier series collapses to one term.
The coefficient $B_1$ comes out of
$$ B_1 = \frac{2}{L} \int_0^L \sin\!\left(\frac{\pi x}{L}\right) \sin\!\left(\frac{\pi x}{L}\right) dx = \frac{2}{L} \cdot \frac{L}{2} = 1, $$
and every other $B_n$ is zero by orthogonality. So
$$ u(x, t) = \sin\!\left(\frac{\pi x}{L}\right) e^{-\alpha (\pi/L)^2 t}. $$
The shape doesn’t change; only the amplitude decays exponentially. A single mode is an eigenstate of the time evolution.
Example 4 · Why a Neumann condition $u_x(0, t) = u_x(L, t) = 0$ changes the eigenfunctions.
The spatial ODE is the same: $X'' + \lambda X = 0$. But the boundary conditions on $X$ become $X'(0) = X'(L) = 0$. For $\lambda = k^2 > 0$, the general solution $X = A\cos(kx) + B\sin(kx)$ has $X' = -Ak\sin(kx) + Bk\cos(kx)$, so $X'(0) = 0$ forces $B = 0$ and $X'(L) = 0$ forces $\sin(kL) = 0$, giving $k = n\pi/L$ for $n = 1, 2, \ldots$. For $\lambda = 0$ the general solution is $X = A + Bx$, and the boundary conditions leave the constant $X = A$, which is the $n = 0$ case. The eigenfunctions are cosines, not sines:
$$ X_n(x) = \cos\!\left(\frac{n\pi x}{L}\right), \quad n = 0, 1, 2, \ldots $$
Note especially the $n = 0$ mode: $X_0 \equiv 1$, with eigenvalue $\lambda_0 = 0$, so its time factor is $e^0 = 1$ and it doesn’t decay at all. With insulated ends, the rod conserves heat, and the long-time behaviour is the spatial average of $f$, not zero.
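The long-time claim for insulated ends can be checked with a cosine-series version of the Dirichlet recipe. The coefficient formulas used below, $A_0 = \frac{1}{L}\int_0^L f\,dx$ and $A_n = \frac{2}{L}\int_0^L f \cos(n\pi x/L)\,dx$, are the standard Fourier cosine formulas and are not stated in the text; the function names, the midpoint quadrature, and the sample profile $f(x) = x$ are likewise choices made for this sketch.

```python
import math

def cosine_coefficients(f, L, n_modes, n_quad=2000):
    """Midpoint-rule estimates of A_0 = (1/L) int f and A_n = (2/L) int f cos(n pi x / L)."""
    h = L / n_quad
    xs = [(j + 0.5) * h for j in range(n_quad)]
    a0 = h / L * sum(f(x) for x in xs)
    an = [
        2.0 / L * h * sum(f(x) * math.cos(n * math.pi * x / L) for x in xs)
        for n in range(1, n_modes + 1)
    ]
    return a0, an

def neumann_heat(x, t, a0, an, L, alpha):
    """Truncated series A_0 + sum_n A_n cos(n pi x / L) exp(-alpha (n pi / L)^2 t)."""
    return a0 + sum(
        A * math.cos(n * math.pi * x / L) * math.exp(-alpha * (n * math.pi / L) ** 2 * t)
        for n, A in enumerate(an, start=1)
    )

L, alpha = 1.0, 0.1          # illustrative values

def f(x):
    """Sample initial profile: a linear ramp with spatial average L/2."""
    return x

a0, an = cosine_coefficients(f, L, n_modes=20)
for t in (0.0, 1.0, 50.0):
    print(f"t={t}: u(0.3, t) = {neumann_heat(0.3, t, a0, an, L, alpha):.6f}")
```

As $t$ grows, every term with $n \ge 1$ dies off and the value at any point approaches $A_0$, the spatial average of $f$.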