A graph is the smallest mathematical object that captures the idea of "things and the connections between them." From that one definition flows the entire vocabulary of degree, walks, components, isomorphism — and a handful of recurring graph families that show up in every theorem you'll meet later.
15 min readPrereqs: Sets and set operationsUpdated 2026·05·17
What you'll leave with
The formal definition $G = (V, E)$ and how directed, multi-, weighted, and simple graphs fit inside it.
Degree, the handshake lemma, and why the number of odd-degree vertices is always even.
When to reach for an adjacency matrix vs an adjacency list — the trade-off is dense vs sparse.
The walk/trail/path/cycle distinction and what connectivity means in undirected and directed graphs.
Five equivalent definitions of a tree and why bipartite = "no odd cycle."
1. What a graph is
Graph
A graph is an ordered pair $G = (V, E)$ where $V$ is a finite set of vertices and $E$ is a set of edges, each edge an unordered pair $\{u, v\}$ of vertices.
We write $|V| = n$ for the order and $|E| = m$ for the size. Two vertices $u, v$ are adjacent if $\{u, v\} \in E$; an edge $e$ is incident to the two vertices it joins.
That is the whole abstract content. Everything else — direction, weights, loops, the special families, the great theorems — is layered on top of this one definition. A graph is a set of things plus a record of which pairs are related.
The point of the abstraction is reach. A road map (vertices = intersections, edges = road segments), a social network (people, friendships), a molecule (atoms, bonds), a dependency tree (tasks, "must precede"), and a circuit (terminals, wires) are all the same object once you squint past the labels. Any theorem about graphs is a theorem about all of these at once.
Note · drawings are not the graph
A graph is its vertex and edge sets — not the picture you draw of it. The same graph can be drawn many ways; two graphs that look completely different on paper can be the same graph. We'll make this precise in §9 on isomorphism.
2. Flavors: directed, multi-, weighted
The base definition gives a simple, undirected graph — at most one edge per pair, no edge from a vertex to itself. Four variations cover almost every graph you'll meet.
Flavor
What changes
Models
Simple
At most one edge per pair; no self-loops.
Friendship, co-authorship, plain road maps.
Multigraph
Multiple edges between the same pair allowed.
Königsberg bridges; double/triple chemical bonds; multi-lane roads.
Directed (digraph)
Edges are ordered pairs $(u, v)$, drawn as arrows.
Twitter follows, web links, task dependencies, one-way streets.
These flavors are independent and stack freely: a real road network is often a weighted multigraph with some directed edges. A pseudograph additionally allows self-loops $\{v, v\}$ — useful as a technical convenience but rarely the centerpiece.
Notation
For an undirected edge between $u$ and $v$, the set notation $\{u, v\}$ emphasizes that order doesn't matter: $\{u, v\} = \{v, u\}$. For a directed edge from $u$ to $v$, the tuple notation $(u, v)$ makes the asymmetry explicit: $(u, v) \neq (v, u)$.
3. Degree and the handshake lemma
The simplest invariant of a vertex is how many edges touch it.
Degree
The degree $\deg(v)$ of a vertex is the number of edges incident to $v$. In a pseudograph, a self-loop counts twice. The maximum and minimum degrees are written $\Delta(G)$ and $\delta(G)$. A graph in which every vertex has the same degree $k$ is $k$-regular.
For a directed graph, split into in-degree $\deg^{-}(v)$ (arrows pointing in) and out-degree $\deg^{+}(v)$ (arrows pointing out).
Now count edges two different ways. Walk through the graph and sum the degree of every vertex. Each edge $\{u, v\}$ touches exactly two vertices, so it gets counted exactly twice. That's the entire content of the most useful identity in graph theory.
Handshake lemma
For any (undirected) graph,
$$ \sum_{v \in V} \deg(v) \;=\; 2|E|. $$
Equivalently, the number of vertices of odd degree is always even.
The name comes from the obvious social fact: at any party, the number of people who have shaken an odd number of hands must be even. (Pair up the shakes: each handshake adds one to each of two counts. The total count is even; the odd-counted people can only come in pairs.)
For digraphs, the same counting argument splits cleanly: each arc adds one to its tail's out-degree and one to its head's in-degree, so
Whenever you write down a "degree sequence" — the list of vertex degrees — its sum must be even. A claimed graph with degree sequence $(3, 3, 3, 3, 3)$ sums to 15 and so cannot exist. This is the first test for whether a sequence is graphical.
4. Adjacency matrix vs adjacency list
To do anything algorithmic with a graph you have to store it. Two representations dominate, and the choice between them turns on one thing: density.
Adjacency matrix
Number the vertices $1, 2, \ldots, n$ and build the $n \times n$ matrix $A$ with
For an undirected simple graph $A$ is symmetric with zero diagonal. For a directed graph it need not be symmetric. For a weighted graph, store the weight $w(\{v_i, v_j\})$ instead of $1$.
Cost: $O(n^2)$ space, regardless of edge count. Constant-time "is there an edge between $i$ and $j$?" lookup. Iterating over the neighbors of one vertex requires scanning a whole row of length $n$, even if it has only one neighbor.
Adjacency list
For each vertex $v$, store a list of its neighbors $N(v) = \{u : \{u, v\} \in E\}$. Total storage is $O(n + m)$: one slot per vertex, two list-entries per undirected edge.
Cost: cheap neighbor iteration (you read exactly $\deg(v)$ entries), but checking "is $\{u, v\}$ an edge?" requires scanning $N(u)$ — $O(\deg(u))$ time, which is fine on average but slow in the worst case unless you back the list with a hash set.
When to reach for which
Matrix
List
Space
$O(n^2)$
$O(n + m)$
Edge query $\{u, v\} \in E$?
$O(1)$
$O(\deg(u))$
Iterate neighbors of $v$
$O(n)$
$O(\deg(v))$
Best when
$m \approx n^2$ (dense)
$m \ll n^2$ (sparse)
Real-world graphs — social networks, road maps, the web — are overwhelmingly sparse: each vertex has only a few dozen neighbors out of millions. Adjacency lists are the default for any graph with $n$ above a few thousand. Matrices come back into their own when you actually want matrix algebra: counting walks (the $k$-th power of $A$ counts walks of length $k$), spectral methods, or working on dense graphs where every pair is connected by default.
5. Walks, trails, paths, cycles
Almost every algorithmic question about graphs is about traversal — moving from vertex to vertex along edges. There is a small vocabulary of names distinguished by what is allowed to repeat.
Name
Repeated vertices?
Repeated edges?
Closed (ends where it started)?
Walk
Yes
Yes
Either
Trail
Yes
No
Either
Path
No
No (forced)
No
Circuit
Yes
No
Yes (closed trail)
Cycle
No (except start = end)
No
Yes
The length of a walk is the number of edges it traverses. The distance $d(u, v)$ between two vertices is the length of a shortest path between them (or $\infty$ if there's no path at all). The girth of a graph is the length of its shortest cycle.
Common confusion
Students often conflate "trail" and "path." The mnemonic: a trail can revisit places, but never the same road; a path can't even revisit places. Every path is a trail; not every trail is a path.
6. Connectivity and components
An undirected graph is connected if there is a path between every pair of vertices. A graph that isn't connected breaks into maximal connected pieces — its connected components. Every vertex lies in exactly one component; you can find them with a BFS or DFS from any unvisited starting point.
A graph with three components: a triangle, a path on three vertices, and an isolated vertex.
The directed case: strong vs weak
For a digraph the picture splits in two, because direction matters.
Strongly connected. For every ordered pair $(u, v)$, there is a directed path from $u$ to $v$. You can get anywhere from anywhere along the arrows.
Weakly connected. The underlying undirected graph (forget the arrowheads) is connected. You can get anywhere if you're allowed to walk against the flow.
Every strongly connected digraph is weakly connected; the converse fails. A directed path $a \to b \to c$ is weakly connected but not strongly — there's no way to reach $a$ from $c$. Like undirected components, a digraph decomposes into strongly connected components, and the way those components sit relative to one another always forms a directed acyclic graph (a DAG).
7. A gallery of special families
A handful of graphs show up over and over as extremes, archetypes, and test cases. Learn to recognize them by sight; every theorem in graph theory was first tested against this list.
Complete graph $K_5$
Every pair adjacent. $|E| = \binom{n}{2}$. $K_n$ is $(n-1)$-regular.
Path graph $P_5$
$n$ vertices in a line, $n-1$ edges. Endpoints have degree 1; interior degree 2.
Cycle graph $C_6$
$n$ vertices in a closed loop, $n$ edges. 2-regular. Bipartite iff $n$ is even.
Complete bipartite $K_{3,3}$
Two sides $A, B$; every $A$-vertex joined to every $B$-vertex. $|E| = mn$.
A tree on 7 vertices
Connected and acyclic. Exactly $n - 1$ edges. Unique path between any two vertices.
Star graph $K_{1,6}$
One central vertex joined to $n$ leaves. A tree, written $K_{1,n}$. Diameter 2.
Wheel graph $W_6$
A cycle $C_n$ plus a hub vertex connected to every rim vertex.
Each of these has a one-line summary worth memorizing:
$K_n$: every pair connected. As dense as a simple graph can get.
$P_n$: a single line of $n$ vertices, $n-1$ edges. The simplest connected graph.
$C_n$: $P_n$ wrapped into a loop. The simplest graph with a cycle.
$K_{m,n}$: two sides, every cross-pair connected. The archetypal bipartite graph.
Tree: connected and acyclic. The simplest hierarchical structure.
Star $K_{1,n}$: one hub, $n$ leaves. Maximum-diameter-2, minimum-edge-count graph spanning $n+1$ vertices.
Wheel $W_n$: cycle plus hub. The smallest interesting graph that is neither a tree nor cycle nor complete.
8. Trees
Trees are the most useful graphs in computer science, because they are the simplest graphs in which every pair of vertices is joined by a unique path. They model hierarchies, parse structures, dependency layouts, and any decision process that doesn't loop back on itself.
Tree (six equivalent definitions)
Let $G$ be a graph on $n$ vertices. The following are equivalent:
$G$ is connected and acyclic.
$G$ is connected and has exactly $n - 1$ edges.
$G$ is acyclic and has exactly $n - 1$ edges.
Every two vertices of $G$ are joined by a unique path.
$G$ is connected, but removing any edge disconnects it (every edge is a bridge).
$G$ is acyclic, but adding any new edge creates a cycle.
Any one of these is taken as the definition of a tree; the others then become theorems.
The "two of three" pattern in conditions 1–3 is worth absorbing. Knowing any two of connected, acyclic, and $n-1$ edges forces the third. Drop any one and you can have a graph that fails the others: a triangle has the right edge count for $n = 3$ but is cyclic; two disjoint edges on 4 vertices are acyclic with the wrong count and are disconnected.
Leaves, forests, rooted trees
A leaf is a vertex of degree 1. Every tree on at least two vertices has at least two leaves — an easy consequence of the handshake lemma and the edge count: degrees sum to $2(n-1) = 2n - 2$, but if fewer than two vertices had degree 1, the average degree would be at least $2(n-1)/n$, contradicting the bound.
A forest is an acyclic graph — equivalently, a disjoint union of trees. A forest with $n$ vertices and $c$ components has exactly $n - c$ edges.
A rooted tree is a tree together with one distinguished vertex called the root. The choice of root induces a parent–child direction on every edge: each non-root vertex has a unique parent (the next vertex along the path to the root), and children are everything else. The same underlying tree can be rooted at any of its $n$ vertices, producing $n$ different rooted trees with the same shape but different hierarchies. An unrooted tree is just a tree, considered as a graph with no special vertex.
Why uniqueness of paths matters
The fact that any two vertices of a tree are joined by exactly one path is what makes trees indispensable for hierarchies. File systems, syntax trees, decision trees, phylogenetic trees, and binary search trees all exploit it: there is one way to get from here to there, so the algorithm can't get confused by choices.
9. Subgraphs and isomorphism
Subgraphs
A subgraph of $G = (V, E)$ is a graph $H = (V', E')$ with $V' \subseteq V$ and $E' \subseteq E$ such that every edge in $E'$ has both endpoints in $V'$. Two flavors matter often enough to name:
Spanning subgraph: $V' = V$, but $E'$ may be smaller. A spanning tree of a connected graph is the standard example.
Induced subgraph $G[V']$: take $V' \subseteq V$, and let $E'$ be all edges of $G$ whose endpoints lie in $V'$. You don't get to pick — induced means you take whatever edges $G$ already has.
Isomorphism
Two graphs that look different on paper might be the same graph with relabeled vertices. The formal name for this sameness is isomorphism.
Graph isomorphism
Graphs $G$ and $H$ are isomorphic, written $G \cong H$, if there exists a bijection $\varphi \colon V(G) \to V(H)$ such that for every pair $u, v \in V(G)$,
$$ \{u, v\} \in E(G) \;\iff\; \{\varphi(u), \varphi(v)\} \in E(H). $$
That is, $\varphi$ matches vertices to vertices in a way that exactly preserves the adjacency relation.
Isomorphic graphs share every structural property: same number of vertices and edges, same degree sequence, same number of components, same diameter, same girth, same chromatic number, and so on. Any property that can be stated without naming specific vertices is an isomorphism invariant.
The converse fails. Two graphs can match on every invariant you can think of and still be non-isomorphic — invariants narrow the search but don't settle it. This is the heart of the graph isomorphism problem:
Given two graphs $G$ and $H$, decide whether $G \cong H$.
This problem is famously subtle. It clearly lies in NP — you can verify a proposed bijection in polynomial time. It is not known to be NP-complete, and it is not known to be in P. In 2015, László Babai gave a quasi-polynomial-time algorithm ($n^{O((\log n)^c)}$), the best known. Graph isomorphism is one of the most-studied "intermediate" problems in computational complexity.
Quick checks to rule out isomorphism
If you suspect two graphs aren't isomorphic, look for any invariant that differs: vertex count, edge count, degree sequence, number of triangles, diameter, connected components. A single mismatch settles it. Matching on all of these doesn't prove isomorphism, but mismatching even once disproves it.
10. Bipartite graphs and odd cycles
A graph is bipartite if its vertex set splits into two disjoint parts $A$ and $B$ such that every edge runs between $A$ and $B$ — never inside $A$, never inside $B$. The pieces $A$ and $B$ are called the parts of the bipartition.
Bipartiteness is the right model whenever the world has two kinds of things and a relation between kinds. Job applicants and jobs (with edges for "qualified for"). Students and courses (with edges for "enrolled in"). Customers and products (with edges for "purchased"). Sources and sinks. Bipartite structure is the precondition for matching algorithms — and matching is the backbone of assignment, scheduling, and network-flow problems.
The König characterization
There is an exact, purely structural test for bipartiteness, due to Dénes Kőnig:
König's two-coloring theorem
A graph $G$ is bipartite if and only if it contains no cycle of odd length.
One direction is immediate: if you have a valid 2-coloring of the vertices into $A$ and $B$, then any cycle must alternate colors as it traverses edges, and alternating colors $A, B, A, B, \ldots$ can only return to its start after an even number of steps.
The other direction is the substantive content. Given a graph with no odd cycle, you can 2-color it by BFS: pick a starting vertex, color it $A$, color its neighbors $B$, color their neighbors $A$, and so on. If two neighbors ever end up with the same color, you've found a conflict; trace it back through the BFS tree and you can extract an odd cycle. No odd cycle, no conflict, valid 2-coloring — done.
This is also the algorithm a computer uses to test bipartiteness: run BFS, two-color as you go, fail the moment you see two adjacent vertices with the same color. Running time $O(n + m)$ — as fast as just reading the graph.
Quick sight tests
The complete bipartite graphs $K_{m,n}$ are bipartite by construction. The cycle $C_n$ is bipartite iff $n$ is even. Trees are always bipartite (no cycles at all, so vacuously no odd cycles). Any graph that contains a triangle is not bipartite.
11. Common pitfalls
Walk, trail, path
Three words that get mixed up constantly. Walk allows everything. Trail bans repeated edges but allows repeated vertices. Path bans repeated vertices, which automatically bans repeated edges. When a textbook says "path" check whether they really mean "walk" — older European books sometimes use the looser convention.
Subgraph vs induced subgraph
A subgraph can drop edges freely; an induced subgraph must include every edge between its chosen vertices. Theorems about "forbidden induced subgraphs" become wrong if you slip and read "subgraph" — the constraint is much weaker.
Same degree sequence is not enough
Two graphs with identical degree sequences can still be non-isomorphic. Matching degree sequences is a necessary condition for isomorphism, never a sufficient one. Check more invariants — or build the bijection explicitly.
Strongly vs weakly connected (digraphs only)
For undirected graphs there is just one notion of connectedness. For digraphs there are two, and they are not the same — weak connectivity is much weaker. Any time you write "connected digraph," stop and ask which one you mean.
Self-loops break "degree = neighbor count"
In a simple graph $\deg(v) = |N(v)|$ exactly. In a pseudograph a self-loop $\{v, v\}$ contributes $2$ to $\deg(v)$ — so the handshake lemma keeps working. Don't quietly drop the factor of two when loops are around.
12. Worked examples
Example 1 · Can a graph have degree sequence $(5, 5, 4, 3, 2, 1)$?
Step 1. Sum the degrees: $5 + 5 + 4 + 3 + 2 + 1 = 20$. The handshake lemma requires the sum to be even — $20$ passes.
Step 2. Both maxima are $5$, and there are $6$ vertices. In a simple graph $\deg(v) \le n - 1 = 5$ for every vertex, so this is on the edge but not over.
Step 3. Try to build one. Vertices $a$ and $b$ each have degree $5$, so each must be adjacent to all five other vertices. That alone gives the degree-$1$ vertex two neighbors ($a$ and $b$), contradicting its degree.
Conclusion. No such simple graph exists. The handshake parity check passed but a structural obstruction killed it. Parity is necessary, never sufficient.
Example 2 · How many edges does the complete bipartite graph $K_{4, 7}$ have?
By definition every vertex on the size-$4$ side is connected to every vertex on the size-$7$ side, so
$$ |E(K_{4,7})| = 4 \cdot 7 = 28. $$
As a sanity check, every degree on the size-$4$ side is $7$ and every degree on the size-$7$ side is $4$, giving total degree $4 \cdot 7 + 7 \cdot 4 = 56 = 2 \cdot 28$. ✓
Example 3 · Is $C_7$ bipartite?
$C_7$ is itself a cycle of length $7$, which is odd. By König's theorem, a graph is bipartite iff it has no odd cycle — but here the entire graph is an odd cycle. So $C_7$ is not bipartite.
More generally, $C_n$ is bipartite iff $n$ is even.
Example 4 · A tree on $n$ vertices has how many edges?
Exactly $n - 1$. The proof is by induction on $n$. The base case $n = 1$ is a single isolated vertex with $0 = n - 1$ edges. For the inductive step, every tree on $n \ge 2$ vertices has a leaf $v$ (proved using the handshake lemma); remove $v$ and its single incident edge. The remainder is a tree on $n - 1$ vertices (still connected and acyclic), with $(n - 1) - 1$ edges by induction — so the original had one more, namely $n - 1$.
Example 5 · A graph has 4 components on 12 vertices. What's the maximum possible number of edges?
Edges within a component are bounded by $\binom{k}{2}$ where $k$ is the component's vertex count; edges between components don't exist (separate components, by definition, are not connected). To maximize total edges, we want to make components as unequal as possible: cram everything into one giant component.
With $4$ components and $12$ vertices, the most-unequal split is $9$ vertices in one component and three isolated vertices in the others. That gives a maximum of $\binom{9}{2} = 36$ edges.
(The split $3,3,3,3$ would give only $4 \cdot \binom{3}{2} = 12$ edges — much less.)
Sources & further reading
The content above synthesizes the foundational chapters of standard graph theory texts and the corresponding lecture material. For the deep dives — Kuratowski, Ramsey, spectral methods, algorithmic complexity — follow these to the source.
The standard graduate-level reference. Rigorous, encyclopedic; chapters 1–2 cover the entire content of this page and then some. Reach for this when you want a definition stated with full mathematical care.
Full lecture series and notes for MIT's discrete math course. Units 5–6 are devoted to graphs and trees with a computer-science slant — exactly the angle this page is written from.
Graph TheoryTextbookJ. A. Bondy & U. S. R. Murty · Springer GTM 244
The successor to Bondy & Murty's classic Graph Theory with Applications. Cleaner notation, modern coverage, excellent for the structural side (isomorphism, connectivity, matchings).
Dense, formal definitions of every term used here with cross-links to specialized articles ($K_n$, trees, isomorphism). Best when you want the canonical mathematical phrasing in a hurry.
Interactive problem-driven walkthrough of the basics. Best after you've read a textbook chapter — it drills intuition through worked exercises rather than presenting theory linearly.
A broad overview that surveys the variants (simple, multi, hyper, infinite) and connects to dozens of specialized articles. Good for placing this topic in the wider mathematical landscape and finding adjacent topics.
Test your understanding
A quiz that builds from easy to hard. Pick an answer to get instant feedback and a worked explanation. Your progress is saved in this browser — come back anytime to continue.