1. The eight axioms
A vector space starts with two ingredients and a list of rules.
A vector space over a field $F$ is a set $V$ equipped with two operations
$+ \colon V \times V \to V \qquad \cdot \colon F \times V \to V$
(vector addition and scalar multiplication) such that the eight axioms below hold for every $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$ and every $a, b \in F$.
The field $F$ is almost always $\mathbb{R}$ or $\mathbb{C}$ in practice — think of it as "the kind of number you're allowed to multiply by." The set $V$ is the home of the vectors. The eight axioms below say nothing about what the vectors are; they only constrain how the two operations behave.
$(a + b)\mathbf{v} = a\mathbf{v} + b\mathbf{v}$
Some textbooks list ten — they split closure into "+ stays in $V$" and "$\cdot$ stays in $V$," and split distributivity into its two halves. Others list as few as four (treating $(V, +)$ as an abelian group up front). The content is identical; only the bookkeeping changes. We use the eight-axiom convention because it's the one Axler, Strang, and most undergraduate texts adopt.
What the axioms don't mention is just as telling. No length. No angle. No notion of "between." Those structures live one level up, in inner product spaces. A vector space alone gives you arithmetic: a way to combine things linearly. That's it — and that's enough to build most of linear algebra.
2. A zoo of vector spaces
The reason the axioms are powerful is that they're satisfied by an enormous variety of objects. Every theorem proved "for a general vector space $V$" applies to all of them at once.
| The space | A "vector" is | How to add | How to scale | $\dim$ |
|---|---|---|---|---|
| $\mathbb{R}^n$ | an $n$-tuple of reals | componentwise | componentwise | $n$ |
| $M_{m \times n}(\mathbb{R})$ | an $m \times n$ matrix | entrywise | entrywise | $mn$ |
| $P_n$ | a polynomial of degree $\leq n$ | add coefficients | scale coefficients | $n + 1$ |
| $\mathbb{R}[x]$ | any polynomial | add coefficients | scale coefficients | $\infty$ |
| $C[a, b]$ | a continuous function on $[a,b]$ | $(f+g)(x) = f(x) + g(x)$ | $(c f)(x) = c \, f(x)$ | $\infty$ |
| $\{y : y'' + y = 0\}$ | a solution of an ODE | sum of solutions | scalar multiple of a solution | $2$ |
| $\{\mathbf{0}\}$ | just $\mathbf{0}$ | $\mathbf{0} + \mathbf{0} = \mathbf{0}$ | $c \cdot \mathbf{0} = \mathbf{0}$ | $0$ |
Read that table again with fresh eyes. The same eight axioms describe a tuple of numbers, a rectangular array of numbers, a continuous function on an interval $[a,b]$, and the family of curves solving an ODE. The unifying claim of linear algebra is this: once you've checked the axioms, you've earned every theorem about vector spaces for free, on objects that look nothing like arrows.
Take a statement you already understand for $\mathbb{R}^n$, say, "the solutions of $A\mathbf{x} = \mathbf{0}$ form a subspace." Now replace "tuple" with "polynomial," "function," or "matrix." Most of what you knew still works, and the parts that don't usually signal genuinely new structure (like infinite-dimensionality) rather than a break in the framework.
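As a small illustration of swapping "tuple" for "polynomial," the sketch below stores a polynomial in $P_2$ as its list of coefficients and adds and scales it componentwise, exactly as one would in $\mathbb{R}^3$. The helper names (`poly_add`, `poly_scale`, `poly_eval`) are invented for this example and are not from any library.

```python
# A polynomial a0 + a1*x + a2*x^2 is stored as the coefficient list [a0, a1, a2].
# (Assumption for this sketch: both polynomials are padded to the same length.)

def poly_add(p, q):
    """Add two polynomials by adding matching coefficients."""
    return [a + b for a, b in zip(p, q)]

def poly_scale(c, p):
    """Multiply a polynomial by the scalar c, coefficient by coefficient."""
    return [c * a for a in p]

def poly_eval(p, x):
    """Evaluate the polynomial at the number x."""
    return sum(a * x**k for k, a in enumerate(p))

p = [1, 0, 2]    # 1 + 2x^2
q = [0, 3, -1]   # 3x - x^2

s = poly_add(p, q)     # [1, 3, 1], i.e. 1 + 3x + x^2
t = poly_scale(5, p)   # [5, 0, 10], i.e. 5 + 10x^2

# The coefficientwise sum agrees with the pointwise sum of the functions.
print(poly_eval(s, 2), poly_eval(p, 2) + poly_eval(q, 2))  # 11 11
```

The code never needs to know that the lists represent polynomials; the arithmetic is the arithmetic of $\mathbb{R}^3$.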
3. Subspaces
A subspace is a vector space sitting inside another vector space. Formally:
A subset $W \subseteq V$ is a subspace when, with the addition and scalar multiplication inherited from $V$, it is itself a vector space.
You don't have to re-check all eight axioms. Inheritance does most of the work: associativity, commutativity, and distributivity hold in $W$ because they already hold in $V$. What you do have to verify is that $W$ contains the zero vector and is closed under the two operations; additive inverses then come for free, since $-\mathbf{v} = (-1)\mathbf{v}$. This collapses to a three-point test:
(i) $\mathbf{0} \in W$ — the zero vector of $V$ lies in $W$.
(ii) If $\mathbf{u}, \mathbf{v} \in W$, then $\mathbf{u} + \mathbf{v} \in W$ — closed under addition.
(iii) If $\mathbf{v} \in W$ and $c \in F$, then $c\mathbf{v} \in W$ — closed under scalar multiplication.
Examples and non-examples
Subspaces of $\mathbb{R}^2$:
- $\{\mathbf{0}\}$: the trivial subspace
- Any line through the origin, e.g. $\{(t, 2t) : t \in \mathbb{R}\}$
- All of $\mathbb{R}^2$
- $\{(x, y) : x + y = 0\}$: a line through $\mathbf{0}$
Not subspaces of $\mathbb{R}^2$:
- $\{(x, y) : x + y = 1\}$: doesn't contain $\mathbf{0}$
- $\{(x, y) : x, y \geq 0\}$: not closed under scaling by $-1$
- $\{(x, y) : xy = 0\}$: $(1,0) + (0,1) = (1,1)$ escapes
- The unit circle $\{(x, y) : x^2 + y^2 = 1\}$: misses $\mathbf{0}$ and fails both closure conditions
Geometrically, the subspaces of $\mathbb{R}^n$ are exactly the linear flats through the origin: the origin itself, lines through it, planes through it, and so on up to all of $\mathbb{R}^n$. Anything curved, anything offset from the origin, anything one-sided — disqualified.
4. Span and linear independence
Once you have a handful of vectors, you can build new ones by combining them.
A linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_k$ is any vector of the form
$c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_k \mathbf{v}_k$
with scalars $c_1, \ldots, c_k \in F$.
The set of all linear combinations is called the span:
$$ \operatorname{span}(\mathbf{v}_1, \ldots, \mathbf{v}_k) = \left\{ \sum_{i=1}^{k} c_i \mathbf{v}_i : c_i \in F \right\} $$
The span is always a subspace; in fact, it's the smallest subspace that contains all the $\mathbf{v}_i$. Adding more vectors can only enlarge the span (or leave it alone, if a new vector was already reachable).
The natural question follows: when does adding a vector actually enlarge the span, and when is it redundant?
The vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$ are linearly independent if the only way to write
$c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k = \mathbf{0}$
is to take every $c_i = 0$. If some choice of scalars, not all zero, produces $\mathbf{0}$, the vectors are linearly dependent.
Dependence is the algebraic way of saying "one of these is redundant." If $c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k = \mathbf{0}$ with some $c_j \neq 0$, you can solve for $\mathbf{v}_j$ as a combination of the others — so dropping it doesn't shrink the span.
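Spelled out, the "solve for $\mathbf{v}_j$" step is a single division by the nonzero coefficient $c_j$:

$$ c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k = \mathbf{0},\; c_j \neq 0 \quad\Longrightarrow\quad \mathbf{v}_j = -\frac{1}{c_j} \sum_{i \neq j} c_i \mathbf{v}_i. $$

For instance, with $\mathbf{v}_1 = (1, 2)$ and $\mathbf{v}_2 = (2, 4)$ the relation $2\mathbf{v}_1 - \mathbf{v}_2 = \mathbf{0}$ has $c_2 = -1 \neq 0$, and solving gives $\mathbf{v}_2 = 2\mathbf{v}_1$. Note that the division is only possible for an index with $c_j \neq 0$; a vector whose coefficient is zero in every relation cannot be eliminated this way.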
How to test it concretely
In $\mathbb{R}^n$, the test is mechanical. Stack your candidate vectors as the columns of a matrix $A$, then ask: does $A\mathbf{c} = \mathbf{0}$ have only the trivial solution $\mathbf{c} = \mathbf{0}$?
- Square matrix: independent iff $\det A \neq 0$.
- Any matrix: independent iff $A$ has full column rank (every column is a pivot column after row reduction).
- More vectors than the ambient dimension: always dependent. $k > n$ vectors in $\mathbb{R}^n$ cannot be independent.
Two vectors in $\mathbb{R}^2$ are dependent iff they lie on the same line through the origin. Three vectors in $\mathbb{R}^3$ are dependent iff they lie in a common plane through the origin. Dependence is the algebraic shadow of "flat against a lower-dimensional subspace."
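The full-column-rank test can be sketched in a few lines of Python. This is a minimal illustration, not a numerically robust routine: it uses exact `Fraction` arithmetic from the standard library to avoid rounding, and the names `rank` and `is_independent` are chosen for this example.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows), via Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0])):
        # look for a nonzero entry in this column, at or below row r
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_independent(vectors):
    """Stack the vectors as columns of A and test for full column rank."""
    a = [list(row) for row in zip(*vectors)]  # vectors become columns
    return rank(a) == len(vectors)

print(is_independent([(1, 1), (1, -1)]))                   # True
print(is_independent([(1, 2, 3), (2, 4, 6), (0, 1, 0)]))   # False
print(is_independent([(1, 0), (0, 1), (1, 1)]))            # False: 3 vectors in R^2
```

The last call shows the "more vectors than the ambient dimension" rule: a $2 \times 3$ matrix has rank at most 2, so three columns can never all be pivot columns.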
5. Basis and dimension
A basis is the Goldilocks set: just enough vectors to reach everything, with none to spare.
A basis of $V$ is a set $\{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$ that is
(1) linearly independent, and
(2) spans $V$.
Equivalently — and this is the version that does most of the heavy lifting later — a basis is a set such that every $\mathbf{v} \in V$ can be written as a linear combination of the basis vectors in exactly one way. Existence comes from spanning; uniqueness comes from independence.
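The uniqueness half is a one-line computation. Suppose a vector had two expansions in the basis, with coefficients $c_i$ and $d_i$. Subtracting them gives

$$ \mathbf{v} = \sum_{i=1}^{n} c_i \mathbf{b}_i = \sum_{i=1}^{n} d_i \mathbf{b}_i \quad\Longrightarrow\quad \sum_{i=1}^{n} (c_i - d_i)\, \mathbf{b}_i = \mathbf{0} \quad\Longrightarrow\quad c_i = d_i \text{ for every } i, $$

where the last step is exactly the definition of linear independence applied to the scalars $c_i - d_i$.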
The dimension theorem
A vector space has many different bases. The standard basis of $\mathbb{R}^2$ is $\{(1,0), (0,1)\}$, but $\{(1,1), (1,-1)\}$ is a basis too, and so is $\{(3, 7), (-2, 5)\}$, and so are infinitely many others. Yet every one of them has the same number of elements.
If $V$ has a finite basis, then every basis of $V$ has the same number of vectors. That common number is the dimension of $V$, written $\dim V$.
The proof rests on the exchange lemma: given a spanning set $S$ and an independent set $I$, you can swap the vectors of $I$ into $S$ one at a time without losing the spanning property. The conclusion is that $|I| \leq |S|$. Apply this to two bases $B_1$ and $B_2$ in both directions and you get $|B_1| \leq |B_2|$ and $|B_2| \leq |B_1|$ — so the sizes are equal.
This is not a curiosity. It's what makes "dimension" a meaningful, intrinsic property of the space rather than an artifact of how you chose to coordinate it. Two finite-dimensional vector spaces over the same field are isomorphic if and only if they have the same dimension.
| Space | A natural basis | $\dim$ |
|---|---|---|
| $\mathbb{R}^n$ | $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ — standard basis | $n$ |
| $M_{2 \times 2}(\mathbb{R})$ | $\{E_{11}, E_{12}, E_{21}, E_{22}\}$ — one 1 per slot | $4$ |
| $P_n$ | $\{1, x, x^2, \ldots, x^n\}$ — monomials | $n + 1$ |
| $\{y : y'' + y = 0\}$ | $\{\sin x, \cos x\}$ | $2$ |
| $\{\mathbf{0}\}$ | $\emptyset$ — the empty basis | $0$ |
6. Coordinates and change of basis
Pick a basis $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$ of $V$. Every vector $\mathbf{v}$ then has a unique tuple of scalars $(c_1, \ldots, c_n)$ such that
$$ \mathbf{v} = c_1 \mathbf{b}_1 + c_2 \mathbf{b}_2 + \cdots + c_n \mathbf{b}_n. $$
That tuple is the coordinate vector of $\mathbf{v}$ in basis $B$, written
$$ [\mathbf{v}]_B = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}. $$
Coordinates are how an abstract vector becomes a concrete column of numbers you can compute with. The catch, and the source of most beginner confusion, is that the column depends on the basis. The same $\mathbf{v}$ has different coordinates in different bases.
The change-of-basis matrix
Suppose $B$ and $C$ are two bases of $V$. To convert coordinates from $B$ to $C$, write each $B$-vector in terms of $C$ and stack those columns:
$$ P_{C \leftarrow B} = \Big[\, [\mathbf{b}_1]_C \;\;|\;\; [\mathbf{b}_2]_C \;\;|\;\; \cdots \;\;|\;\; [\mathbf{b}_n]_C \,\Big]. $$
Then for any vector $\mathbf{v}$,
$$ [\mathbf{v}]_C = P_{C \leftarrow B} \, [\mathbf{v}]_B. $$
The matrix is always invertible (its columns are an independent set, by construction), and the inverse swaps the roles: $P_{B \leftarrow C} = (P_{C \leftarrow B})^{-1}$.
A surprising number of bugs come from getting this backwards. Mnemonic: "columns of $P$ are old-basis vectors written in the new basis; $P$ then carries old-coordinates to new-coordinates." Or: $P_{\text{new} \leftarrow \text{old}}$ — the subscripts point in the direction of the conversion.
7. Two bases of $\mathbb{R}^2$, one vector
Here is the same vector $\mathbf{v} = (4, 3)$ expressed in two different bases: the standard basis $E = \{\mathbf{e}_1, \mathbf{e}_2\}$ and a rotated, rescaled basis $B = \{\mathbf{b}_1, \mathbf{b}_2\}$ with $\mathbf{b}_1 = (2, 1)$ and $\mathbf{b}_2 = (-1, 2)$. (The two $B$-vectors are perpendicular, but each has length $\sqrt{5}$ rather than $1$.)
In the standard basis, $\mathbf{v}$ has coordinates $[\mathbf{v}]_E = (4, 3)$ — that's just what "standard" means. In the new basis, we solve $c_1 \mathbf{b}_1 + c_2 \mathbf{b}_2 = \mathbf{v}$:
$$ \begin{aligned} 2c_1 - c_2 &= 4 \\ \phantom{2}c_1 + 2c_2 &= 3 \end{aligned} \quad\Longrightarrow\quad c_1 = \frac{11}{5},\; c_2 = \frac{2}{5}. $$
So $[\mathbf{v}]_B = (11/5,\, 2/5)$. The vector hasn't moved; only the address you gave it has changed.
The change-of-basis matrix in this picture is
$$ P_{E \leftarrow B} = \begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix} \quad\Longrightarrow\quad P_{B \leftarrow E} = \frac{1}{5}\begin{pmatrix} 2 & 1 \\ -1 & 2 \end{pmatrix}, $$
and you can verify the conversion directly: $\frac{1}{5}\begin{pmatrix} 2 & 1 \\ -1 & 2 \end{pmatrix}\begin{pmatrix}4\\3\end{pmatrix} = \begin{pmatrix}11/5\\2/5\end{pmatrix}$. The vector is invariant; only its coordinates change.
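The same computation can be checked mechanically. The sketch below uses exact fractions and a hand-written $2 \times 2$ inverse; the helper names `inverse_2x2` and `mat_vec` are made up for this example.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix [[a, b], [c, d]] via the adjugate formula."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("columns are dependent, so this is not a basis")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """Matrix-vector product."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

# Columns are b1 = (2, 1) and b2 = (-1, 2), written in the standard basis E.
p_e_from_b = [[2, -1],
              [1, 2]]
p_b_from_e = inverse_2x2(p_e_from_b)

v_e = [4, 3]                       # coordinates of v in E
v_b = mat_vec(p_b_from_e, v_e)     # coordinates of v in B
back = mat_vec(p_e_from_b, v_b)    # converting back should recover v_e

print(v_b)    # [Fraction(11, 5), Fraction(2, 5)]
print(back)   # [Fraction(4, 1), Fraction(3, 1)]
```

Naming the matrices `p_e_from_b` and `p_b_from_e` mirrors the $P_{\text{new} \leftarrow \text{old}}$ subscript convention, which makes a wrong-direction product easier to spot when reading the code.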
8. The four fundamental subspaces
Every matrix $A \in M_{m \times n}(\mathbb{R})$ carries four subspaces with it — two living in $\mathbb{R}^n$ (the domain), and two in $\mathbb{R}^m$ (the codomain). Strang puts these at the center of his linear-algebra story for a reason: they answer "what can $A$ produce?" and "what does $A$ destroy?" at the same time.
| Subspace | Lives in | Definition | What it tells you |
|---|---|---|---|
| Column space $\operatorname{col}(A) = C(A)$ | $\mathbb{R}^m$ | Span of the columns of $A$. Equivalently $\{A\mathbf{x} : \mathbf{x} \in \mathbb{R}^n\}$ — the image. | Which right-hand sides $\mathbf{b}$ make $A\mathbf{x} = \mathbf{b}$ solvable. |
| Null space $\operatorname{null}(A) = N(A)$ | $\mathbb{R}^n$ | $\{\mathbf{x} : A\mathbf{x} = \mathbf{0}\}$, the kernel. | The slack in solutions: if $A\mathbf{x}_0 = \mathbf{b}$, then every solution has the form $\mathbf{x}_0 + \mathbf{n}$ with $\mathbf{n} \in \operatorname{null}(A)$. |
| Row space $\operatorname{row}(A) = C(A^\top)$ | $\mathbb{R}^n$ | Span of the rows of $A$ (viewed as vectors in $\mathbb{R}^n$). | The "shadow" of the row constraints; equals $\operatorname{null}(A)^\perp$. |
| Left null space $N(A^\top)$ | $\mathbb{R}^m$ | $\{\mathbf{y} : A^\top \mathbf{y} = \mathbf{0}\}$, i.e. $\mathbf{y}^\top A = \mathbf{0}^\top$. | Coefficient vectors $\mathbf{y}$ for which the combination of rows $\mathbf{y}^\top A$ vanishes; equals $\operatorname{col}(A)^\perp$. |
Two pairs, two ambient spaces. In $\mathbb{R}^n$, the row space and null space are orthogonal complements; in $\mathbb{R}^m$, the column space and left null space are orthogonal complements. (The orthogonality uses the standard inner product — it lives in the next chapter, but the fact is too useful to omit here.)
Think of $A$ as a function $\mathbb{R}^n \to \mathbb{R}^m$. The null space is "everything that gets squashed to zero" on the input side. The column space is "everything reachable" on the output side. The row and left-null spaces are the orthogonal complements that account for the rest.
9. Rank–nullity theorem
The dimensions of the four subspaces aren't independent — they're locked together by one of the cleanest theorems in linear algebra.
For any $m \times n$ matrix $A$,
$\dim(\operatorname{col}(A)) + \dim(\operatorname{null}(A)) = n$.
The first dimension is the rank of $A$; the second is the nullity. Together they account for every dimension on the input side.
The picture: row-reduce $A$. Every column that ends up with a pivot contributes an independent column to $\operatorname{col}(A)$, so the number of pivot columns is $\operatorname{rank}(A)$. Every non-pivot column corresponds to a free variable in solving $A\mathbf{x} = \mathbf{0}$, and the free variables parametrize $\operatorname{null}(A)$. Pivot columns plus free columns sum to $n$ — that's the theorem.
And one more identity falls out for free: $\dim(\operatorname{row}(A)) = \dim(\operatorname{col}(A)) = \operatorname{rank}(A)$. The row rank and the column rank of a matrix are always equal, a fact that has the slightly miraculous flavor of "of course they are, but why?" The cleanest proof goes through row reduction: row operations preserve the row space exactly and preserve the dimension of the column space (though not the column space itself), and after reduction both dimensions are visibly the number of pivots.
Putting it together for an $m \times n$ matrix of rank $r$:
$$ \dim(\operatorname{col}(A)) = r,\quad \dim(\operatorname{row}(A)) = r,\quad \dim(\operatorname{null}(A)) = n - r,\quad \dim(N(A^\top)) = m - r. $$
That single number $r$ controls the dimensions of all four fundamental subspaces. Once you know it, everything else is bookkeeping.
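To see the bookkeeping in action, the sketch below computes the rank by elimination and then reads off all four dimensions. It is an illustration under simple assumptions (exact `Fraction` arithmetic, small matrices); the names `rank`, `transpose`, and `four_dimensions` are chosen here. The row-space dimension is computed independently as the rank of $A^\top$, so the output also illustrates that row rank equals column rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def transpose(rows):
    return [list(col) for col in zip(*rows)]

def four_dimensions(rows):
    """Dimensions of the four fundamental subspaces of an m x n matrix."""
    m, n = len(rows), len(rows[0])
    r = rank(rows)
    return {
        "col": r,
        "row": rank(transpose(rows)),  # computed separately; should equal r
        "null": n - r,                 # from rank-nullity applied to A
        "left_null": m - r,            # from rank-nullity applied to A^T
    }

print(four_dimensions([[1, 2, 0], [0, 0, 1]]))
# {'col': 2, 'row': 2, 'null': 1, 'left_null': 0}
print(four_dimensions([[1, 2], [2, 4], [3, 6]]))
# {'col': 1, 'row': 1, 'null': 1, 'left_null': 2}
```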
10. Common pitfalls
A set like $\{(x, y) : x + y = 1\}$ looks linear (it's a straight line), but it isn't a subspace because it doesn't pass through the origin. Every subspace contains the zero vector of its parent space, no exceptions.
A linear combination is a single vector. The span is the entire set of all such vectors. Saying "the span of $\{\mathbf{v}\}$ is $2\mathbf{v}$" mixes up the two — the span is the whole line $\{t\mathbf{v} : t \in \mathbb{R}\}$.
A set of one nonzero vector $\{\mathbf{v}\}$ is independent; the zero vector all by itself is dependent (because $1 \cdot \mathbf{0} = \mathbf{0}$ is a nontrivial relation). Independence is a property of a set, not a single vector — get used to phrases like "the set $\{\mathbf{v}_1, \mathbf{v}_2\}$ is linearly independent."
If the columns of $P$ are the vectors of $B$ written in $C$-coordinates, then $P$ converts $B$-coordinates into $C$-coordinates: $[\mathbf{v}]_C = P\,[\mathbf{v}]_B$. The inverse $P^{-1}$ goes the other way. Getting the direction wrong is the single most common bug in change-of-basis problems.
Most theorems you'll meet in this topic (invariance of dimension, exchange lemma, rank–nullity) are stated and proved for finite-dimensional spaces. In infinite-dimensional spaces such as $\mathbb{R}[x]$ or $C[0,1]$, "basis" needs more care (for a space like $C[0,1]$ you need the axiom of choice to know one exists), a closed subspace of a normed space need not have a closed complement, and dimension as a finite counting number stops making sense. When you cross into function spaces, switch frameworks (norms, completeness, Hilbert/Banach structure) rather than blindly extending finite-dimensional intuition.
11. Worked examples
Example 1 · Is $W = \{(x, y, z) : x + 2y - z = 0\}$ a subspace of $\mathbb{R}^3$?
Apply the three-point test.
(i) Zero. Plug in $(0, 0, 0)$: $0 + 0 - 0 = 0$. ✓
(ii) Addition. If $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ both satisfy the equation, then adding:
$$ (x_1 + x_2) + 2(y_1 + y_2) - (z_1 + z_2) = (x_1 + 2y_1 - z_1) + (x_2 + 2y_2 - z_2) = 0 + 0 = 0. $$
So the sum also satisfies it. ✓
(iii) Scalar. For $c \in \mathbb{R}$: $cx + 2cy - cz = c(x + 2y - z) = c \cdot 0 = 0$. ✓
All three pass, so $W$ is a subspace. Geometrically it's a plane through the origin in $\mathbb{R}^3$ — and every plane through the origin is a 2D subspace.
Example 2 · Are $(1, 2, 3),\, (2, 4, 6),\, (0, 1, 0)$ linearly independent?
Eyeball check: $(2, 4, 6) = 2 \cdot (1, 2, 3)$. That's already a nontrivial linear dependence:
$$ 2 \cdot (1, 2, 3) + (-1) \cdot (2, 4, 6) + 0 \cdot (0, 1, 0) = \mathbf{0}. $$
So the three vectors are dependent. The span is at most 2-dimensional, and in fact exactly 2, since $(1, 2, 3)$ and $(0, 1, 0)$ alone are independent (no scalar multiple of one equals the other).
Example 3 · Find the coordinates of $(7, 1)$ in the basis $B = \{(1, 1), (1, -1)\}$.
Solve $a(1, 1) + b(1, -1) = (7, 1)$:
$$ \begin{aligned} a + b &= 7 \\ a - b &= 1 \end{aligned} $$
Add: $2a = 8 \Rightarrow a = 4$. Subtract: $2b = 6 \Rightarrow b = 3$.
Therefore $[(7, 1)]_B = (4, 3)$. Sanity check: $4(1, 1) + 3(1, -1) = (7, 1)$. ✓
Example 4 · Find a basis for the null space of $A = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.
Solve $A\mathbf{x} = \mathbf{0}$. From row 1: $x_1 + 2x_2 = 0 \Rightarrow x_1 = -2x_2$. From row 2: $x_3 = 0$.
The variable $x_2$ is free; write $x_2 = t$:
$$ \mathbf{x} = \begin{pmatrix} -2t \\ t \\ 0 \end{pmatrix} = t \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}. $$
So $\operatorname{null}(A) = \operatorname{span}\{(-2, 1, 0)\}$, a one-dimensional subspace of $\mathbb{R}^3$. A basis is $\{(-2, 1, 0)\}$.
Rank–nullity check: $\dim(\operatorname{col}(A))$ is the number of independent columns; columns 1 and 3 are independent and column 2 is twice column 1, so $\operatorname{rank}(A) = 2$. Indeed $2 + 1 = 3 = n$. ✓
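The free-variable recipe used in Example 4 generalizes: reduce to reduced row echelon form, then build one basis vector per free column. The sketch below is one way to code that, assuming exact `Fraction` arithmetic and a nonempty matrix; `null_space_basis` is a name chosen for this example.

```python
from fractions import Fraction

def null_space_basis(rows):
    """Basis of {x : Ax = 0}, one vector per free variable."""
    m = [[Fraction(x) for x in row] for row in rows]
    n = len(m[0])
    pivots = []  # pivot column of each pivot row, in order
    r = 0
    for col in range(n):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # free column
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][col]
        m[r] = [x / lead for x in m[r]]          # scale pivot to 1
        for i in range(len(m)):
            if i != r and m[i][col] != 0:        # clear the rest of the column
                factor = m[i][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        vec = [Fraction(0)] * n
        vec[free] = Fraction(1)                  # set this free variable to 1
        for row_index, p in enumerate(pivots):
            vec[p] = -m[row_index][free]         # solve for each pivot variable
        basis.append(vec)
    return basis

a = [[1, 2, 0],
     [0, 0, 1]]
basis = null_space_basis(a)
print(basis)  # [[Fraction(-2, 1), Fraction(1, 1), Fraction(0, 1)]]
```

The number of vectors returned equals the number of free columns, which is the nullity $n - r$ from the rank–nullity theorem.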
Example 5 · Show that the space of $2 \times 2$ symmetric matrices has dimension 3.
A symmetric $2 \times 2$ matrix has the form $\begin{pmatrix} a & b \\ b & c \end{pmatrix}$ — three free parameters $a, b, c$.
Write any such matrix as
$$ \begin{pmatrix} a & b \\ b & c \end{pmatrix} = a\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + b\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + c\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. $$
Those three matrices span the symmetric subspace (every symmetric matrix is a combination of them) and are independent (a combination equal to the zero matrix forces $a = b = c = 0$). So they form a basis, and $\dim = 3$.
Bonus: $\dim M_{2 \times 2}(\mathbb{R}) = 4$, and the symmetric matrices are a 3D subspace inside it. The "missing" dimension is the antisymmetric direction $\begin{pmatrix}0 & 1\\-1 & 0\end{pmatrix}$.
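To make the "missing dimension" concrete, any $2 \times 2$ matrix splits into a symmetric part plus a multiple of that antisymmetric matrix (the entry names $p, q, r, s$ are introduced here just for the illustration):

$$ \begin{pmatrix} p & q \\ r & s \end{pmatrix} = \begin{pmatrix} p & \tfrac{q + r}{2} \\ \tfrac{q + r}{2} & s \end{pmatrix} + \frac{q - r}{2} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. $$

Checking the off-diagonal entries: $\tfrac{q+r}{2} + \tfrac{q-r}{2} = q$ and $\tfrac{q+r}{2} - \tfrac{q-r}{2} = r$. The three symmetric basis matrices together with the one antisymmetric matrix therefore span all of $M_{2 \times 2}(\mathbb{R})$, which matches the count $3 + 1 = 4$.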
Example 6 · Use rank–nullity to predict when $A\mathbf{x} = \mathbf{b}$ has a unique solution.
For an $m \times n$ matrix, $A\mathbf{x} = \mathbf{b}$ has a unique solution (when it has any) iff $\operatorname{null}(A) = \{\mathbf{0}\}$, i.e. $\dim(\operatorname{null}(A)) = 0$. By rank–nullity,
$$ \dim(\operatorname{null}(A)) = 0 \quad\Longleftrightarrow\quad \operatorname{rank}(A) = n. $$
So you need $n$ pivot columns. Two consequences:
- If $A$ is square ($m = n$) and $\operatorname{rank}(A) = n$, the unique solution always exists — $A$ is invertible.
- If $A$ is "wide" ($n > m$), $\operatorname{rank}(A) \leq m < n$, so the null space is nontrivial and you'll have infinitely many solutions whenever solutions exist.
The framework predicts the solution count before you do any arithmetic — that's the practical payoff of the theorem.
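The prediction can be sketched as a small classifier. Uniqueness uses the criterion from this example, $\operatorname{rank}(A) = n$. For existence, the sketch relies on a standard fact not stated above: $A\mathbf{x} = \mathbf{b}$ is solvable exactly when appending $\mathbf{b}$ as an extra column does not raise the rank. The function names and the use of exact `Fraction` arithmetic are choices made for this illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def classify(a, b):
    """Predict the number of solutions of Ax = b from ranks alone."""
    n = len(a[0])
    augmented = [list(row) + [rhs] for row, rhs in zip(a, b)]
    r, r_aug = rank(a), rank(augmented)
    if r < r_aug:
        return "no solution"            # b is outside the column space
    if r == n:
        return "unique solution"        # null space is {0}
    return "infinitely many solutions"  # nontrivial null space

print(classify([[2, -1], [1, 2]], [4, 3]))        # unique solution
print(classify([[1, 2, 0], [0, 0, 1]], [1, 1]))   # infinitely many solutions
print(classify([[1, 1], [1, 1]], [1, 2]))         # no solution
```

The second call is a wide matrix ($n = 3 > m = 2$), so it can never return "unique solution", in line with the second consequence above.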