Linear Transformations

What you'll leave with

The two-line algebraic definition of "linear" — and the picture it produces.
Why every matrix is a linear transformation, and vice versa.
The four 2-D transformations you should recognize on sight: rotation, reflection, scaling, shear.
Why matrix multiplication is defined the way it is — composition forces it.
The "track the basis vectors" recipe for writing down any linear transformation as a matrix.
A feel — by play — for how the four matrix entries push the grid around, and what makes the determinant flip sign or vanish.

1. What "linear" means

Linear transformation

A function $T : \mathbb{R}^n \to \mathbb{R}^m$ is linear if it respects both vector addition and scalar multiplication:

$$ T(\vec{u} + \vec{v}) = T(\vec{u}) + T(\vec{v}) \qquad \text{and} \qquad T(c\,\vec{v}) = c\,T(\vec{v}) $$

for all vectors $\vec{u}, \vec{v}$ and all scalars $c$.

The two conditions are usually rolled into one: $T$ is linear iff $T(a\vec{u} + b\vec{v}) = a\,T(\vec{u}) + b\,T(\vec{v})$. In words — you can distribute $T$ across linear combinations.
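To see why the single condition carries the same information as the pair, here is a short derivation in both directions. It uses nothing beyond the two defining properties above. Going forward, apply additivity and then scaling:

$$ T(a\vec{u} + b\vec{v}) = T(a\vec{u}) + T(b\vec{v}) = a\,T(\vec{u}) + b\,T(\vec{v}). $$

Going back, set $a = b = 1$ in the combined condition to recover additivity, and set $b = 0$ to recover $T(a\vec{u}) = a\,T(\vec{u})$.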

The picture is even better than the algebra. A linear transformation is anything you can do to the plane that keeps three rules in force:

Grid lines stay straight. No bending, no curving.
Grid lines stay parallel and evenly spaced. A grid maps to a (possibly skewed, scaled, rotated) grid.
The origin stays fixed. $T(\vec{0}) = \vec{0}$, always.

Anything else — folding paper, stretching one direction nonlinearly, shifting everything to the right by 3 — breaks the rules. The grid has to flex as one piece, anchored at the origin.

[Figure: the original grid with standard basis $\vec{e}_1 = (1,0)$, $\vec{e}_2 = (0,1)$, shown next to the same grid after applying $T$ (a shear). The grid stays a grid, the origin stays put, and the unit square becomes a parallelogram.]

Mental model

Don't picture a function as "a formula that eats numbers." Picture it as a motion of the plane. Pick up the entire transparent grid sheet, deform it (without tearing, folding, or unpinning the origin) until your two basis arrows point wherever you like — that motion is your linear transformation. Everything else in this topic is bookkeeping for that picture.

2. Every matrix gives a linear transformation

Here is the bridge between matrices and motion. Given an $m \times n$ matrix $A$, define

$$ T_A(\vec{x}) = A\vec{x}. $$

This is a function from $\mathbb{R}^n$ to $\mathbb{R}^m$, and it is automatically linear — matrix multiplication distributes over vector addition and scalar multiplication:

$$ A(\vec{u} + \vec{v}) = A\vec{u} + A\vec{v}, \qquad A(c\vec{v}) = c(A\vec{v}). $$

The converse — every linear transformation comes from a matrix — is the more surprising direction. It works because a linear $T$ is determined by very little information.

Write any $\vec{x} \in \mathbb{R}^n$ in standard-basis coordinates: $\vec{x} = x_1 \vec{e}_1 + x_2 \vec{e}_2 + \cdots + x_n \vec{e}_n$. Apply $T$ and unpack with linearity:

$$ T(\vec{x}) = x_1 T(\vec{e}_1) + x_2 T(\vec{e}_2) + \cdots + x_n T(\vec{e}_n). $$

So $T$ is completely determined by what it does to the $n$ basis vectors. Stack those $n$ output vectors as columns and you have the matrix:

$$ A = \Big[\; T(\vec{e}_1)\;\Big|\;T(\vec{e}_2)\;\Big|\;\cdots\;\Big|\;T(\vec{e}_n)\;\Big]. $$

And then $T(\vec{x}) = A\vec{x}$, by construction.
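The construction above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a library-grade routine: the helper names `matrix_from_map` and `apply`, and the sample map `T` from $\mathbb{R}^3$ to $\mathbb{R}^2$, are hypothetical and chosen only for the demonstration.

```python
def matrix_from_map(T, n):
    """Return the matrix of T as a list of rows; column j is T(e_j)."""
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]  # j-th standard basis vector
        columns.append(list(T(e_j)))
    m = len(columns[0])
    # Transpose: the images are columns, but the matrix is stored row by row.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Hypothetical example map R^3 -> R^2, chosen only for illustration.
def T(v):
    x, y, z = v
    return [x + 2 * y, 3 * z - y]


A = matrix_from_map(T, 3)
x = [4, -1, 5]
print(A)            # [[1, 2, 0], [0, -1, 3]]
print(apply(A, x))  # [2, 16]
print(T(x))         # [2, 16]
```

The last two lines agree, which is the claim $T(\vec{x}) = A\vec{x}$ checked on one input.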

Linear transformations and matrices are the same objects spoken in two dialects. Matrices are how you compute; transformations are what is happening.

3. The standard 2-D zoo

Four 2-D transformations show up so often that you should recognize their matrices at a glance. Each does exactly one thing to the unit square.

Rotation by angle $\theta$ (counter-clockwise)

$$ R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \phantom{-}\cos\theta \end{pmatrix} $$

The columns are where $\vec{e}_1$ and $\vec{e}_2$ land: $\vec{e}_1 = (1,0)$ rotates to $(\cos\theta, \sin\theta)$, $\vec{e}_2 = (0,1)$ rotates to $(-\sin\theta, \cos\theta)$. The whole plane spins rigidly about the origin.

Reflection across the x-axis

$$ F_x = \begin{pmatrix} 1 & \phantom{-}0 \\ 0 & -1 \end{pmatrix} $$

$\vec{e}_1$ doesn't move; $\vec{e}_2$ flips down. Reflection across the y-axis is the mirror version: $\operatorname{diag}(-1, 1)$.

Uniform scaling by factor $s$

$$ S_s = \begin{pmatrix} s & 0 \\ 0 & s \end{pmatrix} = s\,I $$

Every vector is stretched (or shrunk) by the same factor. Non-uniform scaling $\operatorname{diag}(s_1, s_2)$ stretches the two axes independently — useful when one direction matters more than the other.

Horizontal shear by $k$

$$ H_k = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} $$

$\vec{e}_1$ stays put; $\vec{e}_2$ slides $k$ units horizontally (to the right when $k > 0$) and stays at height $1$. The grid stays a grid, but the right angle between the axes is broken; that is the diagram you saw in §1.

Transformation	Matrix	Where $\vec{e}_1$ lands	Where $\vec{e}_2$ lands
Rotation by $\theta$	$R_\theta$	$(\cos\theta, \sin\theta)$	$(-\sin\theta, \cos\theta)$
Reflection (x-axis)	$F_x$	$(1, 0)$	$(0, -1)$
Scaling by $s$	$S_s$	$(s, 0)$	$(0, s)$
Horizontal shear by $k$	$H_k$	$(1, 0)$	$(k, 1)$

Read the columns

Every entry in those matrices is a coordinate of $T(\vec{e}_1)$ or $T(\vec{e}_2)$. Once you internalize that, you stop memorizing the matrices and derive them on the fly.
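As a quick way to practice reading columns, the four matrices can be written as small Python functions and their columns printed. This is a sketch under a few assumptions: the function names are made up for illustration, angles are in radians, and the parameter values ($90°$, $s = 3$, $k = 2$) are arbitrary.

```python
import math


def rotation(theta):
    """Counter-clockwise rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def reflection_x():
    return [[1, 0], [0, -1]]


def scaling(s):
    return [[s, 0], [0, s]]


def shear(k):
    return [[1, k], [0, 1]]


def column(A, j):
    """Column j of A, i.e. the image of the j-th standard basis vector."""
    return [row[j] for row in A]


for name, A in [
    ("rotation by 90 degrees", rotation(math.pi / 2)),
    ("reflection (x-axis)", reflection_x()),
    ("scaling by 3", scaling(3)),
    ("horizontal shear by 2", shear(2)),
]:
    # Round so floating-point noise from cos(pi/2) does not clutter the output.
    e1 = [round(c, 10) for c in column(A, 0)]
    e2 = [round(c, 10) for c in column(A, 1)]
    print(f"{name}: e1 -> {e1}, e2 -> {e2}")
```

Each printed pair should match the corresponding row of the table above, up to floating-point rounding in the rotation.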

4. Composition is matrix multiplication

Apply transformation $A$, then transformation $B$. What's the combined effect?

$$ (B \circ A)(\vec{x}) = B(A\vec{x}) = (BA)\,\vec{x}. $$

The combined transformation is given by the matrix product $BA$ — with $B$ on the left, because $A$ acts first. This is the entire reason matrix multiplication is defined the way it is. Someone could have defined $A \times B$ to mean something else; nobody did, because then composing transformations would not match composing matrices, and the whole edifice would crack.

Concretely, let's compose a $90°$ rotation with a reflection across the x-axis, in both orders:

$$ R_{90°} = \begin{pmatrix} 0 & -1 \\ 1 & \phantom{-}0 \end{pmatrix}, \qquad F_x = \begin{pmatrix} 1 & \phantom{-}0 \\ 0 & -1 \end{pmatrix}. $$

Rotate first, then reflect ($F_x \circ R_{90°}$):

$$ F_x R_{90°} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & \phantom{-}0 \end{pmatrix} = \begin{pmatrix} \phantom{-}0 & -1 \\ -1 & \phantom{-}0 \end{pmatrix}. $$

Reflect first, then rotate ($R_{90°} \circ F_x$):

$$ R_{90°} F_x = \begin{pmatrix} 0 & -1 \\ 1 & \phantom{-}0 \end{pmatrix} \begin{pmatrix} 1 & \phantom{-}0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. $$

Different matrices. Different transformations. That's matrix multiplication telling you, correctly, that the order of geometric operations matters.
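The two products can be reproduced with a hand-rolled matrix multiply. The helpers `matmul` and `apply` below are illustrative names, not part of any library, and the test vector $(3, 1)$ is an arbitrary choice.

```python
def matmul(B, A):
    """Matrix product BA: apply A first, then B."""
    return [
        [sum(B[i][k] * A[k][j] for k in range(len(A))) for j in range(len(A[0]))]
        for i in range(len(B))
    ]


def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


R90 = [[0, -1], [1, 0]]   # rotation by 90 degrees counter-clockwise
Fx = [[1, 0], [0, -1]]    # reflection across the x-axis

rotate_then_reflect = matmul(Fx, R90)
reflect_then_rotate = matmul(R90, Fx)
print(rotate_then_reflect)  # [[0, -1], [-1, 0]]
print(reflect_then_rotate)  # [[0, 1], [1, 0]]

# Applying the two steps one at a time agrees with the single product matrix.
v = [3, 1]
assert apply(Fx, apply(R90, v)) == apply(rotate_then_reflect, v)
assert apply(R90, apply(Fx, v)) == apply(reflect_then_rotate, v)
```

The two assertions are the point: doing the steps in sequence and multiplying once by the product give the same vector, and swapping the order changes the answer.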

Read right-to-left

In $BA\vec{x}$, the matrix nearest the vector is the one applied first. "First $A$, then $B$" reads from right to left, like function composition. Drilling this until it feels automatic will save you hours of debugging later.

5. The columns tell you everything

If you take only one operational lesson from this page, take this one.

To find the matrix of a linear transformation $T$, apply $T$ to each standard basis vector and write the results as the columns.

Why it works was already laid out in §2: a linear transformation is determined by where it sends the basis, and those images are the columns of the matrix.

So you almost never need to compute matrix entries directly — you compute basis images.

Example: rotation by $90°$ from scratch

You don't have to remember $R_\theta$. Just rotate $\vec{e}_1$ and $\vec{e}_2$ by hand.

$\vec{e}_1 = (1,0)$ rotated $90°$ counter-clockwise lands at $(0,1)$.
$\vec{e}_2 = (0,1)$ rotated $90°$ counter-clockwise lands at $(-1,0)$.

Stack as columns:

$$ R_{90°} = \begin{pmatrix} 0 & -1 \\ 1 & \phantom{-}0 \end{pmatrix}. $$

You just rederived the rotation matrix. No formula required.

Example: projection onto the x-axis

What's the matrix of the transformation that drops every vector vertically onto the x-axis?

$\vec{e}_1 = (1,0)$ is already on the x-axis. Image: $(1, 0)$.
$\vec{e}_2 = (0,1)$ collapses to the origin. Image: $(0, 0)$.

$$ P = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. $$

Notice $P$ is singular — it squashes the plane down to a line, so it cannot be inverted. The matrix knew that before you did.
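A small check makes the lost information visible. The sketch below assumes the standard $2 \times 2$ determinant formula $ad - bc$, which this page uses but does not spell out. The helper names and the two sample inputs are hypothetical.

```python
def det2(A):
    """Determinant of a 2x2 matrix, using the standard formula ad - bc."""
    (a, b), (c, d) = A
    return a * d - b * c


def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


P = [[1, 0], [0, 0]]  # projection onto the x-axis

print(det2(P))            # 0
print(apply(P, [2, 5]))   # [2, 0]
print(apply(P, [2, -7]))  # [2, 0]
```

Two different inputs land on the same output, so no inverse map could tell them apart afterwards.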

6. Kernel, image, and rank–nullity

Once you have a linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$, two subspaces deserve names. One lives in the input space and asks "what gets crushed?"; the other lives in the output space and asks "what can be reached?". Together they obey a conservation law that ties dimensions on both sides.

Kernel (null space)

The kernel of $T$ is the set of input vectors that $T$ sends to the zero vector:

$$ \ker(T) = \{\,\vec{x} \in \mathbb{R}^n : T(\vec{x}) = \vec{0}\,\}. $$

For a matrix transformation $T(\vec{x}) = A\vec{x}$, this is exactly the null space of $A$: the solution set of $A\vec{x} = \vec{0}$. It is always a subspace of the domain.

Image (range)

The image of $T$ is the set of output vectors $T$ can actually produce:

$$ \operatorname{im}(T) = \{\,T(\vec{x}) : \vec{x} \in \mathbb{R}^n\,\}. $$

For $T(\vec{x}) = A\vec{x}$, this is the column space of $A$ — the span of the columns. It is always a subspace of the codomain.

The two dimensions get their own names:

Rank of $T$ is $\dim \operatorname{im}(T)$ — how many independent directions survive on the output side.
Nullity of $T$ is $\dim \ker(T)$ — how many independent directions get collapsed to zero.

The rank also equals the number of linearly independent columns of the matrix and, because row rank always equals column rank, the number of linearly independent rows.

Rank–nullity theorem

For any linear $T : \mathbb{R}^n \to \mathbb{R}^m$,

$$ \operatorname{rank}(T) + \operatorname{nullity}(T) = n, $$

where $n$ is the dimension of the domain. Every input dimension is accounted for: either it stretches out to contribute a new direction of output (rank), or it gets squashed to zero (nullity).

Reading rank and nullity off the standard zoo

Look at the 2-D maps from §3 and the projection from §5 through this new lens, with the zero map added as the extreme case.

Transformation	Matrix	$\dim \ker$	$\operatorname{rank}$	Check: $\dim \ker + \operatorname{rank} = n$
Rotation by $\theta$	$R_\theta$	$0$	$2$	$0 + 2 = 2$ ✓
Reflection (x-axis)	$F_x$	$0$	$2$	$0 + 2 = 2$ ✓
Uniform scaling by $s \neq 0$	$sI$	$0$	$2$	$0 + 2 = 2$ ✓
Projection onto x-axis	$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$	$1$	$1$	$1 + 1 = 2$ ✓
Zero map	$\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$	$2$	$0$	$2 + 0 = 2$ ✓

The projection is the interesting one: it collapses the y-direction (nullity 1 — the entire y-axis maps to $\vec{0}$) and keeps the x-direction alive (rank 1 — the image is the x-axis). The accounting balances.

When can you invert?

A square linear transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ is invertible iff $\ker(T) = \{\vec{0}\}$ iff $\operatorname{rank}(T) = n$ iff its matrix has nonzero determinant. The three conditions are restatements of one another. Anything that crushes a direction (any vector in the kernel besides $\vec{0}$) loses information you can never recover.

A worked computation

Take $A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$. The two columns are $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} 2 \\ 4 \end{pmatrix} = 2\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ — the second is twice the first. The column space is a one-dimensional line, so $\operatorname{rank}(A) = 1$.

For the kernel, solve $A\vec{x} = \vec{0}$: both rows give $x_1 + 2x_2 = 0$, so $\vec{x} = t\begin{pmatrix} -2 \\ \phantom{-}1 \end{pmatrix}$ for any $t \in \mathbb{R}$. That's a one-dimensional line of solutions, so $\operatorname{nullity}(A) = 1$.

Check: $\operatorname{rank} + \operatorname{nullity} = 1 + 1 = 2 = n$. ✓
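The same computation can be sketched in code. The approach here, Gaussian elimination with exact fractions, is one reasonable way to count pivots and is brought in for illustration; it is not a method the text prescribes. The nullity is obtained from rank–nullity itself, and the kernel vector found above is then checked directly.

```python
from fractions import Fraction


def rank(A):
    """Rank of A via Gaussian elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        # Eliminate column c from every row below the pivot row.
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[1, 2], [2, 4]]
n = len(A[0])             # dimension of the domain
rk = rank(A)
nullity = n - rk          # rank-nullity, used here to compute the nullity

print(rk, nullity)        # 1 1
print(apply(A, [-2, 1]))  # [0, 0]
```

Running `rank` on the matrices in the table above should reproduce the rank column there.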

8. Common pitfalls

Composition order is right-to-left

"First $A$, then $B$" is the matrix product $BA$, not $AB$. Forgetting this is the single most common source of wrong answers in early linear algebra. When in doubt, slow down and trace what each matrix does to $\vec{e}_1$ — the visual check is fast and unambiguous.

Translation is not linear

The map $T(\vec{x}) = \vec{x} + \vec{b}$ (with $\vec{b} \neq \vec{0}$) looks harmless — it just shifts everything. But it moves the origin, so it fails the linearity test instantly: $T(\vec{0}) = \vec{b} \neq \vec{0}$. Translations are affine, not linear. The standard trick is homogeneous coordinates: pad vectors with a $1$ and translations become linear in $\mathbb{R}^{n+1}$.
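To make the homogeneous-coordinates trick concrete, here is a minimal sketch for $n = 2$. The helper names and the shift $\vec{b} = (3, 0)$ are assumptions chosen for illustration; the shift echoes the "shift right by 3" example from §1.

```python
def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def translation_matrix(bx, by):
    """3x3 matrix that translates 2-D points written as (x, y, 1)."""
    return [
        [1, 0, bx],
        [0, 1, by],
        [0, 0, 1],
    ]


M = translation_matrix(3, 0)   # shift everything right by 3

point = [2, 5, 1]              # the 2-D point (2, 5), padded with a 1
print(apply(M, point))         # [5, 5, 1]

origin = [0, 0, 1]             # the 2-D origin, padded with a 1
print(apply(M, origin))        # [3, 0, 1]

# The zero vector of R^3 is still fixed, as linearity requires.
print(apply(M, [0, 0, 0]))     # [0, 0, 0]
```

The padded 2-D origin $(0, 0, 1)$ moves, but it is not the zero vector of $\mathbb{R}^3$, so the $3 \times 3$ map breaks no linearity rule.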

If $T(\vec{0}) \neq \vec{0}$, $T$ is not linear

This is the fastest sanity check. Linearity forces $T(\vec{0}) = T(0 \cdot \vec{0}) = 0 \cdot T(\vec{0}) = \vec{0}$. If the function shifts the origin anywhere else, stop — it cannot be linear, no matter what else it does.

"Linear function" from high school isn't linear here

The high-school "linear function" $f(x) = mx + b$ is affine in the linear-algebra sense unless $b = 0$. The constant term moves the origin: $f(0) = b$. The two communities reuse the word and mean different things.

9. Worked examples

Each one is a self-contained drill. Try it cold before reading the solution.

Example 1 · Apply a rotation matrix to a vector

Rotate $\vec{v} = (3, 1)$ by $90°$ counter-clockwise.

Step 1. Write the rotation matrix:

$$ R_{90°} = \begin{pmatrix} 0 & -1 \\ 1 & \phantom{-}0 \end{pmatrix}. $$

Step 2. Multiply:

$$ R_{90°}\vec{v} = \begin{pmatrix} 0 & -1 \\ 1 & \phantom{-}0 \end{pmatrix} \begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 0\cdot 3 + (-1)\cdot 1 \\ 1\cdot 3 + 0\cdot 1 \end{pmatrix} = \begin{pmatrix} -1 \\ \phantom{-}3 \end{pmatrix}. $$

Sanity check. $(3,1)$ lies in the first quadrant, at a shallow angle just above the positive x-axis; a $90°$ CCW rotation should land it in the upper-left quadrant, just left of the positive y-axis. $(-1, 3)$ does. ✓

Example 2 · Find the matrix of a known transformation

Let $T$ reflect every vector across the line $y = x$. Find its matrix.

Step 1. Track $\vec{e}_1 = (1, 0)$. Reflecting $(1,0)$ across $y = x$ swaps coordinates: $T(\vec{e}_1) = (0, 1)$.

Step 2. Track $\vec{e}_2 = (0, 1)$. Same swap: $T(\vec{e}_2) = (1, 0)$.

Step 3. Stack as columns:

$$ A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. $$

Verify. $A \begin{pmatrix} 3 \\ 7 \end{pmatrix} = \begin{pmatrix} 7 \\ 3 \end{pmatrix}$. Coordinates swapped — exactly what reflection across $y = x$ should do. ✓

Example 3 · Verify a function is (or isn't) linear

Decide whether each $T : \mathbb{R}^2 \to \mathbb{R}^2$ is linear.

(a) $T(x, y) = (2x - y,\; 3x + 4y)$.

Check the origin: $T(0, 0) = (0, 0)$. ✓

Each output coordinate is a linear combination of $x$ and $y$ with no constant term. Both addition and scaling distribute term-by-term, so $T$ is linear. In fact its matrix is $\begin{pmatrix} 2 & -1 \\ 3 & \phantom{-}4 \end{pmatrix}$ — just read off the coefficients.

(b) $T(x, y) = (x + 1,\; y)$.

Check the origin: $T(0, 0) = (1, 0) \neq (0, 0)$. Not linear — it's a translation in disguise. Stop here; no further work needed.

(c) $T(x, y) = (xy,\; y)$.

Origin check passes: $T(0, 0) = (0, 0)$. But scaling fails: $T(2 \cdot (1, 1)) = T(2, 2) = (4, 2)$, while $2 \cdot T(1, 1) = 2 \cdot (1, 1) = (2, 2)$. Different. Not linear — the product $xy$ ruins it.
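The three verdicts can be spot-checked numerically. The sketch below tests the combined condition $T(a\vec{u} + b\vec{v}) = a\,T(\vec{u}) + b\,T(\vec{v})$ on random integer inputs. The function name `looks_linear`, the trial count, and the seed are arbitrary choices. A pass is evidence rather than proof, while a single failure is a genuine counterexample.

```python
import random


def looks_linear(T, trials=200, seed=0):
    """Return False if a counterexample to linearity is found, else True.

    Passing this check is evidence, not proof: only finitely many
    integer inputs are tried.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        u = (rng.randint(-9, 9), rng.randint(-9, 9))
        v = (rng.randint(-9, 9), rng.randint(-9, 9))
        a, b = rng.randint(-9, 9), rng.randint(-9, 9)
        combo = (a * u[0] + b * v[0], a * u[1] + b * v[1])
        Tu, Tv = T(*u), T(*v)
        expected = (a * Tu[0] + b * Tv[0], a * Tu[1] + b * Tv[1])
        if T(*combo) != expected:
            return False
    return True


def T_a(x, y):
    return (2 * x - y, 3 * x + 4 * y)


def T_b(x, y):
    return (x + 1, y)


def T_c(x, y):
    return (x * y, y)


print(looks_linear(T_a))  # True
print(looks_linear(T_b))  # False
print(looks_linear(T_c))  # False
```

Integer inputs keep the comparison exact, so no floating-point tolerance is needed.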

Example 4 · Compose two transformations

Let $A$ scale by $2$ and $B$ rotate by $90°$ CCW. Find the matrix for "scale, then rotate".

Step 1. Identify the matrices.

$$ A = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & -1 \\ 1 & \phantom{-}0 \end{pmatrix}. $$

Step 2. "Scale then rotate" applies $A$ first, so the composition is $B A$ (right-to-left order):

$$ BA = \begin{pmatrix} 0 & -1 \\ 1 & \phantom{-}0 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 0 & -2 \\ 2 & \phantom{-}0 \end{pmatrix}. $$

Step 3. Sanity check by tracking $\vec{e}_1 = (1, 0)$: scaling by 2 gives $(2, 0)$, then rotating gives $(0, 2)$. The first column of $BA$ is $(0, 2)$. ✓

Example 5 · Reverse a transformation by finding its inverse

Let $A$ shear the plane by $k = 3$: $A = \begin{pmatrix} 1 & 3 \\ 0 & 1 \end{pmatrix}$. Find $A^{-1}$, the transformation that undoes the shear.

Step 1. Geometric reasoning. The shear pushed $\vec{e}_2 = (0, 1)$ to $(3, 1)$. To undo it, the inverse must push $(3, 1)$ back to $(0, 1)$, that is, slide every point at height $1$ back by $3$ units. That is a shear with $k = -3$.

Step 2. Predicted inverse:

$$ A^{-1} = \begin{pmatrix} 1 & -3 \\ 0 & \phantom{-}1 \end{pmatrix}. $$

Step 3. Verify by multiplying:

$$ A A^{-1} = \begin{pmatrix} 1 & 3 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & -3 \\ 0 & \phantom{-}1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I. \;\checkmark $$

In general, $\begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & -k \\ 0 & \phantom{-}1 \end{pmatrix}$. The inverse of a shear is the opposite shear — exactly what the picture predicted.
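The geometric guess can also be cross-checked against a general-purpose formula. The sketch below uses the standard adjugate formula for a $2 \times 2$ inverse. That formula is brought in here for comparison and is not derived on this page, and the helper names are hypothetical.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Inverse of an integer 2x2 matrix via the adjugate formula; raises if singular."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, so no inverse exists")
    return [
        [Fraction(d, det), Fraction(-b, det)],
        [Fraction(-c, det), Fraction(a, det)],
    ]


def matmul(B, A):
    """Matrix product BA."""
    return [
        [sum(B[i][k] * A[k][j] for k in range(len(A))) for j in range(len(A[0]))]
        for i in range(len(B))
    ]


A = [[1, 3], [0, 1]]  # horizontal shear with k = 3
A_inv = inverse_2x2(A)
print(A_inv == [[1, -3], [0, 1]])            # True
print(matmul(A, A_inv) == [[1, 0], [0, 1]])  # True
```

The formula and the picture agree: the computed inverse is the shear with $k = -3$.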

Sources & further reading

The content above is synthesized from established linear-algebra references. When something here reads ambiguously, the primary sources are ground truth; the visual resource is where to turn for the geometric soul of the subject.

Linear map · Encyclopedia · Wikipedia

Broad overview that treats linear maps between arbitrary vector spaces, not just $\mathbb{R}^n \to \mathbb{R}^m$. Useful for placing this topic in the wider mathematical landscape and finding the abstract definition that subsumes the matrix one.

Linear Transformation · Reference · Wolfram MathWorld

Formal, dense, precise. The right place to look when you want the definition stated in the working language of professional mathematicians, alongside related concepts like kernel and image.

Matrix transformations · Tutorial · Khan Academy · Linear Algebra

Video lessons and drill problems covering the matrix-of-a-transformation recipe, composition, and the standard 2-D maps. Best if you want to practice the mechanics until they feel automatic.

Essence of Linear Algebra · Video · 3Blue1Brown · Grant Sanderson

The visual companion this topic was practically written around. Episodes 3 ("Linear transformations and matrices") and 4 ("Matrix multiplication as composition") make the geometric story unmistakable; watch them once and the algebra will never feel arbitrary again.