1. What a matrix is
A matrix is a rectangular array of numbers arranged in rows and columns. A matrix with $m$ rows and $n$ columns is called an $m \times n$ matrix: rows first, columns second, always.
The convention is to name matrices with capital letters: $A$, $B$, $M$. The number sitting in row $i$, column $j$ of a matrix $A$ is called the $(i,j)$ entry, written $a_{ij}$ — lowercase letter, row index first, column index second.
For example, the matrix
$$ A = \begin{pmatrix} 2 & -1 & 0 \\ 3 & 4 & 5 \end{pmatrix} $$
is a $2 \times 3$ matrix: two rows, three columns. Its entries are $a_{11} = 2$, $a_{12} = -1$, $a_{13} = 0$, $a_{21} = 3$, $a_{22} = 4$, $a_{23} = 5$. The first index always picks a row; the second always picks a column. Mix them up and you reference a different entry, or one that doesn't exist at all: this $A$ has an $a_{13}$ but no $a_{31}$.
You'll see matrices written with parentheses $\begin{pmatrix} \cdots \end{pmatrix}$ or square brackets $\begin{bmatrix} \cdots \end{bmatrix}$. They mean exactly the same thing — pick one and stay consistent. This page uses parentheses.
A $1 \times n$ matrix is a row vector; an $m \times 1$ matrix is a column vector. Vectors aren't a separate species — they're just matrices with one of the dimensions equal to 1, and every rule on this page applies to them.
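The notation above can be mirrored in a few lines of Python. This is only a sketch: storing a matrix as a list of rows is one possible choice, and the helper names `shape` and `entry` are made up for illustration. Note that the text counts rows and columns from 1, while Python lists count from 0.

```python
# A matrix stored as a list of rows (an assumption of this sketch).
A = [
    [2, -1, 0],
    [3, 4, 5],
]

def shape(M):
    """Return (rows, columns) of a non-empty list-of-rows matrix."""
    return len(M), len(M[0])

def entry(M, i, j):
    """Return the (i, j) entry using the 1-based indices of the text."""
    return M[i - 1][j - 1]

print(shape(A))        # (2, 3)
print(entry(A, 1, 2))  # -1
print(entry(A, 2, 1))  # 3

# Row and column vectors are just matrices with one dimension equal to 1.
print(shape([[1, 2, 3]]))      # (1, 3)
print(shape([[1], [2], [3]]))  # (3, 1)
```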
2. Addition and scalar multiplication
The friendly operations first. Both work entry-wise — you do the operation independently at each $(i,j)$ position, and the result is a new matrix the same shape as the input.
Addition
If $A$ and $B$ are both $m \times n$, then $A + B$ is the $m \times n$ matrix with
$$ (A + B)_{ij} \;=\; a_{ij} + b_{ij}. $$
Concretely:
$$ \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} + \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} = \begin{pmatrix} 6 & 8 \\ 10 & 12 \end{pmatrix}. $$
The shapes must match. You cannot add a $2 \times 3$ matrix to a $3 \times 2$ matrix, even though both contain six numbers, and you cannot add it to a $2 \times 2$ matrix either. The grid has to line up.
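As a rough sketch of the addition rule, assuming matrices are stored as lists of rows (the function names `shape` and `add` are illustrative, not from any library):

```python
def shape(M):
    """Return (rows, columns) of a non-empty list-of-rows matrix."""
    return len(M), len(M[0])

def add(A, B):
    """Entry-wise sum; raises ValueError when the shapes differ."""
    if shape(A) != shape(B):
        raise ValueError(f"cannot add {shape(A)} and {shape(B)} matrices")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

print(add([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[6, 8], [10, 12]]

# A 2 x 3 and a 3 x 2 both hold six numbers, but the grids do not line up.
try:
    add([[1, 2, 3], [4, 5, 6]], [[1, 2], [3, 4], [5, 6]])
except ValueError as err:
    print("undefined:", err)
```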
Scalar multiplication
Multiplying a matrix $A$ by a number (a scalar) $c$ scales every entry:
$$ (cA)_{ij} \;=\; c \cdot a_{ij}. $$
For example,
$$ 3 \cdot \begin{pmatrix} 1 & -2 \\ 0 & 4 \end{pmatrix} = \begin{pmatrix} 3 & -6 \\ 0 & 12 \end{pmatrix}. $$
If you already have the feel for vector addition and scaling, matrix addition and scaling are exactly the same operations, just on a 2-D grid instead of a 1-D list. Addition is commutative ($A + B = B + A$) and associative, and scalar multiplication distributes over it: $c(A + B) = cA + cB$. The interesting break with intuition is coming up in the next section.
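A small sketch that checks these rules numerically on one pair of matrices. The helpers `scale` and `add` are illustrative names, and a check on one example is of course not a proof:

```python
def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]

def add(A, B):
    """Entry-wise sum of two same-shape matrices."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

A = [[1, -2], [0, 4]]
B = [[5, 6], [7, 8]]

print(scale(3, A))                                           # [[3, -6], [0, 12]]
print(add(A, B) == add(B, A))                                # True
print(scale(3, add(A, B)) == add(scale(3, A), scale(3, B)))  # True
```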
3. Matrix multiplication
Here is where matrices stop behaving like grids of numbers and start behaving like something else entirely. The definition looks bizarre on first contact — and it is, until you see why it's that way in the next topic. For now, the rule:
If $A$ is $m \times n$ and $B$ is $n \times p$, then the product $AB$ is the $m \times p$ matrix whose $(i,j)$ entry is the dot product of row $i$ of $A$ with column $j$ of $B$:
$$ (AB)_{ij} \;=\; \sum_{k=1}^{n} a_{ik}\, b_{kj}. $$
Two things to absorb from that definition. First, the shape constraint: the inner dimensions must match. The number of columns of $A$ has to equal the number of rows of $B$; otherwise the row and column you're trying to dot don't have the same length, and the operation is undefined. Second, the outer dimensions become the shape of the result.
$$ \underbrace{A}_{m \times n} \;\cdot\; \underbrace{B}_{n \times p} \;=\; \underbrace{AB}_{m \times p}. $$
Read the shapes as a chain: the two $n$'s in the middle have to agree, then they cancel, and what's left on the outside is the shape of the answer. A $2 \times 3$ times a $3 \times 4$ gives a $2 \times 4$. A $5 \times 2$ times a $2 \times 1$ gives a $5 \times 1$. A $2 \times 3$ times a $4 \times 5$ gives nothing: the product is undefined.
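The shape chain can be written as a tiny function. This is a sketch; `product_shape` is a hypothetical name, and returning `None` for an undefined product is just one convention:

```python
def product_shape(shape_a, shape_b):
    """Return the shape of AB, or None when the inner dimensions disagree."""
    m, n = shape_a
    n2, p = shape_b
    return (m, p) if n == n2 else None

print(product_shape((2, 3), (3, 4)))  # (2, 4)
print(product_shape((5, 2), (2, 1)))  # (5, 1)
print(product_shape((2, 3), (4, 5)))  # None
```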
The row-times-column pattern
To compute a single entry of the product, you slide row $i$ of $A$ along column $j$ of $B$, multiply pair-by-pair, and add. The diagram below shows the geometry of one entry.
A worked numerical version: let
$$ A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad B = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}. $$
Both are $2 \times 2$, so the inner dimensions match and the product is $2 \times 2$. Computing each entry:
$$ \begin{aligned} (AB)_{11} &= 1 \cdot 5 + 2 \cdot 7 = 19, \\ (AB)_{12} &= 1 \cdot 6 + 2 \cdot 8 = 22, \\ (AB)_{21} &= 3 \cdot 5 + 4 \cdot 7 = 43, \\ (AB)_{22} &= 3 \cdot 6 + 4 \cdot 8 = 50, \end{aligned} $$
so
$$ AB = \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}. $$
It is tempting, really tempting, to look at $AB$ and define it as the matrix whose $(i,j)$ entry is $a_{ij} \cdot b_{ij}$. That operation exists (it's called the Hadamard product), but it is not what $AB$ means in standard notation. The "row times column" rule is the one that does useful work everywhere else in linear algebra. Memorize the difference now.
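To make the contrast concrete, here is a minimal sketch of both operations on lists of rows. The names `matmul` and `hadamard` are chosen for this example, and the code favors readability over speed:

```python
def matmul(A, B):
    """Row-times-column product: entry (i, j) is row i of A dotted with column j of B."""
    n = len(B)  # rows of B, which must equal the columns of A
    if any(len(row) != n for row in A):
        raise ValueError("inner dimensions do not match")
    p = len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(p)] for row in A]

def hadamard(A, B):
    """Entry-wise product of two same-shape matrices (not what AB means)."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(matmul(A, B))    # [[19, 22], [43, 50]]
print(hadamard(A, B))  # [[5, 12], [21, 32]]
```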
4. Multiplication is not commutative
For ordinary numbers, $3 \cdot 5 = 5 \cdot 3$. For matrices, this familiar fact fails. In general,
$$ AB \neq BA. $$
This breaks intuition in two distinct ways, and you have to internalize both.
One product may exist while the other doesn't
If $A$ is $2 \times 3$ and $B$ is $3 \times 4$, then $AB$ is a $2 \times 4$ matrix — perfectly fine. But $BA$ would need the columns of $B$ (four of them) to match the rows of $A$ (two), and four does not equal two. So $BA$ is undefined. One direction works; the other doesn't even make sense.
Even when both are defined, they're usually different
When both $A$ and $B$ are square and the same size — say both $2 \times 2$ — both products $AB$ and $BA$ are defined and have the same shape. They are still, almost always, different matrices. A small example:
$$ A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. $$
$$ AB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad BA = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. $$
Same two matrices, opposite order, and not just different numbers but different patterns. The order matters all the way down.
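Both failure modes can be reproduced with a short sketch. The `matmul` helper is an illustrative implementation of the row-times-column rule, and the matrices `P` and `Q` are made-up examples of a $2 \times 3$ and a $3 \times 4$ matrix:

```python
def matmul(A, B):
    """Row-times-column product; raises ValueError on mismatched inner dimensions."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of the left factor must equal rows of the right factor")
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(len(B[0]))] for row in A]

# Both products defined, but different.
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
print(matmul(A, B))  # [[1, 0], [0, 0]]
print(matmul(B, A))  # [[0, 0], [0, 1]]

# One product defined, the other not.
P = [[1, 2, 3], [4, 5, 6]]                      # 2 x 3
Q = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]  # 3 x 4
PQ = matmul(P, Q)
print(len(PQ), len(PQ[0]))  # 2 4
try:
    matmul(Q, P)
except ValueError as err:
    print("QP is undefined:", err)
```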
Once you learn that a matrix is a linear transformation, this stops being surprising and starts being obvious. Rotating a picture and then stretching it doesn't produce the same picture as stretching it and then rotating it. The non-commutativity of matrix multiplication is the non-commutativity of "do this, then that."
5. Special matrices
A few shapes and patterns earn their own names because they show up over and over again.
Square matrices
A matrix is square if it has the same number of rows as columns — an $n \times n$ matrix for some $n$. Only square matrices can be multiplied by themselves (so only square matrices have powers $A^2$, $A^3$, $\ldots$), and only square matrices can have inverses or determinants. Many of the deepest results in linear algebra are about square matrices.
The identity matrix
The $n \times n$ identity matrix $I_n$ has 1s along the main diagonal and 0s everywhere else:
$$ I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. $$
The identity matrix plays the role for matrix multiplication that the number 1 plays for ordinary multiplication. For any $m \times n$ matrix $A$,
$$ I_m \, A \;=\; A \;=\; A \, I_n. $$
Multiplying by $I$ leaves a matrix unchanged. (Note the subscripts: the sizes have to match correctly on each side. People often drop the subscript and write just $I$ when the size is clear from context.)
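A quick sketch of the identity rule for a non-square matrix, where the two identities have different sizes. The helpers `identity` and `matmul` are illustrative names:

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Row-times-column product of compatible list-of-rows matrices."""
    return [[sum(row[k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))] for row in A]

A = [[2, -1, 0], [3, 4, 5]]  # 2 x 3, so I_2 goes on the left and I_3 on the right

print(identity(3))                 # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(identity(2), A) == A) # True
print(matmul(A, identity(3)) == A) # True
```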
The zero matrix
The zero matrix $0_{m \times n}$ has every entry equal to zero. It's the additive identity: $A + 0 = A$. Multiplying any matrix by a (size-compatible) zero matrix gives a zero matrix.
Diagonal and triangular matrices
A square matrix is diagonal if every entry off the main diagonal is zero — only $a_{ii}$ may be non-zero. The identity is the special case where the diagonal entries are all 1. Multiplying a diagonal matrix by a vector simply rescales each coordinate independently, which is why diagonal matrices are the easiest matrices in linear algebra to reason about.
$$ D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 5 \end{pmatrix}. $$
A square matrix is upper triangular if every entry below the main diagonal is zero, and lower triangular if every entry above the diagonal is zero. The diagonal entries themselves are free to be anything.
$$ U = \begin{pmatrix} 2 & 7 & -1 \\ 0 & 4 & 3 \\ 0 & 0 & 6 \end{pmatrix}, \qquad L = \begin{pmatrix} 2 & 0 & 0 \\ 5 & 4 & 0 \\ 1 & 9 & 6 \end{pmatrix}. $$
Triangular matrices show up everywhere because many algorithms (Gaussian elimination, LU decomposition) produce them on purpose: systems of equations with a triangular coefficient matrix are easy to solve.
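To illustrate why triangular systems are easy, here is a sketch of back substitution for $U\mathbf{x} = \mathbf{b}$ using the matrix $U$ above. Back substitution is not described on this page; it is brought in here as the standard technique, and the right-hand side `b` is a made-up example. The sketch assumes every diagonal entry of $U$ is non-zero and uses exact fractions to avoid rounding:

```python
from fractions import Fraction

def back_substitute(U, b):
    """Solve U x = b for an upper triangular U with non-zero diagonal entries."""
    n = len(U)
    x = [Fraction(0)] * n
    # Work from the last row up: each row has only one unknown left.
    for i in range(n - 1, -1, -1):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = Fraction(b[i] - known) / U[i][i]
    return x

U = [[2, 7, -1], [0, 4, 3], [0, 0, 6]]
b = [13, 17, 18]  # hypothetical right-hand side

x = back_substitute(U, b)
print([str(v) for v in x])  # ['1', '2', '3']
```

The last row gives $6x_3 = 18$ directly, the row above it then has a single unknown, and so on upward, with no elimination needed.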
Transpose
The transpose of a matrix $A$, written $A^T$, is the matrix you get by swapping rows and columns — entry $a_{ij}$ moves to position $(j,i)$. If $A$ is $m \times n$, then $A^T$ is $n \times m$.
$$ A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \quad\Longrightarrow\quad A^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}. $$
Two facts about transpose you will reach for often: $(A^T)^T = A$ (transposing twice gets you back), and $(AB)^T = B^T A^T$ (the transpose of a product reverses the order). A matrix that equals its own transpose ($A = A^T$) is called symmetric; symmetric matrices form their own important world.
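A short sketch that checks both transpose facts on an example. The helper names are illustrative, and the $3 \times 2$ matrix `B` is made up for the check:

```python
def transpose(A):
    """Swap rows and columns: entry (i, j) moves to position (j, i)."""
    return [list(column) for column in zip(*A)]

def matmul(A, B):
    """Row-times-column product of compatible list-of-rows matrices."""
    return [[sum(row[k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))] for row in A]

A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [2, 1], [0, 3]]  # 3 x 2, hypothetical

print(transpose(A))                     # [[1, 4], [2, 5], [3, 6]]
print(transpose(transpose(A)) == A)     # True
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True
```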
Inverse
For a square matrix $A$, an inverse is a matrix $A^{-1}$ — also square and the same size — that satisfies
$$ A \, A^{-1} \;=\; A^{-1} \, A \;=\; I. $$
If such an $A^{-1}$ exists, $A$ is called invertible (or non-singular); if not, $A$ is singular. Only square matrices can have inverses, and not every square matrix does: the zero matrix obviously doesn't, and neither does $\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$, whose rows are proportional. The inverse, when it exists, is the linear transformation that undoes $A$. Two facts to file away: $(A^{-1})^{-1} = A$, and $(AB)^{-1} = B^{-1} A^{-1}$. Like the transpose, the inverse of a product reverses the order.
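For the $2 \times 2$ case, the definition can be tried out with a small sketch. It relies on the standard closed-form inverse of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, namely $\frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$ when $ad - bc \neq 0$, a formula this page does not derive. The function name `inverse_2x2` and the convention of returning `None` for a singular matrix are choices made for this example:

```python
from fractions import Fraction

def inverse_2x2(M):
    """Return the inverse of a 2 x 2 matrix, or None when it is singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def matmul(A, B):
    """Row-times-column product of compatible list-of-rows matrices."""
    return [[sum(row[k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))] for row in A]

A = [[1, 2], [3, 4]]
A_inv = inverse_2x2(A)

print([[str(v) for v in row] for row in A_inv])  # [['-2', '1'], ['3/2', '-1/2']]
print(matmul(A, A_inv) == [[1, 0], [0, 1]])      # True
print(matmul(A_inv, A) == [[1, 0], [0, 1]])      # True
print(inverse_2x2([[1, 2], [2, 4]]))             # None
```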
6. Why matrices matter
Everything above looks like bookkeeping. Why would mathematicians have agreed on this particular, strange definition of multiplication — and why would the rest of mathematics build so much on top of it?
The short answer, and the topic of the next page: every matrix is a linear transformation. An $m \times n$ matrix $A$ is shorthand for a function that takes an $n$-dimensional input vector and returns an $m$-dimensional output vector, by the rule $\mathbf{x} \mapsto A\mathbf{x}$. Rotations, reflections, projections, shears, and scalings of space are all matrices. Differentiating polynomials of degree at most five is a matrix acting on their coefficient vectors. Advancing the state distribution of a Markov chain by one step is a matrix.
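The polynomial example can be sketched directly. The encoding is an assumption of this example: a polynomial $c_0 + c_1 x + \cdots + c_5 x^5$ is stored as the coefficient list `[c0, c1, ..., c5]`, and `D` and `matvec` are illustrative names:

```python
def matvec(M, x):
    """Apply the matrix M to the vector x (a plain list)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

# Row k of D picks out (k + 1) * c_{k+1}, the coefficient of x^k in the derivative.
D = [[(k + 1) if j == k + 1 else 0 for j in range(6)] for k in range(6)]

p = [1, 2, 3, 0, 0, 0]  # 1 + 2x + 3x^2
print(matvec(D, p))     # [2, 6, 0, 0, 0, 0], i.e. 2 + 6x
```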
And the row-times-column multiplication rule? It's exactly what you have to do if you want $A(B\mathbf{x})$ — apply $B$ to $\mathbf{x}$, then apply $A$ — to equal $(AB)\mathbf{x}$. In other words, matrix multiplication is composition of linear transformations. It is non-commutative for the same reason composing functions is: doing one thing and then another is not the same as doing them in the other order.
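A numerical sketch of the composition claim. The two matrices are made-up examples (a 90-degree rotation and a horizontal stretch), and `matmul` and `apply` are illustrative helper names:

```python
def matmul(A, B):
    """Row-times-column product of compatible list-of-rows matrices."""
    return [[sum(row[k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))] for row in A]

def apply(M, x):
    """Apply the matrix M to the vector x (a plain list)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

A = [[0, -1], [1, 0]]  # rotate by 90 degrees counterclockwise
B = [[2, 0], [0, 1]]   # stretch the first coordinate by 2
x = [1, 1]

print(apply(A, apply(B, x)))  # [-1, 2]  stretch first, then rotate
print(apply(matmul(A, B), x)) # [-1, 2]  the same thing, done by the single matrix AB
print(apply(matmul(B, A), x)) # [-2, 1]  rotate first, then stretch
```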
A matrix is not a grid of numbers. A matrix is a function pretending to be a grid of numbers, and the grid only exists so you can compute with it.
That reframing is the whole point of linear algebra, and it's where this thread continues next.