1. Why probability is counting
Whenever a sample space is finite and every outcome is equally likely, probability collapses to a single ratio:
$$ P(A) = \frac{|A|}{|S|} = \frac{\text{number of favorable outcomes}}{\text{total number of outcomes}}. $$That's the entire calculation. The only thing standing between you and a probability is the ability to count $|A|$ and $|S|$ — and for any non-trivial problem, "count" doesn't mean "list them all out." It means: recognize the structure, pick the right formula, and let it do the listing for you.
This is why combinatorics is the first real skill in probability. The mechanics of $P(A) = |A|/|S|$ are trivial; the difficulty is always in the numerator and denominator. A poker hand has $|S| = \binom{52}{5} = 2{,}598{,}960$ — try enumerating that by hand. The lottery has $|S| = \binom{49}{6} \approx 14$ million. Without combinatorics, these questions have no answers; with combinatorics, they reduce to one-line computations.
Every counting problem comes down to two yes/no flags: does order matter? and can items repeat? Get those two right and the formula picks itself. Get them wrong and no amount of algebra will save you.
2. The multiplication principle
If a process consists of $k$ stages, and stage $i$ can be done in $n_i$ ways independently of the previous choices, then the total number of ways to complete the process is
$n_1 \cdot n_2 \cdots n_k.$
Every other counting formula on this page is a special case of this one rule. The reason it works is mechanical: each choice at stage $i$ produces $n_i$ branches in a decision tree, and the total number of leaves is the product of branch counts.
A familiar example: a license plate with $3$ letters followed by $3$ digits. The letters give $26 \cdot 26 \cdot 26$ choices; the digits give $10 \cdot 10 \cdot 10$; the total is $26^3 \cdot 10^3 = 17{,}576{,}000$. Two fair dice rolled give $6 \cdot 6 = 36$ equally likely outcomes. A menu with $4$ appetizers, $7$ mains, and $3$ desserts gives $4 \cdot 7 \cdot 3 = 84$ three-course meals.
The multiplication principle assumes each stage's count doesn't depend on what was chosen before. If your second draw from a deck depends on what the first card was — say, "draw a second card of the same suit" — the count at stage 2 is conditional and you have to think more carefully. Often the cleanest fix is to redefine the stages so each one's count is fixed.
The addition principle (rule of sum)
The multiplication principle's quieter sibling. If you can do task $A$ in $n_A$ ways or task $B$ in $n_B$ ways and the two sets of outcomes are disjoint, the number of ways to do "$A$ or $B$" is
$$ n_A + n_B. $$The trigger word is "or" between mutually exclusive options: a meal that's either an appetizer ($4$ choices) or a dessert ($3$ choices) — without ordering both — gives $4 + 3 = 7$ single-course meals. If the options aren't disjoint, you've overcounted and need inclusion–exclusion (§8) instead.
3. Permutations: order matters
A permutation is an ordered arrangement. If you have $n$ distinct objects and you want to arrange all $n$ of them in a row, the multiplication principle gives
$$ n \cdot (n-1) \cdot (n-2) \cdots 2 \cdot 1 = n! $$— $n$ choices for the first position, $n-1$ left for the second, and so on down to one for the last. The factorial $n!$ is the count of orderings of $n$ distinct items. Five people in a row: $5! = 120$ seatings. Shuffling a 52-card deck: $52! \approx 8.07 \times 10^{67}$ — more than the number of atoms in our galaxy.
Often you don't want to arrange all $n$ objects, just choose $k$ of them and put them in order. That's a $k$-permutation, written $P(n,k)$:
$$ P(n, k) \;=\; n \cdot (n-1) \cdots (n - k + 1) \;=\; \frac{n!}{(n-k)!}. $$The first formula is the multiplication principle directly — $k$ stages, with the choices shrinking by one each time. The second is just bookkeeping: write the full $n!$ and cancel the tail $(n-k)!$ you didn't use.
Picking gold, silver, and bronze from $10$ runners: $P(10, 3) = 10 \cdot 9 \cdot 8 = 720$. Notice that "the first three across the line" is genuinely ordered — gold ≠ silver ≠ bronze — so a permutation, not a combination, is what you want.
$P(n,k)$, $_nP_k$, and $n^{\underline{k}}$ (the "falling factorial") all denote the same thing. They show up under different names in different textbooks; the formula is the only thing you need to remember.
4. Combinations: order doesn't
A combination is an unordered selection — a set, not a list. If you have $n$ distinct objects and you want to pick $k$ of them with no regard to order, the count is
$$ \binom{n}{k} \;=\; \frac{n!}{k!\,(n-k)!} \;=\; \frac{P(n,k)}{k!}. $$The reasoning runs in two steps. First, count the ordered selections: $P(n,k) = n!/(n-k)!$. Then realize you've overcounted: every set of $k$ objects appears in $k!$ different orders. Divide out that overcount and you're left with the count of sets.
The symbol $\binom{n}{k}$ — read "$n$ choose $k$" — is the binomial coefficient, and it's the single most important number in combinatorics. Pick $3$ committee members from $10$ candidates: $\binom{10}{3} = 120$. Deal a $5$-card poker hand: $\binom{52}{5} = 2{,}598{,}960$. Pick lottery numbers — $6$ out of $49$: $\binom{49}{6} = 13{,}983{,}816$.
Identities worth knowing
| Identity | Why |
|---|---|
| $\binom{n}{k} = \binom{n}{n-k}$ | Choosing $k$ to include is the same as choosing $n-k$ to exclude. |
| $\binom{n}{0} = \binom{n}{n} = 1$ | One way to choose nothing; one way to choose everything. |
| $\binom{n}{1} = n$ | $n$ ways to choose a single item. |
| $\sum_{k=0}^{n} \binom{n}{k} = 2^n$ | Total number of subsets of an $n$-element set. |
| $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$ | Pascal's rule (next section). Either you pick item $n$ and choose $k-1$ from the rest, or you don't and you choose all $k$ from the rest. |
5. The binomial theorem & Pascal's triangle
The binomial coefficients aren't just for counting hands of cards. They're also the coefficients you get when you expand a binomial raised to a power — and that connection is the bridge between combinatorics and algebra.
For any nonnegative integer $n$ and any numbers $x, y$:
$(x + y)^n \;=\; \displaystyle\sum_{k=0}^{n} \binom{n}{k} \, x^{n-k} \, y^k.$
The combinatorial proof is the prettiest. Expanding $(x+y)^n = (x+y)(x+y)\cdots(x+y)$ means multiplying out $n$ identical factors. Each term in the expansion is built by walking through the $n$ factors and choosing either an $x$ or a $y$ from each one. A term of the form $x^{n-k} y^k$ arises whenever you choose $y$ from exactly $k$ of the $n$ factors — and there are $\binom{n}{k}$ ways to pick that subset. The algebraic coefficient is the count.
So:
$$ (x+y)^4 = \binom{4}{0}x^4 + \binom{4}{1}x^3 y + \binom{4}{2}x^2 y^2 + \binom{4}{3} x y^3 + \binom{4}{4} y^4 $$ $$ \phantom{(x+y)^4} = x^4 + 4 x^3 y + 6 x^2 y^2 + 4 x y^3 + y^4. $$Pascal's triangle
If you stack the binomial coefficients into a triangle — row $n$ holds $\binom{n}{0}, \binom{n}{1}, \ldots, \binom{n}{n}$ — every entry is the sum of the two directly above it. That's Pascal's rule, the last identity in the table above, drawn as a picture. Once you see it, you can compute $\binom{n}{k}$ for small $n$ without ever touching a factorial.
Pascal's triangle is more than a pretty picture. It's the recursion $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$ made visible, and that recursion is how computers compute binomial coefficients without ever evaluating the giant factorials.
6. Permutations with repetition
So far we've assumed all $n$ objects are distinct. What if some are identical? Then ordering them in a row no longer gives $n!$ different-looking arrangements, because swapping two identical objects produces the same word.
If the $n$ objects come in groups of sizes $n_1, n_2, \ldots, n_k$ — where all items inside a group are identical and $n_1 + n_2 + \cdots + n_k = n$ — the number of distinguishable arrangements is
$$ \frac{n!}{n_1! \, n_2! \cdots n_k!}. $$You start with the $n!$ orderings of treating everything as distinct, then divide out the $n_i!$ internal reorderings of each identical group that you can't actually see.
The classic example is the word MISSISSIPPI: $11$ letters with $1$ M, $4$ I's, $4$ S's, $2$ P's. The number of distinct rearrangements is
$$ \frac{11!}{1!\,4!\,4!\,2!} = 34{,}650. $$The same number $\binom{n}{n_1, n_2, \ldots, n_k} = \dfrac{n!}{n_1!\,n_2!\,\cdots\,n_k!}$ is called the multinomial coefficient; it generalizes $\binom{n}{k}$ from two groups to $k$.
A different setup: $k$-sequences with repetition allowed from $n$ items, with order matters. Here every position can be any of $n$ items, independently, so the count is just $n^k$. A four-digit PIN over $10$ digits: $10^4 = 10{,}000$. The alphabet has $26^5 = 11{,}881{,}376$ five-letter strings.
7. Combinations with repetition (stars and bars)
The trickiest of the four sampling models. You have $r$ identical items to distribute among $n$ distinguishable bins, and you don't care about order — only how many items end up in each bin. Equivalently: you're choosing $r$ items from $n$ types where each type can be chosen any number of times. How many ways?
The number of ways to place $r$ identical items into $n$ distinguishable bins (allowing empty bins) is
$\displaystyle\binom{n + r - 1}{r} = \binom{n+r-1}{n-1}.$
The trick — and the reason for the name — is to encode each distribution as a string of stars (★) and bars (|). Each star is one item; each bar is a partition between two bins. Read left to right, the stars before the first bar go in bin $1$, the stars between the first and second bars go in bin $2$, and so on.
For example, distribute $r = 5$ candies among $n = 3$ children. The string $\bigstar \bigstar | \bigstar | \bigstar \bigstar$ encodes "$2, 1, 2$": two candies to child A, one to B, two to C. The string $\bigstar \bigstar \bigstar \bigstar \bigstar | \, |$ encodes "$5, 0, 0$": all five to child A, none to the others.
The key observation: every distribution corresponds to a unique string of $r$ stars and $n-1$ bars, and every such string is a valid distribution. So we just need to count strings of $r + (n - 1)$ symbols where $r$ of them are stars — that's $\binom{n+r-1}{r}$.
Why this is "combinations with repetition": picking $r$ items from $n$ types with repetition allowed is exactly the same as deciding how many of each type you pick, which is exactly the same as putting $r$ items into $n$ type-bins. All three descriptions count the same thing.
8. Inclusion–exclusion
The multiplication principle counts how many ways things happen independently. Inclusion–exclusion is the tool for the opposite case: when you want the size of a union of sets that overlap, you can't just add their sizes — you'd double-count the overlap.
Two sets
For two sets $A$ and $B$:
$$ |A \cup B| = |A| + |B| - |A \cap B|. $$You include each set's contribution, then exclude the intersection that got added twice. Pictorially, in a two-circle Venn diagram, the lens-shaped overlap is the part you have to subtract once to balance the accounting.
Three sets
For three sets, every pairwise overlap was double-counted (so subtract each), and the triple overlap was added in three times, subtracted out three times, and now needs to be added back once:
$$ |A \cup B \cup C| = |A| + |B| + |C| - |A\cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|. $$The general pattern: add the singles, subtract the pairs, add the triples, subtract the quadruples, and so on with alternating signs. The signs are exactly what's needed so that every element of $A_1 \cup \cdots \cup A_n$ gets counted exactly once, regardless of how many of the sets it sits in.
For probabilities
Because probability on an equally-likely sample space is just counting divided by $|S|$, the same formula works for $P$:
$$ P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A\cap B) - P(A\cap C) - P(B\cap C) + P(A\cap B\cap C). $$Inclusion–exclusion is the natural tool whenever a problem asks for "at least one of these things happens" and the events overlap. (When they don't overlap — when the events are mutually exclusive — every intersection is empty and the formula collapses to plain addition.)
9. The four sampling models, side by side
You're choosing $k$ items from a pool of $n$. Two yes/no flags determine the formula completely.
| Order? | Repetition? | Count | Example |
|---|---|---|---|
| Ordered | With replacement | $n^k$ | $4$-digit PIN over $10$ digits: $10^4$. |
| Ordered | Without replacement | $P(n,k) = \dfrac{n!}{(n-k)!}$ | Gold/silver/bronze from $10$ runners: $10\cdot 9 \cdot 8$. |
| Unordered | Without replacement | $\dbinom{n}{k} = \dfrac{n!}{k!\,(n-k)!}$ | $5$-card poker hand from $52$: $\binom{52}{5}$. |
| Unordered | With replacement | $\dbinom{n+k-1}{k}$ | Buying $r$ scoops from $n$ ice-cream flavors: $\binom{n+r-1}{r}$. |
Memorize the two flags, not the four formulas. Once you've correctly identified which row of the table you're in, the formula is mechanical.
10. The pigeonhole principle
A counting principle that looks obvious and yet is the key to a surprising number of theorems. The statement:
If you place $n + 1$ items into $n$ boxes, then at least one box must contain at least $2$ items.
More generally: if you place $m$ items into $n$ boxes, then some box contains at least $\lceil m/n \rceil$ items.
It feels too simple to be useful — but its power is that it converts an inequality between a count of items and a count of categories into a guaranteed coincidence. A few classic applications:
- Birthdays. In any group of $367$ people, at least two share a birthday — there are only $366$ possible birthdays (including Feb 29).
- Haircuts. In any city with more than a million people, at least two have the same number of hairs on their head (people have at most a few hundred thousand hairs).
- Subsets. Among any $10$ integers chosen from $\{1, 2, \ldots, 100\}$, two different subsets have the same sum. (There are $2^{10} - 2 = 1022$ non-empty proper subsets, but only at most $10 \cdot 100 = 1000$ possible sums — so two subsets must collide.)
The pigeonhole principle is the right tool whenever a problem asks you to prove that some collision or repetition must occur — not to count how many ways. If the problem is "show that something exists," count your items and your categories; if items beat categories, pigeonhole closes the deal.