Counting and Combinatorics — Statistics & Probability

What you'll leave with

The multiplication principle — the one rule that everything else specializes from.
Permutations $P(n,k) = n!/(n-k)!$ for when order matters, and combinations $\binom{n}{k} = n!/(k!\,(n-k)!)$ for when it doesn't.
The four sampling models — ordered/unordered crossed with replacement/no replacement — and which formula each one selects.
The binomial theorem and Pascal's triangle as two pictures of the same numbers.
Stars and bars for distributing identical items, and inclusion–exclusion for unions of overlapping events.

1. Why probability is counting

Whenever a sample space is finite and every outcome is equally likely, probability collapses to a single ratio:

$$ P(A) = \frac{|A|}{|S|} = \frac{\text{number of favorable outcomes}}{\text{total number of outcomes}}. $$

That's the entire calculation. The only thing standing between you and a probability is the ability to count $|A|$ and $|S|$ — and for any non-trivial problem, "count" doesn't mean "list them all out." It means: recognize the structure, pick the right formula, and let it do the listing for you.

This is why combinatorics is the first real skill in probability. The mechanics of $P(A) = |A|/|S|$ are trivial; the difficulty is always in the numerator and denominator. A poker hand has $|S| = \binom{52}{5} = 2{,}598{,}960$ — try enumerating that by hand. The lottery has $|S| = \binom{49}{6} \approx 14$ million. Without combinatorics, these questions have no answers; with combinatorics, they reduce to one-line computations.

The two questions to ask

Every counting problem comes down to two yes/no flags: does order matter? and can items repeat? Get those two right and the formula picks itself. Get them wrong and no amount of algebra will save you.

2. The multiplication principle

Multiplication principle (rule of product)

If a process consists of $k$ stages, and stage $i$ can be done in $n_i$ ways independently of the previous choices, then the total number of ways to complete the process is

$n_1 \cdot n_2 \cdots n_k.$

Every other counting formula on this page is a special case of this one rule. The reason it works is mechanical: each choice at stage $i$ produces $n_i$ branches in a decision tree, and the total number of leaves is the product of branch counts.

A familiar example: a license plate with $3$ letters followed by $3$ digits. The letters give $26 \cdot 26 \cdot 26$ choices; the digits give $10 \cdot 10 \cdot 10$; the total is $26^3 \cdot 10^3 = 17{,}576{,}000$. Two fair dice rolled give $6 \cdot 6 = 36$ equally likely outcomes. A menu with $4$ appetizers, $7$ mains, and $3$ desserts gives $4 \cdot 7 \cdot 3 = 84$ three-course meals.

"Independently of previous choices"

The multiplication principle assumes each stage's count doesn't depend on what was chosen before. If your second draw from a deck depends on what the first card was — say, "draw a second card of the same suit" — the count at stage 2 is conditional and you have to think more carefully. Often the cleanest fix is to redefine the stages so each one's count is fixed.

The addition principle (rule of sum)

The multiplication principle's quieter sibling. If you can do task $A$ in $n_A$ ways or task $B$ in $n_B$ ways and the two sets of outcomes are disjoint, the number of ways to do "$A$ or $B$" is

$$ n_A + n_B. $$

The trigger word is "or" between mutually exclusive options: a meal that's either an appetizer ($4$ choices) or a dessert ($3$ choices) — without ordering both — gives $4 + 3 = 7$ single-course meals. If the options aren't disjoint, you've overcounted and need inclusion–exclusion (§8) instead.

3. Permutations: order matters

A permutation is an ordered arrangement. If you have $n$ distinct objects and you want to arrange all $n$ of them in a row, the multiplication principle gives

$$ n \cdot (n-1) \cdot (n-2) \cdots 2 \cdot 1 = n! $$

— $n$ choices for the first position, $n-1$ left for the second, and so on down to one for the last. The factorial $n!$ is the count of orderings of $n$ distinct items. Five people in a row: $5! = 120$ seatings. Shuffling a 52-card deck: $52! \approx 8.07 \times 10^{67}$ — more than the number of atoms in our galaxy.

Often you don't want to arrange all $n$ objects, just choose $k$ of them and put them in order. That's a $k$-permutation, written $P(n,k)$:

$$ P(n, k) \;=\; n \cdot (n-1) \cdots (n - k + 1) \;=\; \frac{n!}{(n-k)!}. $$

The first formula is the multiplication principle directly — $k$ stages, with the choices shrinking by one each time. The second is just bookkeeping: write the full $n!$ and cancel the tail $(n-k)!$ you didn't use.

Picking gold, silver, and bronze from $10$ runners: $P(10, 3) = 10 \cdot 9 \cdot 8 = 720$. Notice that "the first three across the line" is genuinely ordered — gold ≠ silver ≠ bronze — so a permutation, not a combination, is what you want.

Notation

$P(n,k)$, $_nP_k$, and $n^{\underline{k}}$ (the "falling factorial") all denote the same thing. They show up under different names in different textbooks; the formula is the only thing you need to remember.

4. Combinations: order doesn't

A combination is an unordered selection — a set, not a list. If you have $n$ distinct objects and you want to pick $k$ of them with no regard to order, the count is

$$ \binom{n}{k} \;=\; \frac{n!}{k!\,(n-k)!} \;=\; \frac{P(n,k)}{k!}. $$

The reasoning runs in two steps. First, count the ordered selections: $P(n,k) = n!/(n-k)!$. Then realize you've overcounted: every set of $k$ objects appears in $k!$ different orders. Divide out that overcount and you're left with the count of sets.

The symbol $\binom{n}{k}$ — read "$n$ choose $k$" — is the binomial coefficient, and it's the single most important number in combinatorics. Pick $3$ committee members from $10$ candidates: $\binom{10}{3} = 120$. Deal a $5$-card poker hand: $\binom{52}{5} = 2{,}598{,}960$. Pick lottery numbers — $6$ out of $49$: $\binom{49}{6} = 13{,}983{,}816$.

Identities worth knowing

Identity	Why
$\binom{n}{k} = \binom{n}{n-k}$	Choosing $k$ to include is the same as choosing $n-k$ to exclude.
$\binom{n}{0} = \binom{n}{n} = 1$	One way to choose nothing; one way to choose everything.
$\binom{n}{1} = n$	$n$ ways to choose a single item.
$\sum_{k=0}^{n} \binom{n}{k} = 2^n$	Total number of subsets of an $n$-element set.
$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$	Pascal's rule (next section). Either you pick item $n$ and choose $k-1$ from the rest, or you don't and you choose all $k$ from the rest.

5. The binomial theorem & Pascal's triangle

The binomial coefficients aren't just for counting hands of cards. They're also the coefficients you get when you expand a binomial raised to a power — and that connection is the bridge between combinatorics and algebra.

Binomial theorem

For any nonnegative integer $n$ and any numbers $x, y$:

$(x + y)^n \;=\; \displaystyle\sum_{k=0}^{n} \binom{n}{k} \, x^{n-k} \, y^k.$

The combinatorial proof is the prettiest. Expanding $(x+y)^n = (x+y)(x+y)\cdots(x+y)$ means multiplying out $n$ identical factors. Each term in the expansion is built by walking through the $n$ factors and choosing either an $x$ or a $y$ from each one. A term of the form $x^{n-k} y^k$ arises whenever you choose $y$ from exactly $k$ of the $n$ factors — and there are $\binom{n}{k}$ ways to pick that subset. The algebraic coefficient is the count.

So:

$$ (x+y)^4 = \binom{4}{0}x^4 + \binom{4}{1}x^3 y + \binom{4}{2}x^2 y^2 + \binom{4}{3} x y^3 + \binom{4}{4} y^4 $$ $$ \phantom{(x+y)^4} = x^4 + 4 x^3 y + 6 x^2 y^2 + 4 x y^3 + y^4. $$

Pascal's triangle

If you stack the binomial coefficients into a triangle — row $n$ holds $\binom{n}{0}, \binom{n}{1}, \ldots, \binom{n}{n}$ — every entry is the sum of the two directly above it. That's Pascal's rule, the last identity in the table above, drawn as a picture. Once you see it, you can compute $\binom{n}{k}$ for small $n$ without ever touching a factorial.

Rows $n = 0$ through $n = 6$. Row $4 = \{1, 4, 6, 4, 1\}$ — the binomial coefficients $\binom{4}{k}$ — is highlighted. Each entry is the sum of the two directly above it; row $n$ sums to $2^n$.

Pascal's triangle is more than a pretty picture. It's the recursion $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$ made visible, and that recursion is how computers compute binomial coefficients without ever evaluating the giant factorials.

6. Permutations with repetition

So far we've assumed all $n$ objects are distinct. What if some are identical? Then ordering them in a row no longer gives $n!$ different-looking arrangements, because swapping two identical objects produces the same word.

If the $n$ objects come in groups of sizes $n_1, n_2, \ldots, n_k$ — where all items inside a group are identical and $n_1 + n_2 + \cdots + n_k = n$ — the number of distinguishable arrangements is

$$ \frac{n!}{n_1! \, n_2! \cdots n_k!}. $$

You start with the $n!$ orderings of treating everything as distinct, then divide out the $n_i!$ internal reorderings of each identical group that you can't actually see.

The classic example is the word MISSISSIPPI: $11$ letters with $1$ M, $4$ I's, $4$ S's, $2$ P's. The number of distinct rearrangements is

$$ \frac{11!}{1!\,4!\,4!\,2!} = 34{,}650. $$

The same number $\binom{n}{n_1, n_2, \ldots, n_k} = \dfrac{n!}{n_1!\,n_2!\,\cdots\,n_k!}$ is called the multinomial coefficient; it generalizes $\binom{n}{k}$ from two groups to $k$.

A different setup: $k$-sequences with repetition allowed from $n$ items, with order matters. Here every position can be any of $n$ items, independently, so the count is just $n^k$. A four-digit PIN over $10$ digits: $10^4 = 10{,}000$. The alphabet has $26^5 = 11{,}881{,}376$ five-letter strings.

7. Combinations with repetition (stars and bars)

The trickiest of the four sampling models. You have $r$ identical items to distribute among $n$ distinguishable bins, and you don't care about order — only how many items end up in each bin. Equivalently: you're choosing $r$ items from $n$ types where each type can be chosen any number of times. How many ways?

Stars and bars

The number of ways to place $r$ identical items into $n$ distinguishable bins (allowing empty bins) is

$\displaystyle\binom{n + r - 1}{r} = \binom{n+r-1}{n-1}.$

The trick — and the reason for the name — is to encode each distribution as a string of stars (★) and bars (|). Each star is one item; each bar is a partition between two bins. Read left to right, the stars before the first bar go in bin $1$, the stars between the first and second bars go in bin $2$, and so on.

For example, distribute $r = 5$ candies among $n = 3$ children. The string $\bigstar \bigstar | \bigstar | \bigstar \bigstar$ encodes "$2, 1, 2$": two candies to child A, one to B, two to C. The string $\bigstar \bigstar \bigstar \bigstar \bigstar | \, |$ encodes "$5, 0, 0$": all five to child A, none to the others.

Two stars-and-bars encodings of $5$ candies among $3$ children. Each string has $r = 5$ stars and $n - 1 = 2$ bars — a total of $7$ symbols. Every arrangement of those $7$ symbols is a valid distribution, so the count is $\binom{5 + 3 - 1}{5} = \binom{7}{5} = 21$.

The key observation: every distribution corresponds to a unique string of $r$ stars and $n-1$ bars, and every such string is a valid distribution. So we just need to count strings of $r + (n - 1)$ symbols where $r$ of them are stars — that's $\binom{n+r-1}{r}$.

Why this is "combinations with repetition": picking $r$ items from $n$ types with repetition allowed is exactly the same as deciding how many of each type you pick, which is exactly the same as putting $r$ items into $n$ type-bins. All three descriptions count the same thing.

8. Inclusion–exclusion

The multiplication principle counts how many ways things happen independently. Inclusion–exclusion is the tool for the opposite case: when you want the size of a union of sets that overlap, you can't just add their sizes — you'd double-count the overlap.

Two sets

For two sets $A$ and $B$:

$$ |A \cup B| = |A| + |B| - |A \cap B|. $$

You include each set's contribution, then exclude the intersection that got added twice. Pictorially, in a two-circle Venn diagram, the lens-shaped overlap is the part you have to subtract once to balance the accounting.

Three sets

For three sets, every pairwise overlap was double-counted (so subtract each), and the triple overlap was added in three times, subtracted out three times, and now needs to be added back once:

$$ |A \cup B \cup C| = |A| + |B| + |C| - |A\cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|. $$

Three-set Venn diagram. Each region is annotated with its inclusion–exclusion sign: singletons get $+$, pairwise overlaps get $-$, and the triple overlap gets $+$ again so the central region is counted exactly once.

The general pattern: add the singles, subtract the pairs, add the triples, subtract the quadruples, and so on with alternating signs. The signs are exactly what's needed so that every element of $A_1 \cup \cdots \cup A_n$ gets counted exactly once, regardless of how many of the sets it sits in.

For probabilities

Because probability on an equally-likely sample space is just counting divided by $|S|$, the same formula works for $P$:

$$ P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A\cap B) - P(A\cap C) - P(B\cap C) + P(A\cap B\cap C). $$

Inclusion–exclusion is the natural tool whenever a problem asks for "at least one of these things happens" and the events overlap. (When they don't overlap — when the events are mutually exclusive — every intersection is empty and the formula collapses to plain addition.)

9. The four sampling models, side by side

You're choosing $k$ items from a pool of $n$. Two yes/no flags determine the formula completely.

Order?	Repetition?	Count	Example
Ordered	With replacement	$n^k$	$4$-digit PIN over $10$ digits: $10^4$.
Ordered	Without replacement	$P(n,k) = \dfrac{n!}{(n-k)!}$	Gold/silver/bronze from $10$ runners: $10\cdot 9 \cdot 8$.
Unordered	Without replacement	$\dbinom{n}{k} = \dfrac{n!}{k!\,(n-k)!}$	$5$-card poker hand from $52$: $\binom{52}{5}$.
Unordered	With replacement	$\dbinom{n+k-1}{k}$	Buying $r$ scoops from $n$ ice-cream flavors: $\binom{n+r-1}{r}$.

Memorize the two flags, not the four formulas. Once you've correctly identified which row of the table you're in, the formula is mechanical.

10. The pigeonhole principle

A counting principle that looks obvious and yet is the key to a surprising number of theorems. The statement:

Pigeonhole principle

If you place $n + 1$ items into $n$ boxes, then at least one box must contain at least $2$ items.

More generally: if you place $m$ items into $n$ boxes, then some box contains at least $\lceil m/n \rceil$ items.

It feels too simple to be useful — but its power is that it converts an inequality between a count of items and a count of categories into a guaranteed coincidence. A few classic applications:

Birthdays. In any group of $367$ people, at least two share a birthday — there are only $366$ possible birthdays (including Feb 29).
Haircuts. In any city with more than a million people, at least two have the same number of hairs on their head (people have at most a few hundred thousand hairs).
Subsets. Among any $10$ integers chosen from $\{1, 2, \ldots, 100\}$, two different subsets have the same sum. (There are $2^{10} - 2 = 1022$ non-empty proper subsets, but only at most $10 \cdot 100 = 1000$ possible sums — so two subsets must collide.)

When to reach for it

The pigeonhole principle is the right tool whenever a problem asks you to prove that some collision or repetition must occur — not to count how many ways. If the problem is "show that something exists," count your items and your categories; if items beat categories, pigeonhole closes the deal.

11. Common pitfalls

Mixing up "order matters" and "order doesn't"

This is the single biggest source of wrong counts. Gold/silver/bronze: order matters → permutation. A three-person committee with no titles: order doesn't → combination. The same numbers $(n, k)$ give answers that differ by a factor of $k!$ — which is enormous even for small $k$. Always pause and ask: "if I swap two of my chosen items, do I get a genuinely different outcome?" If yes, it's ordered; if no, it isn't.

Forgetting whether items can repeat

Drawing balls from an urn with replacement versus without replacement is a completely different counting problem. With replacement: $n^k$. Without: $P(n,k)$ or $\binom{n}{k}$ depending on order. Read the problem statement carefully — "drawn without putting back," "no two the same," and "each item used at most once" all signal no replacement.

Adding $|A|$ and $|B|$ when they overlap

If $A$ and $B$ can both occur, $|A \cup B| \neq |A| + |B|$ — you've double-counted the intersection. Use inclusion–exclusion, or rephrase the problem in terms of the complement. A symptom of this mistake is getting a probability greater than $1$; if you ever see that, suspect a missed overlap.

Equally-likely outcomes — really?

The shortcut $P(A) = |A|/|S|$ silently assumes every outcome in $S$ is equally likely. If the sample space is "the sum of two dice is between 2 and 12," there are $11$ outcomes — but they aren't equiprobable, and dividing favorable by total will give nonsense. Always express $S$ in the most fine-grained, symmetric way you can (here: the $36$ ordered pairs $(d_1, d_2)$). When in doubt, drop down to the rawest sample space available.

"At least one" is almost always a complement

"At least one head in $10$ flips" looks like it wants a long sum: exactly one, exactly two, exactly three, .... Resist. The complement of "at least one head" is "zero heads" — a single, easy event. $P(\text{at least one H}) = 1 - P(\text{all T}) = 1 - (1/2)^{10}$. Whenever a question contains "at least one" and you find yourself summing many terms, ask whether the complement is easier.

12. Worked examples

Work each one through before opening the solution. The arithmetic is the easy part — what you're practicing is identifying which row of the sampling table the problem sits in.

Example 1 · How many five-card poker hands are there?

Identify the model. A poker hand is a set of $5$ cards — the order you were dealt them doesn't matter. Cards aren't repeated (you can't be dealt the same card twice). So: unordered, without replacement.

Apply.

$$ \binom{52}{5} = \frac{52!}{5! \, 47!} = \frac{52 \cdot 51 \cdot 50 \cdot 49 \cdot 48}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = 2{,}598{,}960. $$

That's the size of the sample space for any "what's the probability of getting hand X?" question in poker.

Example 2 · Probability of a flush in five-card poker

Set up. A flush is $5$ cards all in the same suit. There are $4$ suits to pick from, and within a suit there are $13$ cards to choose $5$ from.

Count favorable. By the multiplication principle,

$$ |A| = 4 \cdot \binom{13}{5} = 4 \cdot 1287 = 5148. $$

Divide.

$$ P(\text{flush}) = \frac{5148}{2{,}598{,}960} \approx 0.00198. $$

About one hand in $500$. (A purist would subtract straight and royal flushes from this count; the "any flush" number is what we computed.)

Example 3 · Distinct rearrangements of MISSISSIPPI

Set up. $11$ letters in total: $1$ M, $4$ I, $4$ S, $2$ P. This is permutations with repetition (some letters indistinguishable).

Apply.

$$ \frac{11!}{1! \, 4! \, 4! \, 2!} = \frac{39{,}916{,}800}{1 \cdot 24 \cdot 24 \cdot 2} = 34{,}650. $$

Without dividing out the repeated letters you'd get $11! \approx 40$ million — far too many, because swapping two I's gives the same word.

Example 4 · Distributing $10$ identical candies among $4$ children

Identify the model. The candies are identical (unordered) and there's no upper limit on how many one child can get (repetition allowed). This is stars and bars with $r = 10$ items and $n = 4$ bins.

Apply.

$$ \binom{n + r - 1}{r} = \binom{4 + 10 - 1}{10} = \binom{13}{10} = \binom{13}{3} = 286. $$

If we additionally required every child to get at least one candy, hand out $1$ to each first (using $4$ candies) and apply stars-and-bars to the remaining $6$: $\binom{6+4-1}{6} = \binom{9}{6} = 84$.

Example 5 · The birthday problem: $23$ people, probability of a shared birthday

Set up. Assume $365$ equally likely birthdays and independence. "At least one shared birthday" is awkward to count directly, but its complement — "all $23$ birthdays distinct" — is straightforward.

Count. The total sample space is $365^{23}$ (each person independently has $365$ choices). The favorable cases for "all distinct" are $P(365, 23) = 365 \cdot 364 \cdots 343$.

$$ P(\text{all distinct}) = \frac{365 \cdot 364 \cdots 343}{365^{23}} \approx 0.4927. $$

Complement.

$$ P(\text{at least one shared}) = 1 - 0.4927 \approx 0.5073. $$

So with just $23$ people the probability tips past $1/2$ — the surprise that gives the birthday problem its fame.

Example 6 · Inclusion–exclusion: divisible by $2$, $3$, or $5$

Set up. How many integers from $1$ to $100$ are divisible by $2$, $3$, or $5$? Let $A$, $B$, $C$ be the divisible-by-$2$, -$3$, -$5$ sets.

Apply.

$$ |A \cup B \cup C| = 50 + 33 + 20 - 16 - 10 - 6 + 3 = 74. $$

So $74$ of the first $100$ integers are divisible by at least one of $2, 3, 5$, and $26$ are coprime to $30$.

Sources & further reading

The vocabulary and formulas on this page are universal across combinatorics textbooks. The references below cover the same ground at varying levels of rigor — pick the one whose voice matches what you're after.

Counting, Permutations, and Combinations Textbook OpenStax · Introductory Statistics 2e, Ch. 3

Open-access statistics text whose probability chapter treats counting principles at exactly this level, with extensive drilled exercises. The closest thing to a canonical companion to this page.
Counting, permutations, and combinations Tutorial Khan Academy · Statistics & Probability

Short video lessons paired with practice problems and instant feedback. Best when you want to drill the order-vs-no-order decision until it's automatic.
Mathematics for Computer Science (6.042J) Course MIT OpenCourseWare · Lehman, Leighton, Meyer

Free MIT course with a substantial counting unit covering the multiplication principle, bijections, stars and bars, and inclusion–exclusion — with proofs and applications throughout. The textbook PDF is the deeper-reading recommendation.
Binomial Coefficient Reference Wolfram MathWorld

Formal mathematical reference for $\binom{n}{k}$: dozens of identities, generating-function connections, and links to related entries (Pascal's triangle, multinomial coefficient, falling factorial).
Inclusion–exclusion principle Encyclopedia Wikipedia

Broad overview with the general $n$-set formula, several worked examples (derangements, Euler's totient), and the connection to Möbius inversion. Useful when you're ready to push past three sets.
Pascal's triangle Encyclopedia Wikipedia

History, identities, and surprising appearances — Fibonacci numbers in the shallow diagonals, Sierpiński-triangle patterns mod $2$, and the connection to higher-dimensional simplex numbers.

What you'll leave with

1. Why probability is counting

2. The multiplication principle

The addition principle (rule of sum)

3. Permutations: order matters

4. Combinations: order doesn't

Identities worth knowing

5. The binomial theorem & Pascal's triangle

Pascal's triangle

6. Permutations with repetition

7. Combinations with repetition (stars and bars)

8. Inclusion–exclusion

Two sets

Three sets

For probabilities

9. The four sampling models, side by side

10. The pigeonhole principle

11. Common pitfalls

12. Worked examples

Sources & further reading

Test your understanding