posts/clt-intuitive-derivation/index.qmd

The **Central Limit Theorem (CLT)** answers an important question: Why does the **bell curve** (or Normal Distribution) show up everywhere in the real world?

{fig-alt="Normal Distribution bell curve diagram showing the 68-95-99.7 rule with standard deviations from the mean" width="600"}

Specifically, the theorem describes what happens when you take a random variable, $X$, and repeat the experiment many times to get a series of outcomes, $X_1, X_2, \dots, X_m$. What does the distribution of their **sum** ($S_m = X_1 + \dots + X_m$) or their **average** ($\bar{X}_m = S_m/m$) look like when $m$ is very large?
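Before deriving anything, it helps to see the phenomenon. The following is a minimal simulation sketch (Python with only the standard library; the helper name `average_of_rolls` is just illustrative): we average $m$ die rolls, repeat the experiment many times, and watch the averages concentrate around the die's mean of $3.5$ as $m$ grows.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

def average_of_rolls(m):
    """Average of m independent fair six-sided die rolls."""
    return sum(random.randint(1, 6) for _ in range(m)) / m

# For each m, repeat the experiment 2000 times and summarize the averages.
# The mean stays near 3.5 while the spread shrinks like 1/sqrt(m).
for m in (1, 10, 100):
    averages = [average_of_rolls(m) for _ in range(2000)]
    print(m, round(statistics.mean(averages), 3),
          round(statistics.stdev(averages), 3))
```

A histogram of the `averages` for large $m$ is exactly the bell curve the theorem promises.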

## **Encoding Probabilities (The PGF)**

To get to the heart of the Central Limit Theorem, we need a mathematical tool to describe what happens when we sum up many random variables, like the rolls of a die. Let's try to build such a tool from scratch.

The core challenge is this: when we add the scores from two dice, their outcomes *add*, but their independent probabilities *multiply*. How can we find a mathematical representation that captures this dual behavior?

Let's focus on the outcomes first. Suppose we represent an outcome value $a$ by $f(a)$. We want combining two outcomes $a$ and $b$ to correspond to the outcome $a+b$. If we combine representations by multiplying, then we require $f(a)\,f(b)=f(a+b)$.

This functional equation has a standard solution: take $f(a)=x^a$ for some base $x$. Then $x^a x^b = x^{a+b}$, exactly as required.

This gives us the key. We can represent an outcome $k$ with the term $x^k$. The variable $x$ is just a formal placeholder. Now, where do the probabilities fit in? A probability $p_k$ is the weight of the outcome $k$, so the most natural place to put it is as a coefficient: $p_k x^k$.

To represent the entire die, which can take on any of the values $\{1, 2, 3, 4, 5, 6\}$, we can simply sum up these weighted terms. This creates a polynomial, which packages all the information about our random variable into a single expression. This is the **Probability Generating Function (PGF)**.

For a discrete random variable $X$ that takes integer values with probabilities $p_k=\Pr(X=k)$, we formally define its PGF as:

$$H(x)=\sum_k p_k\,x^k.$$
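In code, this "storage polynomial" is just a list of coefficients. Here is a small Python sketch (illustrative, not from the post): `coeffs[k]` holds $p_k$, evaluating at $x=1$ recovers the total probability, and the standard PGF identity $H'(1)=\sum_k k\,p_k=\mathbb{E}[X]$ gives the mean.

```python
from fractions import Fraction

# PGF of a fair die as a coefficient list: die[k] = Pr(X = k).
die = [Fraction(0)] + [Fraction(1, 6)] * 6  # outcomes 1..6

def evaluate(coeffs, x):
    """Evaluate H(x) = sum_k p_k * x^k."""
    return sum(p * x**k for k, p in enumerate(coeffs))

print(evaluate(die, 1))                       # probabilities sum to 1
print(sum(k * p for k, p in enumerate(die)))  # H'(1) = E[X] = 7/2
```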

Before applying this representation to dice, a quick analogy from counting clarifies why polynomials are the right container.

### A gentler start: an analogy from counting

Before diving deeper into probabilities, let's see this polynomial trick in a simpler context: counting combinations. This idea is a cornerstone of **[combinatorics](https://en.wikipedia.org/wiki/Combinatorics)** for solving counting problems.

Imagine you want to count the number of ways to form 10 Rupees using a collection of 1, 2, and 5 Rupee coins. This is a classic combinatorial problem. We can represent the available coins as polynomials:

- **1 Rupee Coins:** $(1 + x^1 + x^2 + \dots)$
- **2 Rupee Coins:** $(1 + x^2 + x^4 + \dots)$
- **5 Rupee Coins:** $(1 + x^5 + x^{10} + \dots)$

When you multiply these polynomials, voilà! The coefficient of $x^{10}$ in the final product gives you the exact number of ways to make change for 10 Rupees.

Why does this work? The exponents *add*, just like the coin values do. The polynomial multiplication automatically explores every single combination of choices for you. A PGF does the exact same thing, but the coefficients are probabilities instead of just 1s for counting.
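You can carry out this multiplication mechanically. Here is a Python sketch (the helper names `geometric` and `poly_mul` are just illustrative): truncate each coin series at degree 10, multiply them as coefficient lists, and read off the coefficient of $x^{10}$, which comes out to 10.

```python
N = 10  # we only care about coefficients up to x^10

def geometric(step, n=N):
    """Coefficients of 1 + x^step + x^(2*step) + ..., truncated at degree n."""
    return [1 if k % step == 0 else 0 for k in range(n + 1)]

def poly_mul(a, b, n=N):
    """Multiply two coefficient lists, discarding terms above degree n."""
    out = [0] * (n + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= n:
                out[i + j] += ai * bj
    return out

product = geometric(1)          # 1-Rupee coins
for step in (2, 5):             # then 2- and 5-Rupee coins
    product = poly_mul(product, geometric(step))

print(product[10])  # -> 10 ways to make 10 Rupees
```

The nested loop in `poly_mul` is exactly the "explore every combination" step: each pair of terms $x^i x^j$ contributes to the count for the total $i+j$.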

#### Why is this natural?

- **Labels, not powers:** The exponent $k$ is just a label for the outcome $X=k$. The polynomial is a fancy storage system where the coefficient of $x^k$ holds the probability of the outcome $k$.
- **Convolution by multiplication:** This is the killer feature. If $X$ and $Y$ are independent, the probability that $X+Y=n$ is found by summing over all pairs of outcomes that add to $n$: $\sum_k \Pr(X=k)\Pr(Y=n-k)$. This operation is called a **[convolution](https://en.wikipedia.org/wiki/Convolution)**. When you multiply the PGFs $H_X(x)$ and $H_Y(x)$, the rule for multiplying polynomials does *exactly the same calculation*. The coefficient of $x^n$ in the product is precisely that sum! So, for sums of independent variables, the PGF of the sum is the product of the PGFs:

$$H_{X+Y}(x)=H_X(x)\,H_Y(x).$$

- **From coins to dice:** For a biased coin where heads $=1$ and tails $=0$, with $\Pr(1)=p$, the PGF is $H(x) = (1-p)x^0 + px^1$. For two flips, the PGF is $H(x)^2 = (1-p)^2 + 2p(1-p)x^1 + p^2x^2$. The coefficients are the binomial probabilities! This idea scales perfectly to dice or any other discrete distribution.
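Both bullets above can be checked directly in code. This Python sketch (illustrative names, not from the post) multiplies coefficient lists: squaring the biased-coin PGF with $p=\tfrac{1}{3}$ reproduces the binomial probabilities, and multiplying two fair-die PGFs gives the familiar $\Pr(\text{sum}=7)=\tfrac{6}{36}=\tfrac{1}{6}$.

```python
from fractions import Fraction

def pgf_mul(a, b):
    """Multiply two PGFs stored as coefficient lists (a discrete convolution)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# Biased coin: Pr(0) = 1-p, Pr(1) = p, with p = 1/3.
p = Fraction(1, 3)
coin = [1 - p, p]
two_flips = pgf_mul(coin, coin)
print(two_flips)      # the binomial probabilities (1-p)^2, 2p(1-p), p^2

# Two fair dice: the coefficient of x^7 in the product.
die = [Fraction(0)] + [Fraction(1, 6)] * 6
two_dice = pgf_mul(die, die)
print(two_dice[7])    # -> 1/6
```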

For distributions that can take negative integer values, the PGF becomes a **[Laurent series](https://en.wikipedia.org/wiki/Laurent_series)** with negative powers, like $H(x) = \sum_{k=-\infty}^{\infty} p_k x^k$. All the logic, including the convolution property and the Cauchy integral extractor, works exactly the same!
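A quick way to see this in code is to store the PGF as a dictionary mapping outcomes to probabilities, so negative exponents cost nothing extra. In this sketch (hypothetical helper names), we convolve the PGF of a die $X$ with that of $-Y$ to get the distribution of the difference $X-Y$ of two fair dice.

```python
from fractions import Fraction

# PGF as a dict {outcome: probability}; negative keys are Laurent terms.
X = {k: Fraction(1, 6) for k in range(1, 7)}
neg_Y = {-k: p for k, p in X.items()}  # PGF of -Y for an independent die Y

def convolve(a, b):
    """PGF multiplication; works unchanged with negative outcome values."""
    out = {}
    for i, pi in a.items():
        for j, pj in b.items():
            out[i + j] = out.get(i + j, Fraction(0)) + pi * pj
    return out

diff = convolve(X, neg_Y)  # distribution of X - Y
print(diff[0])   # Pr(X = Y)     -> 1/6
print(diff[-5])  # Pr(X - Y = -5) -> 1/36
```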

Let's now instantiate this for a fair six-sided die. To match common conventions in generatingfunctionology, we'll switch the placeholder from $x$ to $z$ when we write the die's blueprint $h(z)$ and use coefficient extraction $[z^n]$.

For a standard fair die, the outcomes are $\{1, 2, 3, 4, 5, 6\}$, each with probability $\tfrac{1}{6}$. The blueprint function is: