-
Notifications
You must be signed in to change notification settings - Fork 18
Expand file tree
/
Copy pathTutorial3_AnytimeValidCS.lean
More file actions
138 lines (111 loc) · 6.21 KB
/
Copy pathTutorial3_AnytimeValidCS.lean
File metadata and controls
138 lines (111 loc) · 6.21 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
import FormalSLT.AnytimeValid.SubGaussianCS
import Mathlib.Analysis.SpecialFunctions.Log.Basic
/-!
# Tutorial 3 — an anytime-valid confidence sequence
**Goal.** You are estimating a mean `θ` and want a confidence interval you may
look at *at every sample size* `n` without inflating the error: with probability
`≥ 1 − δ`, `θ` lies in the running interval `x̄_n ± w_n` for *all* `n`
simultaneously. That is a confidence sequence (CS), the anytime-valid analogue of
a fixed-`n` confidence interval. This tutorial shows the sub-Gamma CS width the
library provides, that it shrinks as data accumulates, and where the validity
guarantee lives.
This is a *tutorial*: it computes the width on concrete numbers, proves the
shrinking property (the reason you may keep peeking), and points at the coverage
theorem. Copy `worked_*` and edit the scale `σ²`, range `b`, and confidence `δ`.
Uses only:
* `FormalSLT.AnytimeValid.subGammaWidth` — the one-sided CS width
`√(2σ² log(1/δ)/n) + b log(1/δ)/(3n)`.
* `FormalSLT.AnytimeValid.anytime_valid_confidence_sequence_subGamma` — the
coverage guarantee (referenced; its hypotheses are measure-theoretic).
Everything is `[propext, Classical.choice, Quot.sound]`-clean.
-/
open scoped BigOperators
open FormalSLT.AnytimeValid
namespace FormalSLT.Tutorials.AnytimeValidCS
noncomputable section
/-! ## Step 0 — pick the parameters
Sub-Gaussian scale `σ² = 1`, range bound `b = 1`, confidence `δ = 1/20` (a 95%
sequence). The half-width at sample size `n` is
`w_n = √(2 · 1 · log 20 / n) + 1 · log 20 / (3n)`. -/
/-- Our confidence level `δ = 1/20`. `log(1/δ) = log 20 > 0`, which drives both
width terms. -/
theorem logInvDelta_pos : 0 < Real.log (1 / (1 / 20 : ℝ)) := by
rw [one_div_one_div]
exact Real.log_pos (by norm_num)
/-! ## Step 1 — the width is a real positive number
`subGammaWidth σ² b n δ` is positive whenever `0 < n` and `log(1/δ) > 0`: a
genuine, non-degenerate interval half-width. -/
/-- The CS half-width at `n = 16` for our parameters is strictly positive. -/
theorem worked_width_pos :
0 < subGammaWidth (1 : ℝ) 1 16 (1 / 20) := by
unfold subGammaWidth
have hlog : 0 < Real.log (1 / (1 / 20 : ℝ)) := logInvDelta_pos
have hsqrt : 0 < Real.sqrt (2 * 1 * Real.log (1 / (1 / 20 : ℝ)) / (16 : ℝ)) := by
apply Real.sqrt_pos.mpr; positivity
have hlin : 0 < (1 : ℝ) * Real.log (1 / (1 / 20 : ℝ)) / (3 * (16 : ℝ)) := by positivity
linarith
/-! ## Step 2 — the width shrinks as data accumulates (why you may keep peeking)
The defining advantage of a CS over a fixed-`n` interval: the half-width is
monotone *decreasing* in `n`, so accumulating data only tightens the interval,
and the coverage holds for all `n` at once. We prove the general fact that for
`0 < σ²`, `0 ≤ b`, `log(1/δ) > 0`, and `0 < m ≤ n`, the width at `n` is at most
the width at `m`. -/
/-- **The width is monotone decreasing in the sample size.** For `0 < σ²`,
`0 ≤ b`, `0 < log(1/δ)`, and `0 < m ≤ n`, `w_n ≤ w_m`. Both the `√(·/n)` term and
the `·/n` term are decreasing in `n`. -/
theorem worked_width_antitone
{sigma2 b delta : ℝ} (hσ : 0 < sigma2) (hb : 0 ≤ b)
(hlog : 0 < Real.log (1 / delta)) {m n : ℕ} (hm : 0 < m) (hmn : m ≤ n) :
subGammaWidth sigma2 b n delta ≤ subGammaWidth sigma2 b m delta := by
unfold subGammaWidth
have hmR : (0 : ℝ) < m := by exact_mod_cast hm
have hnR : (0 : ℝ) < n := lt_of_lt_of_le hmR (by exact_mod_cast hmn)
have hmnR : (m : ℝ) ≤ n := by exact_mod_cast hmn
-- The square-root term: argument is smaller at n, and sqrt is monotone.
have hsqrt_arg : 2 * sigma2 * Real.log (1 / delta) / (n : ℝ)
≤ 2 * sigma2 * Real.log (1 / delta) / (m : ℝ) := by
apply div_le_div_of_nonneg_left _ hmR hmnR
positivity
have hsqrt : Real.sqrt (2 * sigma2 * Real.log (1 / delta) / (n : ℝ))
≤ Real.sqrt (2 * sigma2 * Real.log (1 / delta) / (m : ℝ)) :=
Real.sqrt_le_sqrt hsqrt_arg
-- The linear term: also decreasing in n.
have hlin : b * Real.log (1 / delta) / (3 * (n : ℝ))
≤ b * Real.log (1 / delta) / (3 * (m : ℝ)) := by
apply div_le_div_of_nonneg_left _ (by positivity) (by linarith)
positivity
linarith
/-- **Concrete instance of shrinking:** with our parameters the width at `n = 64`
is at most the width at `n = 16`. (Numerically `0.32 ≤ 0.67`.) -/
theorem worked_width_shrinks :
subGammaWidth (1 : ℝ) 1 64 (1 / 20) ≤ subGammaWidth (1 : ℝ) 1 16 (1 / 20) :=
worked_width_antitone (by norm_num) (by norm_num) logInvDelta_pos (by norm_num)
(by norm_num)
/-! ## Step 3 — where the coverage guarantee lives
The interval `[x̄_n − w_n, x̄_n + w_n]` is the sub-Gamma confidence sequence; its
endpoints are `subGammaConfidenceLower` / `subGammaConfidenceUpper`. The
anytime-valid coverage statement is
`anytimeValidConfidenceSequence μ X θ σ² b δ`,
i.e. the *first-failure* event `{∃ n > 0, θ ∉ interval_n}` has measure `≤ δ`. It
is discharged by `anytime_valid_confidence_sequence_subGamma` from a one-sided
upper/lower split, each at budget `δ/2` (a union bound over the two tails). We
restate its type to make the guarantee visible; instantiating it requires the
measure-theoretic supermartingale hypotheses, which is beyond a getting-started
walkthrough. -/
/-- The coverage guarantee we are buying, as a `Prop`: the running interval covers
`θ` at every `n` except on an event of measure `≤ δ`. -/
def coverageGoal {Ω : Type*} [MeasurableSpace Ω]
(μ : MeasureTheory.Measure Ω) (X : ℕ → Ω → ℝ) (theta sigma2 b delta : ℝ) : Prop :=
anytimeValidConfidenceSequence μ X theta sigma2 b delta
/-! ## How to adapt this to your problem
* **Different confidence / scale / range:** edit `δ`, `σ²`, `b`; the width and the
shrinking property follow from `worked_width_antitone` with your numbers.
* **Get the interval endpoints:** evaluate `subGammaConfidenceLower X σ² b δ n ω`
and `subGammaConfidenceUpper …` on your running mean.
* **Prove coverage for your process:** supply the upper/lower failure split to
`anytime_valid_confidence_sequence_subGamma`; the supermartingale obligation is
produced by the conditional sub-Gamma extractor in
`FormalSLT.Concentration.SubGamma`.
-/
end
end FormalSLT.Tutorials.AnytimeValidCS