background¶
Metric spaces¶
A metric space is a pair \((X, d)\) where \(X\) is a set and \(d: X × X → ℝ\) is a metric (or distance function), that is, a function satisfying the following conditions for all \(x, y, z ∈ X\):
\(d(x, y) ≥ 0\)
\(d(x,y) = 0\) if and only if \(x = y\)
(symmetry) \(d(x, y) = d(y, x)\)
(triangle inequality) \(d(x, z) ≤ d(x, y)+d(y, z)\).
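As a quick sanity check, the four axioms can be tested numerically on a finite sample of points. The sketch below is only illustrative: the helper `is_metric_on`, the two candidate distance functions, and the sample points are hypothetical names introduced here, and passing the check on a sample does not prove that \(d\) is a metric.

```python
from itertools import product

def is_metric_on(points, d, tol=1e-12):
    """Check the four metric axioms for d on a finite list of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                        # nonnegativity
            return False
        if (d(x, y) == 0) != (x == y):         # d(x, y) = 0 iff x = y
            return False
        if abs(d(x, y) - d(y, x)) > tol:       # symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:  # triangle inequality
            return False
    return True

def euclid(p, q):
    """Euclidean distance in the plane."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def squared(p, q):
    """Squared Euclidean distance: violates the triangle inequality."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.5, 1.5)]
print(is_metric_on(sample, euclid), is_metric_on(sample, squared))
```

The squared distance fails on the collinear points \((0,0), (1,0), (2,0)\), where \(4 > 1 + 1\).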
Let \((X, d)\) be a metric space and \(E\) a subset of \(X\).
If \(\{V_α\}\) is a family of subsets of \(X\) such that \(E ⊆ ⋃_α V_α\), then \(\{V_α\}\) is called a cover of \(E\).
The diameter of \(E\) is \(\mathrm{diam} E = \sup \{d(x, y) : x, y ∈ E\}\).
The set \(E\) is bounded if \(\mathrm{diam} E < ∞\) and totally bounded if for every \(ε > 0\) it can be covered by finitely many balls of radius \(ε\).
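For a finite sample of a set, both the diameter and a finite cover by \(ε\)-balls can be computed directly. The following is a minimal sketch under the assumption that \(E\) is replaced by a finite grid in \([0,1]\) with the usual distance; the function names `diameter` and `greedy_centers` are made up for this illustration.

```python
def diameter(points, d):
    """diam E = sup of d(x, y) over x, y in E (a maximum when E is finite)."""
    return max(d(x, y) for x in points for y in points)

def greedy_centers(points, d, eps):
    """Pick centers so that every point lies in an open ball of radius eps."""
    centers = []
    for p in points:
        if all(d(p, c) >= eps for c in centers):
            centers.append(p)   # p is not yet covered, so it becomes a center
    return centers

def dist(x, y):
    return abs(x - y)

E = [k / 100 for k in range(101)]        # finite grid in [0, 1]
centers = greedy_centers(E, dist, 0.25)
print(diameter(E, dist), centers)
```

Every point that is not chosen as a center was skipped because it already lies within `eps` of an earlier center, so the balls around `centers` cover the sample.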
Compactness¶
If \(E\) is a subset of a metric space \((X, d)\), then the following are equivalent.
\(E\) is complete and totally bounded.
Every infinite subset of \(E\) has a limit point in \(E\).
(Bolzano-Weierstrass) Every sequence in \(E\) has a subsequence that converges to a point of \(E\).
(Heine-Borel) If \(\{V_\alpha\}\) is a cover of \(E\) by open sets, there is a finite set \(\{\alpha_1, \dots, \alpha_n\}\) such that \(\{V_{\alpha_i}\}_{i=1}^n\) covers \(E\).
Sets satisfying one (hence all) of the conditions in the previous theorem are called compact. It is probably most common to define compactness using the last item, the Heine-Borel property, which can be stated simply as follows: a set is compact iff every open cover reduces to a finite subcover.
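A standard example, added here only as an illustration, shows how the Heine-Borel property can fail: the open interval \((0,1)\) is totally bounded but not complete, and the open cover below has no finite subcover.

```latex
\[
E = (0,1) \subseteq \mathbb{R}, \qquad
V_n = \left(\tfrac{1}{n},\, 1\right) \quad (n = 2, 3, \dots), \qquad
E = \bigcup_{n \ge 2} V_n .
\]
\[
V_{n_1} \cup \dots \cup V_{n_k} = \left(\tfrac{1}{N},\, 1\right), \quad N = \max_i n_i,
\qquad \text{so } \tfrac{1}{N+1} \in E \text{ is not covered.}
\]
```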
Density and Category¶
Let us recall some basic definitions.
\(G\) is a dense set in \(X\) if each \(x\in X\) is a limit point of \(G\). (Equivalently, \(\bar{G} = X\).)
\(G\) is a nowhere dense set in \(X\) if \(\bar{G}\) contains no nonempty open subsets of \(X\). (Equivalently, \(\bar{G}^o = \emptyset\).)
A set \(G\) is of the first category if it is a countable union of nowhere dense sets.
A set \(G\) is of the second category if it is not of the first category.
No nonempty complete metric space is of the first category.
In other words, if \(X\) is a complete metric space and \(\{A_n\}\) is a countable collection of open dense subsets, then \(⋂_{n=1}^∞ A_n\) is dense in \(X\).
There are many important consequences of the Baire category theorem. The most famous are probably the Banach-Steinhaus theorem, Open mapping theorem, Inverse mapping theorem, and Closed graph theorem.
We present versions of those important results in the appendix of theorems, but first here are two immediate corollaries of the Baire category theorem.
If \(X\) is a complete metric space, \(G ⊆ X\) is a nonempty open subset, and \(G= ⋃_{n=1}^∞ G_n\), then \(\bar{G}_n^o ≠ ∅\) for at least one \(n ∈ ℕ\).
A nonempty complete metric space is not a countable union of nowhere dense sets.
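A familiar illustration (not part of the original notes): enumerate the rationals as \(q_1, q_2, \dots\). Each singleton is nowhere dense in \(ℝ\), so the corollary forces the irrationals to be of the second category.

```latex
\[
\mathbb{Q} = \bigcup_{n=1}^{\infty} \{q_n\}, \qquad
\overline{\{q_n\}}^{\,o} = \{q_n\}^{o} = \emptyset ,
\]
so \(\mathbb{Q}\) is of the first category in \(\mathbb{R}\).
If \(\mathbb{R} \setminus \mathbb{Q}\) were also of the first category, then
\[
\mathbb{R} = \mathbb{Q} \cup (\mathbb{R} \setminus \mathbb{Q})
\]
would be a countable union of nowhere dense sets, contradicting the corollary.
```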
Continuous maps of a metric space¶
Let \((X, d_X)\) and \((Y, d_Y)\) be metric spaces.
The notation \(|x - x'|_X\), or even \(|x - x'|\), is often used in place of \(d_X(x,x')\).
A function \(f : X \to Y\) is called
continuous at the point \(x_0 \in X\) if
\[(\forall \epsilon >0)\, (\exists \delta > 0) \, (\forall x \in X) \, (|x - x_0| < \delta \, \to \, |f(x) -f(x_0)| < \epsilon),\]
continuous in \(E\subseteq X\) if it is continuous at every point of \(E\),
uniformly continuous in \(E\subseteq X\) if
\[(\forall \epsilon >0)\, (\exists \delta >0)\, (\forall x, x_0 \in E) \, (|x - x_0| < \delta \, \to \, |f(x) -f(x_0)| < \epsilon).\]
If \(f\) is continuous in a compact set, then it is uniformly continuous in that set.
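The compactness hypothesis cannot be dropped. As an illustrative sketch (the example and the helper name `witness` are assumptions added here), take \(f(x) = 1/x\) on the non-compact set \((0,1]\): the points \(x_n = 1/n\) and \(y_n = 1/(n+1)\) get arbitrarily close while \(|f(x_n) - f(y_n)|\) stays equal to 1, so no single \(δ\) works for \(ε = 1\). Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction

def f(x):
    return 1 / x

def witness(n):
    """Return (|x - y|, |f(x) - f(y)|) for x = 1/n and y = 1/(n + 1)."""
    x, y = Fraction(1, n), Fraction(1, n + 1)
    return abs(x - y), abs(f(x) - f(y))

# The first coordinate shrinks to 0 while the second stays fixed.
gaps = [witness(n) for n in (1, 10, 100, 1000)]
for gap, jump in gaps:
    print(float(gap), jump)
```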
Topological spaces¶
For a topological space \((X, τ)\) and a point \(x ∈ X\), a collection \(ℬ_x\) of neighborhoods of \(x\) is called a base for the topology at \(x\) provided for any neighborhood \(V\) of \(x\), there is a set \(B ∈ ℬ_x\) for which \(B ⊆ V\). A collection \(ℬ\) of open sets is called a base for the topology \(τ\) provided it contains a base for the topology at each point.
Observe that a subcollection \(ℬ ⊆ τ\) is a base for \(τ\) if and only if every nonempty open set is the union of a subcollection of \(ℬ\).
Abstract Measure¶
Recall the definitions of a semiring of sets, a ring of sets and an algebra of sets.
It should be obvious that every ring is closed under finite unions. Also, since an arbitrary ring \(R\) is nonempty, it contains a set, say \(A ∈ R\), and so \(∅ = A - A ∈ R\). That is, every ring contains the empty set.
Also, from the identities
\[A \,Δ\, B = (A - B) ∪ (B - A) \quad \text{and} \quad A ∩ B = A - (A - B),\]
it easily follows that every ring is closed under symmetric differences and finite intersections.
From the identities \(A - B = A ∩ B^c\) and \(A ∪ B = (A^c ∩ B^c)^c\) it should also be obvious that every algebra is a ring. In the converse direction, we have the following result:
A ring of subsets of \(X\) is an algebra if and only if it contains \(X\).
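On a finite set these closure properties can be checked by brute force. The sketch below assumes one common definition of a ring (a nonempty family closed under unions and differences), since the definitions are only recalled by name above; `is_ring` and `is_algebra` are hypothetical helpers.

```python
from itertools import combinations

def is_ring(family):
    """Nonempty family of sets closed under union and difference."""
    fam = set(family)
    return bool(fam) and all(
        (a | b) in fam and (a - b) in fam for a in fam for b in fam
    )

def is_algebra(family, X):
    """Following the result above: a ring of subsets of X that contains X."""
    return is_ring(family) and frozenset(X) in set(family)

X = {1, 2, 3}
# All subsets of {1, 2}: a ring of subsets of X that does not contain X.
R = {frozenset(s) for r in range(3) for s in combinations({1, 2}, r)}
# Adjoining complements relative to X yields an algebra (here, all subsets of X).
A = R | {frozenset(X) - s for s in R}
print(is_ring(R), is_algebra(R, X), is_algebra(A, X))
```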
Measurable Functions¶
Continuous functions of continuous functions are continuous, and continuous functions of measurable functions are measurable. We state this as
Let \(Y\) and \(Z\) be topological spaces, and let \(g: Y → Z\) be a continuous function.
If \(X\) is a topological space, if \(f: X → Y\) is continuous, and if \(h = g ∘ f\), then \(h: X → Z\) is continuous.
If \(X\) is a measurable space, if \(f: X → Y\) is a measurable function, and if \(h = g ∘ f\), then \(h: X → Z\) is measurable.
Proof.
If \(V\) is an open set in \(Z\), then \(g^{-1}(V)\) is open in \(Y\) and \(h^{-1}(V) = (g ∘ f)^{-1}(V) = f^{-1}(g^{-1}(V))\).
If \(f\) is continuous, then \(h^{-1}(V)\) is open, proving the first statement of the theorem.
If \(f\) is measurable, then \(h^{-1}(V)\) is measurable, proving the second statement of the theorem.
Note, however, that measurable functions of continuous functions need not be measurable.
Let \(u\) and \(v\) be real-valued measurable functions on a measurable space \(X\), let \(Φ\) be a continuous mapping of the plane into a topological space \(Y\), and define \(h(x) = Φ(u(x), v(x))\) for all \(x ∈ X\). Then \(h: X → Y\) is measurable.
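Two standard special cases, spelled out here as a worked illustration, show how the result is typically applied: choosing \(Φ\) to be addition or multiplication on the plane gives measurability of sums and products.

```latex
\[
\Phi(s,t) = s + t \;\Longrightarrow\; h = u + v,
\qquad
\Phi(s,t) = s\,t \;\Longrightarrow\; h = u\,v .
\]
Both maps \(\Phi : \mathbb{R}^2 \to \mathbb{R}\) are continuous, so \(u + v\) and \(uv\)
are measurable whenever \(u\) and \(v\) are.
```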
Absolute continuity of functions¶
A function \(f\) is absolutely continuous on \([a,b]\) if and only if \(f'\) exists a.e. in \((a,b)\), \(f' ∈ L_1[a,b]\), and \(∫_a^x f'(t) \,dt = f(x) - f(a)\) for all \(a ≤ x ≤ b\).
Proof.
(⇒) Suppose \(f∈ AC[a,b]\).
Then \(f∈ BV[a,b]\), since AC implies BV. Therefore, \(f = g - h\) for some monotone increasing functions \(g\) and \(h\). By the differentiability of increasing functions, \(f' = g' - h'\) exists a.e. in \((a,b)\) and \(|f'(x)| ≤ g'(x) + h'(x)\) for almost all \(a< x <b\). Hence, for \(a ≤ x ≤ b\),
\[\begin{split}∫_a^x |f'(t)| \,dt &≤ ∫_a^x (g'(t) + h'(t)) \, dt\\ &= ∫_a^x g'(t)\, dt + ∫_a^x h'(t) \, dt\\ &≤ (g(x) - g(a)) + (h(x) - h(a))\\ &= g(x) + h(x) - g(a) - h(a) < ∞.\end{split}\]The second inequality follows from the theorem on differentiability of increasing functions.
Therefore, \(f'∈ L_1[a,b]\).
Let \(G(x) = ∫_a^x f'(t)\, dt\). Then \(G'(x) = f'(x)\) for almost all \(x ∈ [a,b]\), and \(G\) is absolutely continuous and so is the function \(F = f - G\).
Therefore, \(F'(x) = f'(x) - G'(x) = 0\) for almost all \(x ∈ [a,b]\), by the theorem on the derivative of the integral.
It follows from this and the lemma on functions with a.e. zero derivative (an absolutely continuous function whose derivative vanishes a.e. is constant) that \(F\) is constant on \([a,b]\).
Since \(G(a) = 0\), we have \(F(a) = f(a) - G(a) = f(a)\), so \(F(x) = f(a)\) for all \(x∈ [a,b]\). Therefore,
\[f(a) = F(a) = F(x) = f(x) - G(x) = f(x) - ∫_a^x f'(t)\, dt,\]as desired.
(⇐) Assume the stated hypotheses.
By a standard theorem (see footnote 2), \(f' ∈ L_1\) implies that for all \(ε > 0\) there is a \(δ > 0\) such that, if \(E ⊆ ℝ\) is a measurable set of measure \(m E < δ\), then
\[∫_E |f'|\, dm < ε. \tag{56}\]
Let \(A = ⋃_{i=1}^n (a_i,b_i)\) be a finite union of disjoint open intervals in \([a,b]\) such that \(∑_{i=1}^n (b_i-a_i) < δ\). Then \(m A ≤ ∑_{i=1}^n (b_i-a_i) < δ\), so
\[∑_{i=1}^n |f(b_i)-f(a_i)| = ∑_{i=1}^n \left|∫_{a_i}^{b_i} f' dm\right| ≤ ∑_{i=1}^n ∫_{a_i}^{b_i} |f'| dm = ∫_A |f'| dm < ε \tag{57}\]
by (56). Thus, \(f ∈ AC[a,b]\).
☐
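Two standard examples, added here for illustration, show the criterion at work. The square root is absolutely continuous on \([0,1]\) even though its derivative is unbounded, while the Cantor function \(c\) satisfies every condition except the integral identity.

```latex
\[
f(x) = \sqrt{x}: \qquad f'(t) = \frac{1}{2\sqrt{t}} \ (t > 0), \qquad
\int_0^1 |f'(t)|\,dt = 1 < \infty, \qquad
\int_0^x f'(t)\,dt = \sqrt{x} = f(x) - f(0),
\]
so \(f \in AC[0,1]\). For the Cantor function \(c\) on \([0,1]\),
\[
c'(t) = 0 \ \text{a.e.}, \qquad \int_0^1 c'(t)\,dt = 0 \neq 1 = c(1) - c(0),
\]
so \(c \notin AC[0,1]\), although \(c\) is continuous and increasing.
```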
The following is essentially just a rewording of the previous result in such a way that might make it easier to apply.
If \(-∞ < a < b < ∞\) and \(f: [a,b] → ℂ\), then the following are equivalent.
\(f ∈ AC[a,b]\)
\(f(x) - f(a) = ∫_a^x g(t) \, dt\) for some \(g∈ L_1([a,b], m)\).
\(f\) is differentiable a.e. on \([a,b]\), \(f' ∈ L_1([a,b],m)\) and \(∫_a^x f' \, dm = f(x) - f(a)\).
Proof.
The equivalence 1 ⟺ 2 is the theorem asserting that AC is equivalent to being an indefinite integral.
The fact that 1 ⟹ 3 follows from the forward direction of the previous theorem.
The converse, 3 ⟹ 1, is also covered by the previous theorem and its proof, but here is another argument (which is only slightly different).
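Recall, if \(f ∈ L_1\) then \(∀ ε > 0\) \(∃ δ > 0\) such that if \(E\) is measurable then
\[m E < δ \; ⟹ \; ∫_E |f|\, dm < ε.\]
Apply this to \(f'\). If \(\{(a_i, b_i)\}\) is a finite collection of disjoint intervals with \(∑_{i=1}^n (b_i - a_i) < δ\), then \(m(⋃_i (a_i, b_i)) < δ\), so
\[∑_{i=1}^n |f(b_i) - f(a_i)| = ∑_{i=1}^n \left|∫_{a_i}^{b_i} f'\, dm\right| ≤ ∫_{⋃_i (a_i, b_i)} |f'|\, dm < ε.\]
Thus, \(f ∈ AC[a,b]\).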
☐
Integration¶
A handful of results are essential and lay the foundation on which everything else is built. Rudin [Rud87] gives a beautifully succinct and clear presentation of these in just seven pages (pp. 21–27; see footnote 3). Some of these results are presented below, but do yourself a favor and learn from the master himself by reading [Rud87].
If \(p\) and \(q\) are positive real numbers such that \(p+q = pq\) (equivalently, \((1/p) + (1/q) = 1\)), then we call \(p\) and \(q\) a pair of conjugate exponents.
It is clear that conjugate exponents satisfy \(1 < p, q < ∞\) and that as \(p → 1\), \(q → ∞\). Thus, \(1\) and \(∞\) are regarded as conjugate exponents.
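The relation between \(p\) and \(q\) is easy to compute. Here is a small sketch, with the function name `conjugate` chosen only for this example, that solves \(1/p + 1/q = 1\) for \(q\) and adopts the convention that \(1\) and \(∞\) are conjugate.

```python
import math

def conjugate(p):
    """Return q with 1/p + 1/q = 1, treating 1 and infinity as conjugate."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1.0
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= infinity")
    return p / (p - 1)   # equivalent to p + q = p * q

for p in (1, 1.001, 1.5, 2, 3, math.inf):
    print(p, conjugate(p))
```

The values for \(p\) close to 1 show numerically that \(q → ∞\) as \(p → 1\).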
The following theorem is an essential ingredient of many proofs (e.g. the proof that simple functions are dense in \(L_p\), presented below).
If \(f: X → [0,∞]\) is a measurable function, then there exist measurable simple functions \(s_1, s_2, \dots\) on \(X\) such that
\(0 ≤ s_1 ≤ s_2 ≤ \cdots ≤ f\),
\(s_n(x) → f(x)\) as \(n → ∞\), for every \(x ∈ X\).
This is Theorem 1.17 of [Rud87], where the proof is also presented.
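The usual proof is constructive: round the value \(f(x)\) down to the nearest multiple of \(2^{-n}\) and cap it at \(n\). The sketch below illustrates this dyadic construction on individual values \(t = f(x)\); it is a hedged illustration of the standard argument, and the function name `s` is an assumption of this example, so that \(s_n(x)\) corresponds to `s(n, f(x))`.

```python
import math

def s(n, t):
    """Dyadic approximation of t in [0, inf]: round down to a multiple of 2**-n, cap at n."""
    if t >= n:
        return float(n)
    return math.floor(t * 2**n) / 2**n

values = [0.0, 0.3, math.pi, 7.25, math.inf]
for t in values:
    print(t, [s(n, t) for n in (1, 2, 4, 8)])
```

For finite \(t\) and \(n > t\), the error \(t - s_n\) is less than \(2^{-n}\); for \(t = ∞\) the approximations are \(s_n = n → ∞\).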
Here is a list of the other most important and useful results about integration.
Here is a nice application of the dominated convergence theorem; it essentially says that if \(|f|^p\) is integrable, then all but an arbitrarily small part of the integral \(∫|f|^p\) comes from a set of finite measure on which \(f\) is bounded.
Let \((X, 𝔐, μ)\) be a measure space. If \(1 ≤ p < ∞\) and \(f∈ L_p(μ)\) and \(ε>0\), then there exists a set \(A ∈ 𝔐\) such that \(μ(A) < ∞\), \(f\) is bounded on \(A\) and \(∫_{X-A}|f|^p \,dμ < ε\).
Proof.
Case 1. Assume \(f∈ L_1\).
Define \(A_0 = \{x ∈ X : f(x) = 0\}\) and \(A_∞ = \{x : |f(x)| = ∞\}\) and
\[A_n = \{x ∈ X : |f(x)| ∈ [1/n, n]\}.\]Then \(A_1 ⊆ A_2 ⊆ \cdots\) and
\[\lim_{n→∞} A_n = 𝔄 := ⋃_{n=1}^∞ A_n = X - A_0 - A_∞.\]
Note that \(A_∞\) must have measure 0, since \(f ∈ L_1\). Therefore, \(∫_{A_0 ∪ A_∞} f\, dμ = 0\), so
\[∫_X f\, dμ = ∫_𝔄 f\, dμ + ∫_{A_0 ∪ A_∞} f\, dμ = ∫_𝔄 f \, dμ. \tag{58}\]
Define \(g_n = f χ_{A_n}\). Then
\[|g_n(x)| = |f(x)| χ_{A_n}(x) ≤ |f(x)| \quad (∀ x ∈ X; n = 1, 2, \dots),\]and
\[\lim_n g_n(x) = f(x) χ_𝔄(x) \quad (∀ x ∈ X).\]Therefore, the dominated convergence theorem implies that \(∫_X|g_n - f χ_𝔄|\, dμ → 0\).
Next observe,
\[\begin{split}\bigl| ∫_{A_n}f\, dμ - ∫_X f\, dμ \bigr| &= \bigl| ∫_{A_n}f\, dμ - ∫_𝔄 f\, dμ \bigr| \quad \text{(by (58))}\\ & = \bigl| ∫_X g_n \, dμ - ∫_X f χ_𝔄\, dμ\bigr|\\ & ≤ ∫_X |g_n - f χ_𝔄|\, dμ,\end{split}\]which tends to zero as \(n→ ∞\), as shown above.
Thus, we can choose \(N>0\) such that
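\[∫_X |g_N - f χ_𝔄|\, dμ < ε.\]Since \(f - g_N = f χ_{X - A_N}\) agrees with \(f χ_𝔄 - g_N\) except on the null set \(A_∞\), it follows that
\[\bigl| ∫_{X - A_N}f\, dμ \bigr| ≤ ∫_{X - A_N}|f|\, dμ = ∫_X |g_N - f χ_𝔄|\, dμ < ε.\]
Finally, note that, by definition of \(A_N\), we have \(1/N ≤ |f(x)| ≤ N\) (so \(f\) is bounded) on \(A_N\) and
\[\frac{1}{N} μ A_N ≤ ∫_{A_N}|f|\, dμ ≤ ∫_X |f|\, dμ < ∞,\]so \(μ A_N < ∞\).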
Case 2. Assume \(f∈ L_p\) and \(1 < p < ∞\).
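Then \(g = |f|^p ∈ L_1\) and \(g ≥ 0\). Applying Case 1 to \(g\) yields a set \(A ∈ 𝔐\) with \(μ(A) < ∞\) such that \(g\) is bounded on \(A\) and, since \(g ≥ 0\), \(∫_{X-A} g\, dμ < ε\). Because \(|f| = g^{1/p}\), the function \(f\) is also bounded on \(A\), and \(∫_{X-A}|f|^p \,dμ = ∫_{X-A} g\, dμ < ε\). ☐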
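A concrete instance of the sets \(A_n\) from Case 1 may help. The function below is a hypothetical example chosen for this illustration; it is unbounded near 0 and lives on a set of infinite measure, yet the truncation sets are bounded intervals and the leftover integral is explicit.

```latex
Let \(X = (0, \infty)\) with Lebesgue measure and
\[
f(x) = \begin{cases} x^{-1/2}, & 0 < x \le 1, \\ x^{-2}, & x > 1, \end{cases}
\qquad \int_X f \, d\mu = 2 + 1 = 3 .
\]
Then \(1/n \le f(x) \le n\) exactly when \(n^{-2} \le x \le \sqrt{n}\), so
\[
A_n = \left[\, n^{-2}, \sqrt{n} \,\right], \qquad \mu(A_n) = \sqrt{n} - n^{-2} < \infty,
\]
\[
\int_{X - A_n} f \, d\mu
= \int_0^{n^{-2}} x^{-1/2}\, dx + \int_{\sqrt{n}}^{\infty} x^{-2}\, dx
= \frac{2}{n} + \frac{1}{\sqrt{n}} \longrightarrow 0 .
\]
```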
Approximating integrable functions by step functions¶
A property that holds for all step functions and is preserved under the taking of limits also holds for all integrable functions. This is a consequence of the following lemma. (See also the exercises in Chapter 2 of [Rud87]).
If \(f ∈ L_1(ℝ)\) then there exists a sequence \(\{g_n\}\) of step functions such that \(\lim_{n → ∞} ∫ |f-g_n| = 0\).
Proof.
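We must show that there exists a sequence \(\{g_n\}\) of step functions with the following property: \(∀ ε > 0\), \(∃ N ∈ ℕ\), such that \(∫ |f - g_n| < ε\) for all \(n ≥ N\). For this it suffices to find, for each \(ε > 0\), a single step function \(g\) with \(∫ |f - g| < ε\).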
We proceed by a sequence of steps in which \(f\) is assumed to have a special form. In each step the form of \(f\) is slightly more general than in the previous step.
Step 1. Suppose \(f = χ_A\) for some measurable set \(A ⊆ ℝ\).
By assumption \(f ∈ L_1(ℝ)\), so \(μ A < ∞\).
By definition of outer measure,
\[μ A = \inf \bigl\{ ∑ μ A_i ∣ \{A_i\} ⊂ S, A ⊆ ⋃ A_i\bigr\},\]where \(S = \{[a, b) : -∞ < a < b < ∞\}\).
Thus we can choose \(\{A_i\} ⊆ S\) such that \(A ⊆ ⋃ A_i\) and \(A_i ∩ A_j = ∅\) \((i ≠ j)\) and
\[μ A ≤ μ (∪ A_i) ≤ ∑ μ A_i < μ A + ε/2.\]Define \(B := ⋃ A_i\). Then \(A ⊆ B\) and \(μ A ≤ μ B < μ A + ε/2\).
Since \(A ⊆ B\) implies \(χ_A ≤ χ_B\), we have
\[∫ |f-χ_B| = ∫ |χ_B - χ_A| = μ B - μ A < ε/2.\]Now \(χ_B\) need not be a step function, since \(B\) may be a union of infinitely many intervals, so we consider
\[φ_n = ∑_{i=1}^n χ_{A_i}.\]Since the \(A_i\)’s are disjoint, we have
\[χ_B = χ_{∪A_i} = ∑_{i=1}^∞ χ_{A_i} = \lim_{n→∞} φ_n.\]By the monotone convergence theorem, \(∫ φ_n → ∫ χ_B\).
Let \(N\) be such that \(∫(χ_B - φ_n) < ε/2\) \((n ≥ N)\). Then \(∀ n ≥ N\),
\[\begin{split}∫ |χ_A-φ_n| &≤ ∫ |χ_A - χ_B| + ∫ |χ_B-φ_n|\\ &= ∫ (χ_B - χ_A) + ∫ (χ_B-φ_n) < ε.\end{split}\]Now note that \(φ_n\) is a finite linear combination of characteristic functions of bounded intervals; i.e., \(φ_n\) is a step function.
This completes the proof for the special case \(f = χ_A\) where \(A ⊆ ℝ\) is a measurable set of finite measure.
Step 2. Suppose \(f\) is a measurable simple function.
Then \(f = ∑_{i=1}^n α_i χ_{A_i}\), where each \(A_i\) is measurable.
By assumption \(f ∈ L_1(ℝ)\), so \(μ A_i < ∞\) for each \(i = 1, \dots, n\).
By Step 1, for each \(1 ≤ i ≤ n\), we can find a step function \(φ_i\) such that
\[∫ |χ_{A_i} - φ_i| < \frac{ε}{nM},\]where \(M = \max \{|α_i| : 1≤ i≤ n\}\) (we may assume \(M > 0\), since otherwise \(f = 0\)). Then \(φ = ∑_{i=1}^n α_i φ_i\) is a step function and
\[\begin{split}∫ |f-φ| &= ∫ \bigl|∑α_i χ_{A_i} - ∑α_i φ_i\bigr|\\ &= ∫ \bigl|∑α_i (χ_{A_i} - φ_i)\bigr|\\ &≤ ∑ |α_i| ∫ |χ_{A_i} - φ_i|\\ &≤ ∑ |α_i| \frac{ε}{nM} ≤ ε.\end{split}\]
Step 3. Suppose \(f\) is a nonnegative integrable function; i.e., \(f ≥ 0\) and \(f ∈ L_1(ℝ)\).
Then \(∀ ε > 0\) there exists a simple function \(s ≤ f\) such that \(∫ s ≤ ∫ f < ∫ s + ε/2\). Whence, \(0 ≤ ∫ (f - s) < ε/2\).
Also, the assumption \(s ≤ f ∈ L_1(ℝ)\) implies \(s ∈ L_1(ℝ)\).
By Step 2, there exists a step function \(φ\) such that \(∫ |s - φ| < ε/2\). Therefore,
\[∫ |f-φ| = ∫ |f - s + s - φ| ≤ ∫ (f - s) + ∫ |s - φ| < ε.\]Step 4. Let \(f ∈ L_1(ℝ)\). Write \(f = f^+ - f^-\) where \(f^+\) and \(f^-\) are nonnegative integrable functions on \(ℝ\).
Then, by Step 3, for all \(ε > 0\) there exist step functions \(φ, ψ\) such that
\[∫ |f^+-φ| < ε/2 \; \text{ and } \; ∫ |f^--ψ| < ε/2.\]Whence,
\[\begin{split}∫ |f - (φ - ψ)| &= ∫ | f^+ - f^- - φ + ψ| \\ &= ∫ | (f^+ - φ) + (ψ - f^-)| \\ &≤ ∫ | f^+ - φ| + ∫|ψ - f^-| < ε.\end{split}\]Thus, \(g := φ-ψ\) is a step function such that \(∫|f - g| < ε\). Choosing such a \(g = g_n\) for \(ε = 1/n\) gives the desired sequence \(\{g_n\}\). ☐
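The lemma can be observed numerically in a simple case. The sketch below is an illustration under stated assumptions, not the construction used in the proof: it takes \(f(x) = x\) on \([0,1)\) (and 0 elsewhere), approximates it by a step function with \(n\) equal pieces, and estimates \(∫|f - g_n|\) with a midpoint rule. The helper names `step_approx` and `l1_distance` are hypothetical.

```python
def step_approx(f, a, b, n):
    """Step function on [a, b) with n equal pieces, each taking f's value at its left endpoint."""
    h = (b - a) / n
    def g(x):
        if not (a <= x < b):
            return 0.0
        k = min(int((x - a) / h), n - 1)   # index of the piece containing x
        return f(a + k * h)
    return g

def l1_distance(f, g, a, b, m=100_000):
    """Midpoint-rule estimate of the integral of |f - g| over [a, b]."""
    h = (b - a) / m
    return sum(abs(f(a + (i + 0.5) * h) - g(a + (i + 0.5) * h)) for i in range(m)) * h

def f(x):
    return x

errors = [l1_distance(f, step_approx(f, 0.0, 1.0, n), 0.0, 1.0) for n in (1, 10, 100)]
print(errors)
```

For this particular \(f\) the exact error is \(1/(2n)\), so the estimates shrink in proportion to \(1/n\).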
Fubini and Tonelli Theorems¶
We present a version of the Fubini/Tonelli theorems in the appendix of theorems.
Linear Spaces and Functionals¶
The main reference for this section is [Rud87].
Notation. (cf. [Rud87], 2.9) The support of a complex function \(f\) on a topological space \(X\) is the closure of the set \(\{x:f(x) \neq 0\}\).
The collection of all continuous complex functions on \(X\) whose support is compact is denoted by \(C_c(X)\).
Observe that \(C_c(X)\) is a vector space because,
the support of \(f + g\) lies in the union of the respective supports of \(f\) and \(g\), and any finite union of compact sets is compact, and
the sum of two continuous complex functions is continuous, as are scalar multiples of continuous functions.
There are at least two useful versions of the famous representation theorem of F. Riesz. We unimaginatively call these the Riesz representation theorem and Riesz representation theorem (version 2) in the appendix of theorems.
Another result that everyone should know is the Hahn-Banach theorem.
Consequences of the Baire Category Theorem¶
Here are the four famous consequences of the Baire category theorem that we mentioned above.
Hilbert Space¶
A Hilbert space is a complete normed linear space whose norm arises from an inner product.
An infinite-dimensional Hilbert space is called separable if it has a countable orthonormal basis \(S\).
(We adopt the convention that the term separable is only applied to infinite-dimensional Hilbert spaces; see footnote 6.)
Two Hilbert spaces \(ℋ_1, ℋ_2\) are called isometrically isomorphic if there exists a unitary operator \(U: ℋ_1 ↠ ℋ_2\).
In other words, \(U\) is a surjective isometry from \(ℋ_1\) to \(ℋ_2\), which means that \(U\) is a linear surjection that “preserves the inner product” in the following sense: \(⟨ Ux, Uy ⟩_{ℋ_2} = ⟨ x, y ⟩_{ℋ_1}\).
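The defining identity can be checked numerically in a toy case. The sketch below uses the finite-dimensional space \(ℂ^2\) purely for illustration (the separability convention in the text concerns infinite-dimensional spaces); the matrix `U` and the vectors `x`, `y` are arbitrary choices made for this example.

```python
import math

def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(U, x):
    """Matrix-vector product U x."""
    return [sum(U[i][j] * x[j] for j in range(len(x))) for i in range(len(U))]

r = 1 / math.sqrt(2)
U = [[r, r * 1j], [r * 1j, r]]   # columns are orthonormal, so U is unitary

x = [1 + 2j, -1j]
y = [0.5, 3 - 1j]

before = inner(x, y)
after = inner(apply(U, x), apply(U, y))
print(before, after)
```

Up to floating-point rounding, the two printed values agree, which is the statement \(⟨ Ux, Uy ⟩ = ⟨ x, y ⟩\) in this small example.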
Footnotes
- 2
This “standard theorem” appears often on exams (see, e.g., Problem 6), but in a slightly weaker form in which the conclusion is that \(|∫_E f' \, dm| < ε\). In the present case we need \(∫_E |f'|\, dm < ε\) to get the sum in (57) to come out right.
- 3
Study these seven pages until you can recite all seven theorems and their proofs in your sleep. Also, pay close attention to the details. Rudin is careful to choose definitions and hypotheses that lend themselves to concise exposition, usually without too much loss of generality. For example, he often takes the range of a “real-valued” function to be \([-\infty, \infty]\), rather than \(\mathbb R\). It is instructive to pause occasionally and consider how his arguments depend on such choices.
- 6
If we allowed a finite-dimensional Hilbert space in the definition, then it would automatically be separable, so the concept is not interesting in the finite-dimensional case.
Please email comments, suggestions, and corrections to williamdemeo@gmail.com