Introduction to Measure Theory

Quantum theory assigns probabilities to measurement outcomes. Hence, it is essential for a quantum philosopher to study probability theory. Let us approach this from first principles.

***

Now that we have arrived at the notion that probability problems can be mapped onto measure problems, we wish to study measures of sets.

Presumably, given a set $\Omega$ (for example $\Omega=\mathbb{R}^n$), we can define a measure on every subset of $\Omega$, right? This would indeed make our lives simple, but it is an immediate corollary of the Banach-Tarski paradox that we cannot assign a measure to every subset of $\mathbb{R}^3$. More precisely, if we want our measure to satisfy certain nice properties (such as translation and rotation invariance, finite additivity, and balls having positive, finite measure), then such a measure cannot be defined on every subset of $\mathbb{R}^3$.
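To see why the Banach-Tarski paradox rules out such a measure, here is a rough sketch of the argument. The notation is assumed for illustration and does not come from the text above: $B$ is a ball in $\mathbb{R}^3$, $A_1,\dots,A_n$ are the finitely many pieces of the paradoxical decomposition, $g_1,\dots,g_n$ are rigid motions, and $B'$ is a translate of $B$ disjoint from $B$. The sketch also assumes $0<\mu(B)<\infty$.

```latex
% Assumed notation: B is a ball in R^3, A_1..A_n are the Banach-Tarski pieces,
% g_1..g_n are rigid motions, and B' is a translate of B disjoint from B.
\begin{align*}
B &= A_1 \sqcup \dots \sqcup A_n, \\
g_1 A_1 \sqcup \dots \sqcup g_k A_k &= B, \qquad
g_{k+1} A_{k+1} \sqcup \dots \sqcup g_n A_n = B', \\
\mu(B) &= \sum_{i=1}^{n} \mu(A_i)          % finite additivity
        = \sum_{i=1}^{n} \mu(g_i A_i)      % invariance under rigid motions
        = \mu(B) + \mu(B')                 % finite additivity again
        = 2\,\mu(B).                       % translation invariance: \mu(B') = \mu(B)
\end{align*}
```

If every piece $A_i$ had a measure, the last line would force $\mu(B)=2\mu(B)$, which is impossible when $0<\mu(B)<\infty$. So at least one of the pieces cannot be assigned a measure.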

At this point, we have to make a trade-off:

  • either we define the measure on every subset of a space and compromise by giving up some nice properties that we would like the measure to satisfy, or
  • we require the measure to satisfy nice properties but we compromise by only defining the measure on some (not necessarily all) subsets of the space.

All of measure theory is built from choosing the latter option: it is better to measure many relevant sets well than to measure all sets badly.

Hence, it is clear that given a set $\Omega$, we want some notion along the lines of "which subsets of $\Omega$ can we assign measure to?" For historical reasons, we will call this collection of sets -- the sets to which we assign a measure -- a $\sigma$-algebra on $\Omega$ (pronounced "sigma algebra"), and we will denote it as $\Sigma$. Naturally, every element in $\Sigma$ is called a measurable set.

Since this is a collection of subsets of $\Omega$, it is a particular subset of the power set of $\Omega$, hence we may write $\Sigma\subseteq \mathcal{P}(\Omega)$.

Now let us think about "what properties must such a $\Sigma$ have?" Perhaps more precisely, let us think about "suppose I call the measure $\mu$, then what is the minimum structure $\Sigma$ must have so that a measure $\mu:\Sigma \to [0,\infty]$ is a useful notion?" Let us try to derive this from first principles, by thinking through the following questions:

  1. What is the simplest set in all of set theory? And what is a natural definition for the measure of this set?
  2. Suppose the measure of the entire space is finite, i.e., $\mu(\Omega)<\infty$ [1]. Now suppose we are given a set $A$ that is known to be measurable, i.e., $A\in \Sigma$. Is the complement $A^c=\Omega\setminus A$ measurable? Why or why not?
  3. Now what if the measure of the entire space is infinite, i.e., $\mu(\Omega)=\infty$? Given a set $A$ that is known to be measurable, i.e., $A\in \Sigma$, is the complement $A^c=\Omega\setminus A$ necessarily measurable? Why or why not?
  4. Suppose the sets $A$ and $B$ are known to be measurable, i.e., $A\in \Sigma, B\in \Sigma$. Is their union $A\cup B$ also measurable? What if $A$ and $B$ are disjoint? What if they are not disjoint?
  5. Suppose we are given a sequence of sets $\lbrace A_n \rbrace_{n\in \mathbb{N}}$ that are known to be measurable, i.e., $\forall n : A_n\in \Sigma$. If it is an increasing sequence, i.e., $A_1\subseteq A_2\subseteq A_3\subseteq \dots$, then is their countable union $\bigcup_n A_n$ measurable?
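
One way to experiment with the questions above is to try them on a small finite set, where every collection of subsets can be checked by brute force. The following is a minimal sketch, not a definition: the helper names `power_set` and `closure_report` are made up for illustration, and the three properties it tests are only candidates suggested by the questions. On a finite set, countable unions reduce to finite unions, so the sketch says nothing about the genuinely infinite case in question 5.

```python
from itertools import combinations


def power_set(omega):
    """Return every subset of omega, each as a frozenset."""
    items = list(omega)
    return {
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    }


def closure_report(omega, sigma):
    """Check three candidate closure properties of a collection of subsets of omega."""
    omega = frozenset(omega)
    sigma = {frozenset(s) for s in sigma}
    return {
        # Question 1: is the empty set in the collection?
        "contains_empty_set": frozenset() in sigma,
        # Questions 2 and 3: is the complement of each member also a member?
        "closed_under_complement": all(omega - a in sigma for a in sigma),
        # Question 4: is the union of any two members also a member?
        "closed_under_union": all(a | b in sigma for a in sigma for b in sigma),
    }


omega = {1, 2, 3, 4}
good = [set(), {1, 2}, {3, 4}, {1, 2, 3, 4}]
bad = [set(), {1}, {2}, {1, 2, 3, 4}]

# Both collections are subsets of the power set of omega.
assert {frozenset(s) for s in good} <= power_set(omega)
assert {frozenset(s) for s in bad} <= power_set(omega)

print(closure_report(omega, good))
print(closure_report(omega, bad))
```

The collection `good` passes all three checks. The collection `bad` contains the empty set but fails the other two: the complement of `{1}` and the union of `{1}` and `{2}` are both missing from it.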

***

Frederic Schuller provides this wonderful explanation [2] of why the Borel sigma algebra is so interesting.

So only a topological space can carry a Borel sigma algebra because the whole idea of the Borel sigma algebra is that you employ the open sets in order to generate it. Okay, so you can do this, the question is whether this is a particularly good idea; it turns out to be a brilliant idea, okay? So again, why would one do this in the first place? Of course, you could take a topological space and establish a sigma algebra on it that has nothing to do whatsoever with the topology. That's perfectly fine.

However, you could also have a Hilbert space with an inner product, and you could establish ... a norm that has nothing to do with the inner product. You could do that, but ... you will not get the Cauchy-Schwarz inequality where on one side you have the inner product, on the other side you have the norms, because they have nothing to do with each other.

So usually, if you have a strong structure that is able to imply another structure, you will establish that other structure in such a way that it's being induced.

Now let's start with Hilbert space. You have an inner product, from that you induce the norm. From the norm, you would induce the standard topology by defining the open balls using the norm. Once you have the topology, you induce the Borel sigma algebra ... so you only made one choice, namely an inner product, which is a very strong choice, right?

If you made a strong choice, usually you derive the weaker structures from that. Again, why would you do that? You don't have to. Well, because then you get the stronger theorems for the relation between these structures, right? So this follows standard procedure in mathematics to induce from given structures you have already chosen to not make yet another choice, but to induce it.
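
The chain of induced structures described in the quote can be written out as follows. This is a sketch in assumed notation that does not appear in the lecture excerpt: $\mathcal{H}$ for the Hilbert space, $B_r(x)$ for open balls, $\mathcal{O}$ for the topology, and $\sigma(\mathcal{O})$ for the smallest $\sigma$-algebra containing $\mathcal{O}$.

```latex
% Assumed notation: H is the Hilbert space, O its norm topology,
% sigma(O) the sigma-algebra generated by the open sets.
\begin{align*}
\|x\| &:= \sqrt{\langle x, x\rangle}
  && \text{(norm induced by the inner product)} \\
B_r(x) &:= \{\, y \in \mathcal{H} : \|y - x\| < r \,\}
  && \text{(open balls defined by the norm)} \\
\mathcal{O} &:= \{\, U \subseteq \mathcal{H} : \forall x \in U \ \exists r > 0 : B_r(x) \subseteq U \,\}
  && \text{(topology induced by the norm)} \\
\sigma(\mathcal{O}) &:= \bigcap \{\, \Sigma \supseteq \mathcal{O} : \Sigma \text{ is a } \sigma\text{-algebra on } \mathcal{H} \,\}
  && \text{(Borel } \sigma\text{-algebra)} \\
|\langle x, y\rangle| &\le \|x\|\,\|y\|
  && \text{(Cauchy-Schwarz, valid for the induced norm)}
\end{align*}
```

Only the first line involves a choice. Every later structure is determined by the one before it, which is why results such as the Cauchy-Schwarz inequality can relate the inner product to the norm.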

Footnotes

  1. Here we are implicitly assuming that $\Omega\in \Sigma$, i.e., $\Omega$ is measurable. But since $\emptyset$ is measurable, and $\Sigma$ is closed under complements, we know that $\emptyset^c=\Omega\setminus\emptyset = \Omega$ is measurable.

  2. See Schuller's lecture number 5 on quantum theory.