Constructive analysis

This talk on Constructive analysis was held on Wednesday November 1, 2017 in MC 4045. The talk was given by Fengyang Wang.

Abstract

Constructive mathematics, as the name would suggest, is centered on the philosophy that mathematical proofs should be able to be turned into algorithms. We will contextualize constructive approaches to analysis, roughly following Bridges and Vîţă. This talk has no formal prerequisites beyond an elementary understanding of the real numbers and the usual concept of completeness. In particular, no logical background is assumed; intuitionistic logic will be overviewed in the talk. We will finish with a discussion of the ramifications of completeness of the real numbers.

Summary

The slides for this talk are also available.

Introduction to Constructive Mathematics

It was proven by Cantor that a good math joke exists. Unfortunately, his proof was entirely non-constructive.

Attribution: Jon

My homework does exist, but no one can find it.

Attribution: Printed in The Times of London, February 3, 2004. Obtained from Bridges and Vîţă.

Constructive mathematics is mathematics that avoids obscuring the underlying algorithms and their computational information. A constructive proof is, or is readily turned into, an algorithm to find the thing it’s claiming.

It is easier to understand constructivism by looking at what procedures in proofs are not constructive. Here is a theorem in elementary calculus:

Theorem. There exist $\{a, b\} ⊆ 𝐑^+ - 𝐐^+$ such that $a^b ∈ 𝐐^+$. (There are two irrational numbers $a$, $b$ with $a^b$ rational.)

Here is a non-constructive proof:

Proof. Consider $\sqrt{2}^\sqrt{2}$. If it is rational, we are done. Otherwise, ${\left(\sqrt{2}^\sqrt{2}\right)}^{\sqrt{2}} = \sqrt{2}^2 = 2$ is rational, and we are done.

This proof is classically valid. But imagine we needed these numbers $a$, $b$ for the correctness of some algorithm. Which numbers would we use: $a = b = \sqrt{2}$ or $a = \sqrt{2}$ and $b = \sqrt{2}^{\sqrt{2}}$? Here shows the difficulty of a non-constructive proof: while it makes the claim that the requested numbers $a$ and $b$ exist, it does not give any way to construct those numbers. Thus it is unsuitable for algorithmic purposes.

Here is a constructive proof.

Proof. Let $a = \sqrt{2}$ and $b = \frac{\log{9}}{\log{2}}$. Then $a^b = 3$ which is rational, and we are done.

This proof is constructive because we can immediately use it in any algorithm requesting such numbers $a$ and $b$, as they are constructed in the proof.

Let us try to turn the first proof into an algorithm, so we can better see what goes wrong.

# Compute `a, b` where `a` and `b` are irrationals such that `a^b` is rational.

if (√2) ^ (√2) ∈ 𝐐
    a = √2
    b = √2
else
    a = (√2) ^ (√2)
    b = √2
end

If we try to implement this algorithm, we see immediately what is wrong: implementing the condition (√2) ^ (√2) ∈ 𝐐 is hard. To have such an algorithm that can determine whether (√2) ^ (√2) is rational, we would need to prove that it is rational, or that it is irrational.

Note that in a non-constructive logic, the proof was valid because we could simply assume that a correct decision can be made — though the proof does not tell us how to make it. This is called the Law of the Excluded Middle, an inference rule of non-constructive mathematics:

\[⊢ P ∨ ¬P\]

The Law of the Excluded Middle, from an algorithmic standpoint, is equivalent to total omniscience. If for any proposition $P$ we had an algorithm that determines the truth of $P$, then using this law is valid constructively — as we would simply ask that algorithm which branch to take.

Unfortunately, modern computers do not support omniscience, so the Law of the Excluded Middle cannot be used in constructive proofs.

Logic

There is not a single theory of constructivism; different mathematicians have developed different approaches. Many modern mathematicians subscribe to a theory of constructivism built on intuitionistic logic: in particular, the BHK interpretation.

The ideas behind the BHK interpretation are very simple:

A proof of $a ∧ b$ is a proof of $a$ and a proof of $b$
A proof of $a ∨ b$ is a proof of $a$ or a proof of $b$
A proof of $∀ x P(x)$ is an algorithm that given any $x$, creates a proof of $P(x)$
A proof of $∃ x P(x)$ is a witness $x$ and a proof of $P(x)$
A proof of $a ⇒ b$ is an algorithm that given a proof of $a$ creates a proof of $b$
A proof of $¬a$ is a refutation of $a$

Here, a witness is a value that can be computed by an algorithm that demonstrates constructively the existential proposition.

Note the two major differences from classical logic: in BHK, a proof of a disjunction is a proof of one of the disjuncts. This makes more clear why the Law of the Excluded Middle must be excluded from the BHK interpretation, as taking $⊢ P ∨ ¬P$ as an axiom results in something that purports to be a proof of $P ∨ ¬P$, but is neither a proof of $P$, nor is it a proof of $¬P$. Furthermore, a proof of an existential must explicitly give a witness. Many classical proofs of existentials do not accomplish that.

We should clarify the meaning of the word refutation used above. A refutation of $a$ is an algorithm that given a proof of $a$ constructs a contradiction. There is special syntax commonly used to denote contradictions: $⊥$. Therefore, we can alternatively interpret $¬a$ as $a ⇒ ⊥$.

Limited Omniscience

Although we used the law of the excluded middle in our non-constructive proof above, that does not mean that the statement cannot be proved in constructive mathematics. It merely means that the particular proof we used involved total omniscience. (In fact, we did give a constructive proof of this shortly after.)

Can every statement of classical mathematics be proven constructively, even if the currently known proofs involve invoking some form of omniscience? Well, some statements are themselves forms of omnisicence. (For example, it is hard to envision the law of the excluded middle being provable constructively, as that would imply an algorithm to decide the truth of any logical formula.)

We can measure the degree to which a statement $P$ of classical mathematics is not provable constructively by showing that $P ⇒ Q$ constructively, where $Q$ is some principle that seems to require a certain degree of omniscience. In the most general case, we might show that a statement $P$ implies the full law of the excluded middle. However, it is possible that $P$ implies some other form of omniscience that is weaker than being able to decide the truth of anything.

Limited Principle of Omniscience

\[∀𝐱∈\{0, 1\}^𝐍 (𝐱 = 𝟎 ∨ 𝐱 ≠ 𝟎)\]

Here we must be careful to define the equality and inequality of binary sequences. We define equality to be

\[𝐱 = 𝐲 ⇔ ∀i∈𝐍 (x_i = y_i)\]

and inequality to be

\[𝐱 = 𝐲 ⇔ ∃i∈𝐍 (x_i ≠ y_i)\]

Note that the natural inequality is not the same as a refutation of equality; in fact it is stronger. (We will see this come up again with real numbers.) This kind of inequality is called a “tight apartness”. A simple refutation of equality, $¬(a = b)$, would be called the “denial inequality”.

An informal way to understand this principle is that, given an infinitely tall haystack, either it has no needles or it has a needle. No algorithm can decide this in general, as it would involve scanning through infinitely many 0s. Thus this is a form of omniscience (knowing an infinite number of things), though less powerful than the total omniscience of the Law of the Excluded Middle.

Lesser Limited Principle of Omniscience

\[∀𝐱∈\{0, 1\}^𝐍 (¬∃i∈𝐍,j∈𝐍(i≠j ∧ x_i = y_i = 1) ⇒ (∀i∈𝐍(x_{2i} = 0) ∨ ∀i∈𝐍(x_{2i-1} = 0)))\]

This statement is rather complicated, but can be understood less formally as: given two infinitely tall haystacks, if you know that there is at most one needle among them both, then either the left haystack has no needles or the right haystack has no needles. To decide this would seemingly also require scanning a potentially unbounded amount of time to find the right haystack (since it is possible that neither haystack contains a needle), and hence is also a form of omniscience. However, it is weaker than the LPO.

Markov Principle

Even weaker still is the so-called Markov principle,

\[∀𝐱∈\{0, 1\}^𝐍 (¬(𝐱 = 𝟎) ⇒ 𝐱 ≠ 𝟎)\]

The Markov principle is the statement that in fact the double negative does hold for equality of binary sequences; furthermore it shows that the denial inequality and tight apartness are the same in these cases. It would seem like there does exist an algorithm to find a witness to the inequality of two sequences: simply search through the sequence and wait to find the first 1. This should terminate because since the sequence is not all zeros, there must be a 1! However, note that this reasoning involves the use of a classical logic non-constructive argument: that is, the termination of the algorithm isn’t explicitly shown by providing a witness bounding the amount of time it will take, but shown through a classical-style contradiction argument. Hence the Markov principle only holds in the BHK interpretation if the logic you use to reason about it is a classical metatheory, and is otherwise not part of the BHK interpretation.

Brouwerian Counterexamples

A Brouwerian counterexample for a statement $P$ is a proof that $P ⇒ \mathsf{LEM}$ or $P ⇒ \mathsf{LPO}$ or $P ⇒ \mathsf{LLPO}$ or even $P ⇒ \mathsf{MP}$. Brouwerian counterexamples show that particular statements are not provable constructively, because they imply omniscience principles that are inherently non-constructive.

These are useful to identify what statements from classical logic must be modified to fit within a constructive framework. Indeed, if a statement does not have a Brouwerian counterexample, it may be provable constructively even if the standard classical proof is not constructive — perhaps the constructive one is just a little harder to find. But if a statement does have a Brouwerian counterexample, then it cannot be proven in a constructive framework unless somehow the omniscience principles can be proven.

Construction and Operations on the Real Numbers

We’ll construct a real number as a set of rational intervals such that any two intervals overlap, and there are intervals shorter than any given positive rational. Intuitively, a real number is given by a set of rational interval approximations. Thinking of it from an interface standpoint, we want to be able to get arbitrarily small rational brackets for the real number:

abstract type RealNumber end

"""
    approximate(x::RealNumber, ɛ)

Return a rational interval of length at most ɛ containing x.
"""
function approximate(x::RealNumber, ɛ::Rational)
    ...
end

Here is a definition in the more usual symbology:

Each $𝐱∈𝐑$ is a set of rational intervals $(q, q') ∈ 𝐐^2$ such that:

$q ≤ q'$
$∀ (q, q') ∈ 𝐱(∀ (p, p') ∈ 𝐱(q ≤ p' ∧ p ≤ q'))$
$∀ ɛ∈ 𝐐^+ (∃ (q, q') ∈ 𝐱 (q' - q < ɛ))$

Initially we will use boldface $𝐱$ to denote real numbers. This will help to disambiguate the differences between reals and rationals.

We will need an equivalence relation on this set to recover the reals. We can define it quite naturally: two reals $𝐱$ and $𝐲$ are equivalent if every possible approximation for them agree; that is, every interval in the sets overlaps. More symbolically:

\[𝐱 = 𝐲 := ∀(p, p')∈𝐱 ∀(q, q')∈𝐲 (\max\{p, q\} ≤ \min\{p', q'\})\]

We must be careful with the denial inequality $¬(𝐱 = 𝐲)$: as we have mentioned earlier, if $¬¬(𝐱 = 𝐲) ⇒ (𝐱 = 𝐲)$, then in fact we have the Markov principle. This is an example of a Brouwerian counterexample — our first!

The tight apartness relation, which we will denote as $𝐱 ≠ 𝐲$, does satisfy $¬(𝐱 ≠ 𝐲) ⇒ (𝐱 = 𝐲)$. The definition of this will be that two real numbers are apart if there are intervals that do not overlap. Another way to express that would be $|𝐱 - 𝐲| > 0$, but we cannot define that until later.

We have hinted above at a notion of apartness based on whether intervals do not overlap; we can similarly define an order with the same idea, except directionally:

$𝐱 < 𝐲$ if there are intervals $(p, p') ∈ 𝐱$ and $(q, q') ∈ 𝐲$ that satisfy $p' < q$
$𝐱 ≤ 𝐲$ if all intervals $(p, p') ∈ 𝐱$ and $(q, q') ∈ 𝐲$ satisfy $p ≤ q'$
$𝐱 > 𝐲$ if there are intervals $(p, p') ∈ 𝐱$ and $(q, q') ∈ 𝐲$ that satisfy $p > q'$
$𝐱 ≥ 𝐲$ if all intervals $(p, p') ∈ 𝐱$ and $(q, q') ∈ 𝐲$ satisfy $p' ≥ q$

It is immediate that $𝐱 ≮ 𝐲 ⇔ 𝐱 ≥ 𝐲$ (we need only the decidability of $≥$ over the rationals to prove this fact). However, some classical properties of these orders do not hold. For example $𝐱 ≤ 𝟎 ∨ 𝐱 ≥ 𝟎$ implies $\mathsf{LLPO}$, and $𝐱 ≥ 𝟎 ⇒ (𝐱 > 𝟎 ∨ 𝐱 = 𝟎)$ implies $\mathsf{LPO}$. This also means, with the tight apartness above, that $𝐱 = 𝟎 ∨ 𝐱 ≠ 𝟎$ implies $\mathsf{LPO}$. These are also Brouwerian counterexamples.

We finish with some definitions for convergent and Cauchy sequences, which remind us of the classical definitions, because they are the same:

$(𝐱_n) ∈ 𝐑^𝐍$ converges to $𝐱_∞$ if for all $ɛ > 0$ there exists $N∈𝐍$ after which any term $𝐱_n$ ($n ≥ N$) is within $ɛ$ of $𝐱_∞$
$(𝐱_n) ∈ 𝐑^𝐍$ is Cauchy if for all $ɛ > 0$ there exists $N∈𝐍$ after which any two terms $𝐱_n$, $𝐱_m$ ($n ≥ m > N$) are within $ɛ$ of each other

Completeness of the Real Numbers

We are going to provide sketch proofs of various facts that build up, constructively, the completeness of $𝐑$, as well as the Archimedean property. For full proofs of the following theorems, consult Bridges and Vîţă. We will give a short overall sketch.

(Theorem: Archimedean Property) For each real number $𝐱$ there exists a positive integer $n$ with $|𝐱| < n$. (Bridges and Vîţă Lemma 2.1.15)

The proof of this is straightforward; we take any rational interval $(q, q') ∈ 𝐱$ and find $n$ with $\max\{|q|, |q'|\} < n$. The rest can be done with rational arithmetic. ∎

(Corollary) For each real number $𝐱$ there exists $n∈𝐍$ such that whenever $(q, q') ∈ 𝐱$ and $q' - q < 1$, we have $\max\{|q|, |q'|\} < n$. (Bridges and Vîţă Lemma 2.1.20)

This is quite an easy corollary of the Archimedean Property (take the $n$ from there and add 1), but it is a stronger result that we will use later on. ∎

(Theorem) If $(𝐱_n)_{n∈𝐍} ∈ 𝐑^𝐍$ is Cauchy, then there exists $𝐱_∞∈𝐑$ such that $\lim_{n→∞} 𝐱_n = 𝐱_∞$. (We call this the completeness of the real numbers. Intuitively, it means there are no gaps.) (Bridges and Vîţă Theorem 2.1.21)

For the proof, let $(𝐱_n)_{n∈𝐍}$ be a Cauchy sequence.

Compute an integer function $k \mapsto n_k$ such that each $n_k$ satisfies for all $m\ge n_k$, $n\ge n_k$, $|𝐱_m - 𝐱_n| < 2^{-k}$.

For each $k$, construct a sequence of intervals $(q_k, q'_k) ∈ 𝐱_{n_k}$ such that ``q'_k

q_k < 2^{-k}``.

Consider $r_k = q_k - 2^{-k}$, $r'_k = q'_k + 2^{-k}$. For all $n≥n_k$, we have

\[r_k \le 𝐱_{n_k} - 2^{-k} < 𝐱_n < 𝐱_{n_k} + 2^{-k} ≤ r_k\]

So actually for all $j ≥ k$:

\[𝐱_{n_j} ∈ [r_j, r'_j] ∩ [r_k, r'_k]\]

So the real number $𝐱_∞$ defined by $\{(r_k, r'_k) : k \ge 1\}$ is in fact a real number.

From our construction, we have $𝐱_n \in [r_k, r'_k]$ for all $n \ge n_k$. Then

\[\forall k \forall n≥n_k (|𝐱_n - 𝐱_∞| ≤ r'_k - r_k < 2^{-k+2})\]

Thus indeed $(𝐱_n)_{n∈𝐍}$ converges to $𝐱_∞$ as required. ∎