SLIDE 1
Program extraction from constructive proofs
Helmut Schwichtenberg
Mathematisches Institut der Universität München
SLIDE 2
The foundational crisis
Some basic facts from mathematical logic
Undefinability of truth
Gödel’s incompleteness theorems
Has Hilbert’s programme failed?
SLIDE 3 The foundational crisis
Antinomies ∼ 1900, e.g. Russell’s: Let x_0 := { x | x ∉ x }. Then x_0 ∈ x_0 ⇔ x_0 ∉ x_0.
Zermelo 1904: Proof that R can be well-ordered, using AC.
Hilbert’s programme ∼ 1920: show that the use of ideal objects in proofs of theorems with a concrete meaning can be eliminated (example: Nullstellensatz), so that only “finitistic” methods are used.
Gödel 1931: his second incompleteness theorem showed that this is impossible.
SLIDE 4
Formal languages
Here: on natural numbers.
Variables: x, y, z
Function symbols: +, ∗, S, 0
Terms: x | 0 | r + s | r ∗ s | S(r)
Numerals are special terms: for a ∈ N let \underline{a} be defined by \underline{0} := 0, \underline{n+1} := S(\underline{n}).
Formulas: r = s | A ∧ B | A ∨ B | A → B | ¬A | ∀xA | ∃xA.
Closed formula (sentence): formula without free variables.
SLIDE 5
Examples
x < y := ∃z(z ≠ 0 ∧ x + z = y)
y | x := ∃z(y ∗ z = x)
x prime number := 1 < x ∧ ∀y(y | x → y = 1 ∨ y = x)
There are infinitely many primes: ∀x∃y(x < y ∧ y prime)
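As an illustration that is not part of the original slides, the defined formulas can be evaluated directly over the natural numbers. The sketch below assumes that every quantifier may be bounded by the arguments (a witness z for x + z = y or y ∗ z = x never exceeds the right-hand side when x > 0); the names `less`, `divides`, `is_prime` and `next_prime` are hypothetical.

```python
def less(x, y):
    # x < y := ∃z (z ≠ 0 ∧ x + z = y); any witness z is at most y
    return any(z != 0 and x + z == y for z in range(y + 1))

def divides(y, x):
    # y | x := ∃z (y ∗ z = x); any witness z is at most x
    return any(y * z == x for z in range(x + 1))

def is_prime(x):
    # 1 < x ∧ ∀y (y | x → y = 1 ∨ y = x); divisors of x > 0 are at most x
    return less(1, x) and all(
        (not divides(y, x)) or y == 1 or y == x for y in range(x + 1)
    )

def next_prime(x):
    # Search for a witness y of ∃y (x < y ∧ y prime); terminates by Euclid's theorem
    y = x + 1
    while not is_prime(y):
        y += 1
    return y
```

The unbounded search in `next_prime` is exactly the kind of program one hopes to read off from a constructive proof of ∀x∃y(x < y ∧ y prime).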
SLIDE 6
Semantics
Let M = (|M|, 0^M, S^M) be a structure for the language. Here: |M| = N, 0^M = 0, S^M(a) = a + 1.
Notion of truth for M: Th(M) := { A | A closed formula such that M ⊨ A }
R ⊆ N definable: there is A_R(z) such that R = { a ∈ N | M ⊨ A_R(\underline{a}) }
R ⊆ N^k definable: similar
SLIDE 7 Undefinability of truth
Enumeration of formulas: A ↦ ⌜A⌝
Theorem (Tarski). Th(M) := { A | A closed formula such that M ⊨ A } is undefinable.
Fixed point lemma. For B(z) one can find a closed formula A such that M ⊨ A iff M ⊨ B(⌜A⌝).
SLIDE 8
Proof of Tarski’s theorem
Assumption: Th(M) is definable, say by B_W(z). Then for all closed formulas A: M ⊨ A iff M ⊨ B_W(⌜A⌝).
Consider the formula ¬B_W(z). By the fixed point lemma we have a closed formula A such that M ⊨ A iff M ⊨ ¬B_W(⌜A⌝). Contradiction.
SLIDE 9 Decidability, Enumerability
M ⊆ N decidable: there is an algorithm that terminates on every input a and determines whether or not a ∈ M. Easy: M decidable ⇒ M definable.
Corollary. Th(M) is undecidable.
M ⊆ N enumerable: there is an algorithm that terminates on input a iff a ∈ M. Easy: M enumerable ⇒ M definable.
Corollary. Th(M) is not enumerable.
SLIDE 10
Formal proofs
Truth → Derivability in a formal theory T.
Axioms: e.g. A(0) ∧ ∀x(A(x) → A(S(x))) → ∀xA(x)
Rules: e.g. modus ponens.
Assumptions on T:
T axiomatized, i.e. Bew_T(n, m) decidable.
T consistent.
T proves the axioms of Robinson’s Q.
Goal: T is incomplete.
SLIDE 11
Robinson’s Q
S(x) ≠ 0,
S(x) = S(y) → x = y,
x + 0 = x,
x + S(y) = S(x + y),
x · 0 = 0,
x · S(y) = x · y + x,
∃z (x + S(z) = y) ∨ x = y ∨ ∃z (y + S(z) = x).
SLIDE 12 Incompleteness
Theorem (Gödel).
One can find a closed formula A such that ⊬_T A and ⊬_T ¬A.
Proof. Auxiliary claim: every decidable relation R is “representable” in T, by a formula B_R(\vec{x}).
Syntactic fixed point lemma. For B(z) one can find a closed formula A such that ⊢_T A ↔ B(⌜A⌝).
Bew_T(n, m) decidable ⇒ Wdl_T(n, m) decidable.
SLIDE 13 Proof of the incompleteness theorem
T ⊢ ∀x (x < \underline{n} → x = \underline{0} ∨ ··· ∨ x = \underline{n−1}),
T ⊢ ∀x (x = \underline{0} ∨ ··· ∨ x = \underline{n} ∨ \underline{n} < x).
Let B_{Bew_T}(x_1, x_2) and B_{Wdl_T}(x_1, x_2) be formulas representing Bew_T and Wdl_T. By the (syntactic) fixed point lemma we have a closed formula A such that
T ⊢ A ↔ ∀x (B_{Bew_T}(x, ⌜A⌝) → ∃y(y < x ∧ B_{Wdl_T}(y, ⌜A⌝))).
A expresses its own underivability: “For every proof of me there is a shorter proof of my negation”.
One can show (∗) T ⊬ A and (∗∗) T ⊬ ¬A.
SLIDE 14 Gödel’s second incompleteness theorem
provides an interesting alternative to the Gödel […]: a formula Con_T expressing the consistency of T.
Lemma (Σ_1-completeness of Q). Let A(x_1, …, x_n) be a Σ_1-formula true for a_1, …, a_n. Then Q ⊢ A(\underline{a_1}, …, \underline{a_n}).
Lemma (Formalized Σ_1-Completeness). In an appropriate theory T of arithmetic with induction, we can formally prove for any Σ_1-formula A
A(\vec{x}) → ∃p Bew_T(p, ⌜A(\dot{\vec{x}})⌝).
SLIDE 15 Gödel’s second incompleteness theorem (continued)
Let T ⊇ Q be an axiomatized consistent theory, with “enough” induction to formalize Σ_1-completeness. Define
Thm_T(x) := ∃y Bew_T(y, x), Con_T := ¬∃y Bew_T(y, ⌜⊥⌝), □A := Thm_T(⌜A⌝).
Derivability conditions for T (Hilbert-Bernays):
T ⊢ A → □A (A closed Σ_1-formula),
T ⊢ □(A → B) → □A → □B.
Theorem (Gödel). Let T be as above, satisfying the derivability conditions. Then T ⊬ Con_T.
SLIDE 16 Has Hilbert’s programme failed?
No. There are directly justifiable and constructively acceptable proof methods which go beyond a given theory T, that is, are not formalizable in T.
Example (Gentzen): transfinite induction up to ε_0 and Peano arithmetic.
Kreisel’s question. What more do we know if we have proved a theorem with restricted means, rather than only knowing that it is true?
SLIDE 17
2. Program extraction from constructive proofs
Classical versus constructive proofs. Kreisel’s counterexample
Proof terms
The type of a formula
Computational content of a proof
Realizability, soundness
SLIDE 18 Example of a non-constructive proof
Lemma
There are irrational numbers a, b such that a^b is rational.
Proof.
Case (√2)^(√2) rational. Let a = √2 and b = √2. Then a, b are irrational, and by assumption a^b is rational.
Case (√2)^(√2) irrational. Let a = (√2)^(√2) and b = √2. Then by assumption a, b are irrational, and a^b = ((√2)^(√2))^(√2) = (√2)^2 = 2 is rational.
SLIDE 19 Kreisel’s counterexample
Define the classical existential quantifier by ∃^cl x A := ¬∀x¬A. We show: ⊢ ∀x∃^cl y A generally does not yield a program to compute y from x. Consider
⊢ ∀x∃^cl y (T^¬(x, y) → ∀z T^¬(x, z)).
Let T^¬(x, y) mean: y is not the number of a terminating computation of the Turing machine with number x, on input x.
Lemma. There is no computable f satisfying T^¬(x, f(x)) → ∀z T^¬(x, z).
Proof. Otherwise T^¬(x, f(x)) ↔ ∀z T^¬(x, z), contradicting Church’s theorem (∀z T^¬(x, z) is undecidable).
SLIDE 20
Programs from constructive proofs
Constructive logic = classical logic + ∃.
Undecidable, whether a program meets its specification.
Formal proof: Correctness can be checked easily.
proof = program with sufficiently many comments (more precisely: a program can be extracted).
Vision: Use mathematical culture to organize complex structures, for the purpose of program extraction.
SLIDE 21
Proof terms: assumption variables, conjunction ∧
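Derivation: assumption u : A. Proof term: u^A.
Derivation: from M : A and N : B infer A ∧ B by ∧+. Proof term: ⟨M^A, N^B⟩^{A∧B}.
Derivation: from M : A ∧ B infer A by ∧−_0. Proof term: (M^{A∧B} 0)^A.
Derivation: from M : A ∧ B infer B by ∧−_1. Proof term: (M^{A∧B} 1)^B.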
SLIDE 22
Proof terms for →
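Derivation: from M : B, with the assumption [u : A] discharged, infer A → B by →+ u. Proof term: (λu^A M^B)^{A→B}.
Derivation: from M : A → B and N : A infer B by →−. Proof term: (M^{A→B} N^A)^B.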
SLIDE 23
Proof terms for ∀
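Derivation: from M : A infer ∀xA by ∀+ x (VarC). Proof term: (λx M^A)^{∀xA} (VarC).
Derivation: from M : ∀xA and a term t infer A_x[t] by ∀−. Proof term: (M^{∀xA} t)^{A_x[t]}.
Axioms for ∃:
∃+_{x,A} : ∀x.A → ∃xA
∃−_{x,A,B} : ∃xA → (∀x.A → B) → B (x ∉ FV(B))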
SLIDE 24 The type of a formula
Kolmogorov: Formulas = problems. Example ∀x∃y(x < y ∧ y prime)
Formulas: P(\vec{r}) | A ∧ B | A → B | ∀x^ρ A | ∃x^ρ A.
τ(A) := type of the program to be extracted from a proof of A, or := ε if proofs of A have no “computational content” (example: ∀n f(n) = 0).
τ(P(\vec{r})) := ε (P a predicate constant)
τ(∃x^ρ A) := ρ if τ(A) = ε, and ρ × τ(A) otherwise
SLIDE 25 The type of a formula (ctd.)
τ(∀x^ρ A) := ε if τ(A) = ε, and ρ ⇒ τ(A) otherwise
τ(A_0 ∧ A_1) := τ(A_i) if τ(A_{1−i}) = ε, and τ(A_0) × τ(A_1) otherwise
τ(A → B) := τ(B) if τ(A) = ε; ε if τ(B) = ε; τ(A) ⇒ τ(B) otherwise
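The clauses for τ can be read as a recursive function on formulas. The following Python sketch is an illustration only: the formula classes (`Atom`, `And`, `Imp`, `Forall`, `Exists`), the marker `EPS` for ε, and the tuple encoding of product and arrow types are assumptions made for this example and do not come from the slides or from any proof assistant.

```python
from dataclasses import dataclass

EPS = "ε"  # hypothetical marker: "no computational content"

@dataclass(frozen=True)
class Atom:            # P(r...), P a predicate constant
    name: str

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Imp:
    left: object
    right: object

@dataclass(frozen=True)
class Forall:
    rho: str           # type of the bound variable
    body: object

@dataclass(frozen=True)
class Exists:
    rho: str
    body: object

def tau(f):
    """Type of the program extracted from a proof of f, or EPS."""
    if isinstance(f, Atom):
        return EPS
    if isinstance(f, Exists):
        t = tau(f.body)
        return f.rho if t == EPS else ("×", f.rho, t)
    if isinstance(f, Forall):
        t = tau(f.body)
        return EPS if t == EPS else ("⇒", f.rho, t)
    if isinstance(f, And):
        l, r = tau(f.left), tau(f.right)
        if r == EPS:
            return l
        if l == EPS:
            return r
        return ("×", l, r)
    if isinstance(f, Imp):
        a, b = tau(f.left), tau(f.right)
        if b == EPS:
            return EPS
        return b if a == EPS else ("⇒", a, b)
    raise TypeError(f"not a formula: {f!r}")
```

For the example ∀x∃y(x < y ∧ y prime) with both variables of type nat, the computed type is nat ⇒ nat: the extracted program maps x to a prime above it.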
SLIDE 26 Computational content of a proof
[[M]] : τ(A), for M : A a derivation (natural deduction style, written as a λ-term), and τ(A) ≠ ε.
[[u^A]] := x_u^{τ(A)} (x_u^{τ(A)} uniquely associated with u^A)
[[λu^A M]] := [[M]] if τ(A) = ε, and λx_u^{τ(A)} [[M]] otherwise
[[M^{A→B} N]] := [[M]] if τ(A) = ε, and [[M]] [[N]] otherwise
SLIDE 27 Computational content of a proof (ctd.)
[[⟨M_0^{A_0}, M_1^{A_1}⟩]] := [[M_i]] if τ(A_{1−i}) = ε, and ⟨[[M_0]], [[M_1]]⟩ otherwise
[[M^{A_0∧A_1} i]] := [[M]] if τ(A_{1−i}) = ε, and [[M]] i otherwise
[[(λx^ρ M)^{∀xA}]] := λx^ρ [[M]]
[[M^{∀xA} r]] := [[M]] r.
Also: extracted terms for induction, cases, ∃-axioms.
For M : A where τ(A) = ε let [[M]] := ε (new symbol).
SLIDE 28 Definition of realizability
Every constructive existence proof contains an algorithm. A realizability interpretation (Kleene, Kreisel, Troelstra) makes it explicit.
r mr A, where r is a term of type τ(A) (or r = ε).
ε mr P(\vec{r}) = P(\vec{r}),
r mr (∃xA) = ε mr A_x[r] if τ(A) = ε, and r1 mr A_x[r0] otherwise
SLIDE 29 Definition of realizability (ctd.)
r mr (∀xA) = ∀x. ε mr A if τ(A) = ε, and ∀x. rx mr A otherwise
r mr (A → B) = ε mr A → r mr B if τ(A) = ε; ∀x. x mr A → ε mr B if τ(A) ≠ ε = τ(B); ∀x. x mr A → rx mr B otherwise
r mr (A_0 ∧ A_1) = ε mr A_0 ∧ r mr A_1 if τ(A_0) = ε; r mr A_0 ∧ ε mr A_1 if τ(A_1) = ε; r0 mr A_0 ∧ r1 mr A_1 otherwise
SLIDE 30
Soundness
Let x_u := ε if u^A is an assumption variable with τ(A) = ε.
Theorem
If M is a derivation of a formula B, then there is a derivation µ(M) of [[M]] mr B from assumptions { x_u mr C | u^C ∈ FA(M) }.
Proof.
Induction on M.
SLIDE 31 Objections
◮ An idea of the algorithm must be present before a constructive proof is carried out.
(Correct, but sometimes it is hidden, as in the example to follow. Moreover, (a) programs are correct by construction, and (b) program development by proof transformation becomes possible.)
◮ Complexity of extracted programs.
(If the means of proofs are restricted properly, one obtains exactly the polynomial time computable functions).
◮ Classical proofs usable?
(Yes, but one needs to have a closer look).
SLIDE 32
3. Normalization of lambda terms
Example: Tait’s proof of the existence of normal forms in typed λ-calculus. Ulrich Berger (1993) observed that its computational content is “normalization by evaluation”. Recently Tait’s proof has been used as a case study for program extraction (Berger, Berghofer, Letouzey, S. 2005).
SLIDE 33
β-conversion, η-expansion
(λx r)s → r_x[s] β-conversion
Example
β-conversion for type free λ-terms does not terminate: (λx. xx)(λx. xx) → (λx. xx)(λx. xx)
Definition
r is in β-normal form if no (inner) β-conversion is possible.
r → λx. rx η-expansion (x ∉ FV(r))
Definition
Let r be in β-normal form. r is in η-long normal form if no (inner) η-expansion is possible without creating a new β-convertible subterm.
SLIDE 34
Long normal forms
Terms in long normal form (i.e. normal w.r.t. β-conversion and η-expansion) are inductively defined by λx r | (x r_1 … r_n)^ι.
Definition
Let r be in β-normal form. Let lnf(r) denote the result of maximally η-expanding r.
Example
Let x, y : ι and f : ι → ι and G : (ι → ι) → ι → ι.
f →η λx. fx,
G →η λf. Gf →η λf. G(λx. fx) →η λf λy. G(λx. fx)y = lnf(G).
SLIDE 35
Computability predicates, existence of normal forms
N(r, s) :⇔ r → ··· → t for some β-normal t, and lnf(t) = s
A(r, s) :⇔ r = x r_1 … r_n and s = x s_1 … s_n, with N(r_i, s_i)
H(r, s) :⇔ r = (λx.t) u \vec{t} and s = t_x[u] \vec{t}
F(r, k) :⇔ every index of a variable free in r is < k
Let FN(r) := ∀k. F(r, k) → ∃s N(r, s), similarly FA(r). Let
C^ι(r) := FN^ι(r), (has computational content!)
C^{ρ⇒σ}(r) := ∀^{nc} s. C^ρ(s) → C^σ(rs).
View C(ρ, r) := C^ρ(r) as binary relation. Its defining formula depends on ρ, hence τ(C) = ω. (Cω := Cρ).
SLIDE 36 Main Lemmas, Normalization Theorem
Lemma 1. (a) C^ρ(r) → FN^ρ(r), (b) FA^ρ(r) → C^ρ(r).
Lemma 2. C^ρ(r′) → H(r, r′) → C^ρ(r).
Lemma 3. C^{\vec{ρ}}(\vec{s}) → C^ρ(r[\vec{s}]).
SLIDE 37
Extracted term: lemma 1
(Rec type=>(omega=>nat=>term)@@((nat=>term)=>omega)) (ModIota@([g3]OmegaInIota(cACL g3))) ([rho3,rho4,p5,p6] ([a7,n8] Abs rho3 (Sub(left p6(Mod a7(right p5([n9]Var n8)))(Succ n8)) ((Var map Seq 1 n8):+:(Var 0):)))@ ([g7] Hat rho3 rho4 ((cAC omega omega) ([a9] (cUNC omega) ((cUNC omega)((cIP omega) (right p6([n10]g7 n10(left p5 a9 n10)))))))))
SLIDE 38 Lemma 1 ∼ reify & reflect
Disregarding administrative functions and translating via
rho4 ↦ ρ, rho5 ↦ σ, left p5 ↦ ↓_ρ, right p5 ↦ ↑_ρ, left p6 ↦ ↓_σ, right p6 ↦ ↑_σ
gives
↓_ρ : Cω → (N → Λ) (“reify”)
↑_ρ : (N → Λ) → Cω (“reflect”),
with the recursion equations
↓_ι(r) := r,
↑_ι(r) := r,
↓_{ρ⇒σ}(a)(k) := λx_k^ρ. ↓_σ(a(↑_ρ(x_k)))(k+1),
↑_{ρ⇒σ}(r)(b) := ↑_σ(r ↓_ρ(b)).
SLIDE 39
Extracted term: lemma 3
(Rec term=>list type=>list omega=>omega) ([n3,rhos4](ListRef omega)n3) ([r3,r4,q5,q6,rhos7,as8] Mod(q5 rhos7 as8)(q6 rhos7 as8)) ([rho3,r4,q5,rhos6,as7] Hat rho3(Typ(rho3::rhos6)r4) ((cAC omega omega) ([a9](cUNC omega)((cUNC omega)((cIP omega) (q5(rho3::rhos6)(a9::as7)))))))
SLIDE 40 Lemma 3 ∼ evaluation
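For cLemmaThree(r, \vec{ρ}, \vec{a}) write [[r]]_{(x_0^{ρ_0} ↦ a_0, …, x_{k−1}^{ρ_{k−1}} ↦ a_{k−1})} with k := Lh(\vec{ρ}), abbreviated [[r]]_{(\vec{x} ↦ \vec{a})}. Disregarding administrative functions gives
[[x_i]]_{(\vec{x} ↦ \vec{a})} = a_i
[[rs]]_{(\vec{x} ↦ \vec{a})} = [[r]]_{(\vec{x} ↦ \vec{a})} [[s]]_{(\vec{x} ↦ \vec{a})}
[[λx_k^ρ r]]_{(\vec{x} ↦ \vec{a})}(b) = [[r]]_{(\vec{x}, x_k^ρ ↦ \vec{a}, b)}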
SLIDE 41
NThm ∼ normalization by evaluation
Extracted term: NThm: [rhos0,r1] left(cLemmaOne(Typ rhos0 r1)) (cLemmaThree r1 rhos0(cSCrsSeq rhos0 0)) Lh rhos0
Theorem (Berger, Berghofer, Letouzey, S. 2005)
Let ↑ denote the variable assignment x_k^ρ ↦ ↑_ρ(x_k). Then cNThm(\vec{ρ}, r) computes the long normal form of r as ↓_ρ([[r]]_↑)(k) with k = Lh(\vec{ρ}). This is “normalization by evaluation”.
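To make the shape of the algorithm concrete, here is a minimal Python sketch of normalization by evaluation for simply typed terms. It is not the extracted Minlog term: the tuple encoding of terms and types and the names `reify`, `reflect`, `evaluate`, `typ` and `nbe` are assumptions for illustration. Variables are named by indices, semantic values at base type are term families k ↦ term, and values at arrow types are Python functions.

```python
I = "i"  # base type ι; arrow types are tuples ("->", rho, sigma)

def var(k):
    return ("var", k)

def typ(ctx, t):
    """Type of term t in context ctx (dict: variable index -> type)."""
    if t[0] == "var":
        return ctx[t[1]]
    if t[0] == "app":
        return typ(ctx, t[1])[2]
    _, k, rho, body = t                      # ("lam", k, rho, body)
    return ("->", rho, typ({**ctx, k: rho}, body))

def reify(rho, a):
    """Semantic value of type rho -> term family (k -> term)."""
    if rho == I:
        return a
    _, dom, cod = rho
    def family(k):
        x_k = reflect(dom, lambda _: var(k))     # fresh variable x_k
        return ("lam", k, dom, reify(cod, a(x_k))(k + 1))
    return family

def reflect(rho, r):
    """Term family r of type rho -> semantic value."""
    if rho == I:
        return r
    _, dom, cod = rho
    return lambda b: reflect(cod, lambda k: ("app", r(k), reify(dom, b)(k)))

def evaluate(t, env):
    """Evaluation of t under a variable assignment env."""
    if t[0] == "var":
        return env[t[1]]
    if t[0] == "app":
        return evaluate(t[1], env)(evaluate(t[2], env))
    _, k, _rho, body = t
    return lambda b: evaluate(body, {**env, k: b})

def nbe(rhos, r):
    """Long normal form of r, whose free variables x_0..x_{k-1} have types rhos."""
    ctx = dict(enumerate(rhos))
    env = {i: reflect(rho, lambda _, i=i: var(i)) for i, rho in ctx.items()}
    return reify(typ(ctx, r), evaluate(r, env))(len(rhos))
```

Applied to a free variable G : (ι → ι) → ι → ι, `nbe` returns the encoding of λf λy. G(λx. fx)y, in agreement with the lnf(G) example on the slide about long normal forms.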
SLIDE 42
4. Feasible computation with higher types
The extended Grzegorczyk hierarchy F_α, α < ε_0.
The power of higher types: iteration functionals
Recursive definitions and higher types
Functions definable in LT
SLIDE 43 The extended Grzegorczyk hierarchy F_α, α < ε_0
(Grzegorczyk 1953, Robbin 1965, Löb and Wainer 1970, S. 1971)
F_0(x) := 2^x,
F_{α+1}(x) := F_α^{(x)}(x) (F_α^{(x)} the x-th iterate of F_α),
F_λ(x) := F_{λ[x]}(x).
Fundamental sequence λ[x] chosen canonically for limit numbers λ. F_{ε_0} grows faster than all functions definable in Peano arithmetic.
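The successor clause can be tried out for finite levels. The Python sketch below is illustrative only and rests on one assumption stated loudly: it reads the base function F_0(x) as 2^x (the flattened source text shows "2x", so the base is kept as a parameter). Limit levels and fundamental sequences are not modelled; the names `iterate` and `F` are hypothetical.

```python
def iterate(f, n):
    """Return the n-th iterate f^(n) of a unary function f."""
    def iterated(x):
        for _ in range(n):
            x = f(x)
        return x
    return iterated

def F(level, base=lambda x: 2 ** x):
    """F_0 := base (assumed 2^x); F_{n+1}(x) := F_n^{(x)}(x). Finite levels only."""
    if level == 0:
        return base
    below = F(level - 1, base)
    return lambda x: iterate(below, x)(x)
```

Even at level 1 the values explode: under this reading F_1(3) is 2^256, so only tiny arguments are practical.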
SLIDE 44 The power of higher types: iteration functionals
Integer types: ρ_0 := nat, ρ_{n+1} := ρ_n → ρ_n. Define F_α^{n+1} by
F_0^{n+1}(x_n, …, x_0) := F_0(x_0) if n = 0, and x_n^{(x_0)}(x_{n−1}, …, x_0) otherwise,
F_{α+1}^{n+1}(x_n, …, x_0) := (F_α^{n+1})^{(x_0)}(x_n, …, x_0),
F_λ^{n+1}(x_n, …, x_0) := F_{λ[x_0]}^{n+1}(x_n, …, x_0).
F_α^{n+1}(F_β^n) = F_{β+ω^α}^n if β + ω^α = β # ω^α (α, β < ε_0).
Hence each F_α, α < ε_0, can be built from the F_0^{n+1} (essentially iteration functionals) by application alone.
Note: this representation of the F_α, α < ε_0, does not need λ[x].
SLIDE 45 Recursive definitions and higher types
Gödel (1958), “Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes”: finitely typed λ-terms with structural recursion.
LT (Bellantoni, Niggl, S. 2000, 2002): restriction such that the definable functions are exactly the polynomial time computable ones.
SLIDE 46 Types
ρ, σ ::= U | B | L(ρ) | ρ ⊸ σ | ρ → σ | ρ ⊗ σ | ρ × σ.
The level of a type is defined by
l(U) := l(B) := 0
l(L(ρ)) := l(ρ)
l(ρ ⊸ σ) := l(ρ → σ) := max{l(σ), 1 + l(ρ)}
l(ρ ⊗ σ) := l(ρ × σ) := max{l(ρ), l(σ)}
Ground types are the types of level 0, and a higher type is any type of level at least 1. The →-free types are called linear types. In particular, each ground type is linear.
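The level function and the notion of a linear type translate directly into recursive functions. In this illustrative Python sketch the type encoding is an assumption: "U" and "B" are strings, and compound types are tuples tagged "L", "lin" (for ⊸), "arrow" (for →), "tensor" (for ⊗) and "times" (for ×).

```python
def level(t):
    """l(t) following the defining equations for the level of a type."""
    if t in ("U", "B"):
        return 0
    tag = t[0]
    if tag == "L":
        return level(t[1])
    if tag in ("lin", "arrow"):
        return max(level(t[2]), 1 + level(t[1]))
    if tag in ("tensor", "times"):
        return max(level(t[1]), level(t[2]))
    raise ValueError(f"unknown type: {t!r}")

def is_ground(t):
    return level(t) == 0

def is_linear(t):
    """A type is linear if it contains no arrow →."""
    if isinstance(t, str):
        return True
    if t[0] == "arrow":
        return False
    return all(is_linear(s) for s in t[1:])
```

With W := L(B), the type W → W ⊸ W has level 1 and is not linear, while W ⊸ W is a linear higher type.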
SLIDE 47
Constants
ε : U
tt, ff : B
nil_ρ : L(ρ)
cons_ρ : ρ ⊸ L(ρ) ⊸ L(ρ)
if_τ : B ⊸ τ × τ ⊸ τ (τ linear)
R_τ^ρ : L(ρ) → (ρ → L(ρ) → τ ⊸ τ) → τ ⊸ τ (ρ ground, τ linear)
SLIDE 48
Constants (ctd.)
For linear ρ, σ, τ:
⊗+_{ρσ} : ρ ⊸ σ ⊸ ρ ⊗ σ
⊗−_{ρστ} : ρ ⊗ σ ⊸ (ρ ⊸ σ ⊸ τ) ⊸ τ
×+_{ρσ} : ρ ⊸ σ ⊸ ρ × σ (if ρ, σ ground)
×+_{ρστ} : (τ ⊸ ρ) ⊸ (τ ⊸ σ) ⊸ τ ⊸ ρ × σ (if l(ρ × σ) > 0)
fst_{ρσ} : ρ × σ ⊸ ρ
snd_{ρσ} : ρ × σ ⊸ σ
SLIDE 49 LT-terms
are built from constants and typed variables x^σ (incomplete) and x̄^σ (complete) by introduction and elimination rules for the two type forms ρ ⊸ σ and ρ → σ, i.e.
c^ρ (constant) | x^ρ (incomplete variable) | x̄^ρ (complete variable)
| (λx^ρ r^σ)^{ρ⊸σ} | (r^{ρ⊸σ} s^ρ)^σ with higher type incomplete variables in r, s distinct
| (λx̄^ρ r^σ)^{ρ→σ} | (r^{ρ→σ} s^ρ)^σ with s complete
A term s is complete if all of its free variables are complete, else incomplete. A term is linear or ground according as its type is.
SLIDE 50
Conversions
(λx r)s → r_x[s] β-conversion; similar for x̄
if_τ tt s → fst_{ττ} s
if_τ ff s → snd_{ττ} s
R_τ^ρ nil_ρ s t → t
R_τ^ρ (cons_ρ r l) s t → s r l (R_τ^ρ l s t)
⊗−_{ρστ}(⊗+_{ρσ} r s) t → t r s
fst_{ρσ}(×+_{ρσ} r s) → r
snd_{ρσ}(×+_{ρσ} r s) → s
fst_{ρσ}(×+_{ρστ} r s t) → r t
snd_{ρσ}(×+_{ρστ} r s t) → s t
Projections w.r.t. ρ ⊗ σ can be defined easily: t0 := ⊗−_{ρσρ} t (λx^ρ λy^σ x) and t1 := ⊗−_{ρσσ} t (λx^ρ λy^σ y).
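The conversion rules for the recursion operator R and for if can be mimicked on ordinary Python lists and pairs. This is only a sketch of the computation rules: the linear/complete typing discipline of LT, which is what bounds the complexity, is not enforced here, and the names `R`, `if_`, `length` and `append` are hypothetical.

```python
def R(lst, step, base):
    """R nil s t -> t;  R (cons r l) s t -> s r l (R l s t)."""
    if not lst:
        return base
    head, tail = lst[0], lst[1:]
    return step(head, tail, R(tail, step, base))

def if_(b, pair):
    """if tt s -> fst s;  if ff s -> snd s."""
    return pair[0] if b else pair[1]

def length(lst):
    # The step function ignores head and tail and counts the recursive calls
    return R(lst, lambda r, l, prev: prev + 1, 0)

def append(xs, ys):
    # The previous value is used exactly once, as the linear type τ ⊸ τ suggests
    return R(xs, lambda r, l, prev: [r] + prev, ys)
```

In both examples the step function uses the previous value at most once, which matches the type τ ⊸ τ of the recursion step.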
SLIDE 51 Computation in LT
Lemma (Sharing Normalization)
Let t be an R-free term whose higher-type variables are incomplete. Then a parse dag for nf(t), of size at most |t|, can be computed from t in time O(|t|^2).
Proof.
Case fst_{ρσ}(×+_{ρσ} r s) → r with ρ × σ a ground type.
[Figure: parse dags before and after the conversion.]
The other cases are similar.
SLIDE 52
Numerals
Terms of the form cons_ρ r_1^ρ (cons_ρ r_2^ρ … (cons_ρ r_n^ρ nil_ρ) …) are lists.
Abbreviations for N := L(U) and W := L(B):
0 := nil_U, S := λl^N. cons_U ε l
1 := nil_B, S_0 := λl^W. cons_B ff l, S_1 := λl^W. cons_B tt l
Particular lists are S(…(S0)…) and S_{i_1}(…(S_{i_n} 1)…). The former are called unary numerals, and the latter binary numerals.
SLIDE 53
Polynomials
⊕: W → W ⊸ W. x ⊕ y concatenates |x| bits onto y.
1 ⊕ y = S_0 y
(S_i x) ⊕ y = S_0(x ⊕ y)
x̄ ⊕ y := R_{W⊸W} x̄ (λz̄ λl̄ λp^{W⊸W} λy. S_0(p y)) S_0.
⊙: W → W → W. x ⊙ y has output length |x| · |y|.
x ⊙ 1 = x
x ⊙ (S_i y) = x ⊕ (x ⊙ y)
x̄ ⊙ ȳ := R_W ȳ (λz̄ λl̄ λp^W. x̄ ⊕ p) x̄.
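The two definitions can be checked on small inputs. In this Python sketch, binary numerals are modelled as lists of bits with the empty list standing for 1 = nil_B, so the length |x| is taken to be the number of list entries plus one; that reading of |x|, and the names `R`, `S0`, `size`, `oplus` and `otimes`, are assumptions for illustration. The recursion operator follows the conversion rules R nil s t → t and R (cons r l) s t → s r l (R l s t).

```python
def R(lst, step, base):
    """Structural recursion on lists, following the conversion rules for R."""
    if not lst:
        return base
    return step(lst[0], lst[1:], R(lst[1:], step, base))

def S0(x):
    """S_0 := cons ff, here: prepend the bit 0."""
    return [0] + x

def size(x):
    """Assumed length |x| of a binary numeral: the numeral 1 (empty list) has length 1."""
    return len(x) + 1

def oplus(x, y):
    # x ⊕ y := R x (λz λl λp λy. S_0(p y)) S_0, then applied to y
    return R(x, lambda z, l, p: (lambda y: S0(p(y))), S0)(y)

def otimes(x, y):
    # x ⊙ y := R y (λz λl λp. x ⊕ p) x
    return R(y, lambda z, l, p: oplus(x, p), x)
```

On these encodings `size(oplus(x, y))` equals `size(x) + size(y)` and `size(otimes(x, y))` equals `size(x) * size(y)`, which is how polynomial length bounds are built up from ⊕ and ⊙.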
SLIDE 54
Functions definable in LT
Theorem (Normalization)
Let r be a closed LT-term of type W ։ … W ։ W (։ ∈ {→, ⊸}). Then r denotes a polytime function.
Proof.
Uses a model of computation via parse dags; ground type terms can be shared. Details in Bellantoni & S. 2002.
Theorem (Sufficiency)
Let f be a polytime function. Then f is denoted by a closed LT-term t.
Proof.
Induction on the definition of f(x_1, …, x_k; y_1, …, y_l) in Bellantoni and Cook’s B, associating to f a closed term t_f of type W^{(k)} → W^{(l)} ⊸ W, such that t_f denotes f.
SLIDE 55 Conclusion, future work
◮ Program extraction from proofs not only gives certified code
(“no logical errors”), but (in case of clever proofs) can even give unexpected algorithms.
◮ Using the representation of proofs by lambda-terms
(“Curry-Howard correspondence”) one can solve (S. 2005)
Heyting Arithmetic : Gödel’s T = ? : LT.
Usability of this “linear arithmetic” is still to be explored.
◮ Analyzing computability in finite types over the Scott-Ershov
partial continuous functionals can be indispensable for an appropriate formulation. From a program extraction point of view, type theory with approximations needs to be developed.