Learning Lean 4 as a programming language 5 – A different style of proofs
I haven’t posted a new article since my previous post on Fibonacci numbers. I found the idea of proving the correctness of my function highly tantalizing, but also more cumbersome than necessary. So I started working on some code which would help me prove statements about functions. This has absorbed the entire time I spend with Lean since then and I haven’t solved any additional Euler problems yet. This post describes the starting point of my journey.
The standard style which I find in Mathlib is to write functions with functional code, just like I would in Haskell, and have separate theorems about properties of these functions with proofs. This is the style which I also used for the Fibonacci numbers in my previous post. Since then I have discovered a different style, which uses Subtype
to combine functions and theorems into one, and is based on a brief section on subtypes in Functional Programming in Lean.
Proofs based on Subtype
When the idea for this different style crystalized I was looking at Nat.minFac
from Mathlib. So I decided to test the viability of this idea by re-implementing Nat.minFac
in this style, and I will also use this example to explain what I mean. I don’t believe this idea to be original, others probably had a similar ideas at some point and maybe even abandoned it for good reasons, but I haven’t found a comprehensive description yet, nor any conclusive reason why it wouldn’t work for my use case.
In Mathlib there is a function minFac
and a theorem minFac_has_prop
with these type signatures:
def minFac (n : ℕ) : ℕ := sorry
theorem minFac_has_prop {n : ℕ} (n1 : n ≠ 1) : minFacProp n (minFac n) := sorry
Both of these can be combined into a single function using Subtype
in its type signature:
def minFac' (n : {n : ℕ // 2 ≤ n}) : {m : ℕ // minFacProp' n m} := sorry
This signature essentially expresses the same statement as minFac_has_prop
, but it’s body contains both code for computing the result, as well as the proof for this statement. There is one significant difference here, as I restrict the domain to n ≥ 2
. I will discuss this in detail below, for now here is my implementation of minFac'
(on live.lean-lang.org)
import Mathlib
namespace Nat
theorem eq_or_succ_le_of_le {a b : ℕ} (h : a ≤ b) : a = b ∨ Nat.succ a ≤ b := by
cases Nat.eq_or_lt_of_le h with
| inl h_eq => left; assumption
| inr h_lt => right; assumption
structure minFacProp' (n : ℕ) (result : ℕ) : Prop where
intro ::
ge_two : 2 ≤ result
fac : result ∣ n
min : ∀ m : ℕ, 2 ≤ m → m < result → ¬ m ∣ n
def minFacAux' (n : {n : ℕ // 2 ≤ n ∧ ¬ 2 ∣ n}) :
{k : ℕ // (3 ≤ k ∧ 2 ∣ k.succ) ∧ ∀ m : ℕ, 2 ≤ m → m < k → ¬ m ∣ n} →
{m : ℕ // minFacProp' n m}
| k =>
if h : ¬ k.val * k.val ≤ n.val then
Subtype.mk n <| by
apply minFacProp'.intro
· exact n.property.left
· simp[dvd_refl]
· have n_prime : Prime n := by
have sqrt_lt_k : sqrt n.val < k.val := not_le.mp <| (Iff.not Nat.le_sqrt).mpr h
apply (and_imp.mp Nat.prime_def_le_sqrt.mpr) n.property.left
intros m m_ge_2 m_le_sqrt
exact k.property.right m m_ge_2 <| lt_of_le_of_lt m_le_sqrt sqrt_lt_k
exact Nat.prime_def_lt'.mp n_prime |>.right
else
if k_dvd_n : k.val ∣ n then
Subtype.mk k <| by
apply minFacProp'.intro
· have : 2 ≤ 3 := by decide
have := le_trans this k.property.left.left
assumption
· exact k_dvd_n
· exact k.property.right
else
-- used for termination
have : Nat.sqrt n - k < Nat.sqrt n + 2 - k := by
have sqrt_lt_sqrt_add2 : sqrt ↑n < sqrt ↑n + 2 := by simp only [lt_add_iff_pos_right]
have k_le_sqrt := Nat.le_sqrt.mpr <| of_not_not h
exact (tsub_lt_tsub_iff_right k_le_sqrt).mpr sqrt_lt_sqrt_add2
let k' := Subtype.mk (k.val + 2) <| by
apply And.intro
· apply And.intro
· have k_le_k_plus_2 : k.val ≤ k.val + 2 := by simp only [le_add_iff_nonneg_right]
exact (le_trans k.property.left.left k_le_k_plus_2)
· exact Nat.dvd_add_self_right.mpr k.property.left.right
· intros m m_ge_2 m_lt_k_plus_2
have := k.property.right m m_ge_2
by_cases m < k
· exact k.property.right m m_ge_2 h
· cases eq_or_succ_le_of_le <| le_of_not_gt h with
| inl k_eq_m =>
rw [k_eq_m] at k_dvd_n
exact k_dvd_n
| inr succ_k_le_m =>
cases eq_or_succ_le_of_le succ_k_le_m with
| inl succ_k_eq_m =>
rw [←succ_k_eq_m]
have := n.property.right
by_contra m_dvd_n
have := Nat.dvd_trans k.property.left.right m_dvd_n
contradiction
| inr succ_succ_k_le_m =>
have := lt_of_le_of_lt succ_succ_k_le_m m_lt_k_plus_2
simp only [lt_self_iff_false] at this
minFacAux' n k'
termination_by _ n k => sqrt n + 2 - k
def minFac' (n : {n : ℕ // 2 ≤ n}) : {m : ℕ // minFacProp' n m} :=
if _2_dvd_n : 2 ∣ n.val then
Subtype.mk 2 <| by
apply minFacProp'.intro
· decide
· exact _2_dvd_n
· intros m _2_le_m m_lt_2
have := lt_of_le_of_lt _2_le_m m_lt_2
contradiction
else
let n' := Subtype.mk n.val <| And.intro n.property _2_dvd_n
let k' := Subtype.mk 3 <| by
simp only [Nat.isUnit_iff, true_and]
intros m _2_le_m m_lt_3
have m_eq_2 := eq_of_le_of_lt_succ _2_le_m m_lt_3
rw [m_eq_2]
exact _2_dvd_n
minFacAux' n' k'
end Nat
The key feature of this code is that whenever I compute a result I use Subtype.mk
to also supply a proof that the value has some desired property. Most importantly, I use this pattern for all result values. Conversely, I can use .property
to access the defining property of such a value in the proof parts.
Why I like this style
One thing I have struggled with when trying to proof theorems about functions, is that the proof needs to replicate the structure of the function. This is also apparent in the original proof of minFacAux_has_prop
. Since minFacAux
uses if h : n < k * k then n
, minFacAux_has_prop
has to use by_cases h : n < k * k
, and later if k ∣ n then k
necessitates by_cases dk : k ∣ n
in the proof. In the Subtype
based style there is no duplication of structure, the computation of a value and the proof of it’s property automatically happens in the same context. Similarly, because minFacAux
recursively calls minFacAux n (k + 2)
, minFac_has_prop
has to invoke minFacAux_has_prop
, with some rather complex arguments, to proof that the result has the desired property, while with the Subtype
style the proof of this property is automatically attached to the value.
Contract programming
Functions using Subtype
for their arguments and return value remind me of contract programming. The Subtype
predicates of arguments represent preconditions, while the Subtype
predicate of the return type represents postconditions. From this perspective Lean 4 offers the ability to use contract programming with compile time checking of contracts, which makes Lean 4 the most exciting language I have ever looked at.
Additional burden
The section on subtypes in Functional Programming in Lean, which this is all based on includes a warning:
Subtypes are a two-edged sword. They allow efficient representation of validation rules, but they transfer the burden of maintaining these rules to the users of the library, who have to prove that they are not violating important invariants. Generally, it’s a good idea to use them internally to a library, providing an API to users that automatically ensures that all invariants are satisfied, with any necessary proofs being internal to the library.
Since I am promoting the the use of Subtype
I should address this warning. First, this doesn’t apply to the return type, the maximal burden a user might incur from a return type using Subtype
, is having to type .val
in case the values isn’t automatically coerced to the underlying type. Second, there exists a principal in contract programming, that the preconditions should be as weak as possible, which applies here as well. If at all reasonable, arguments shouldn’t use Subtype
, and if they do these should be simple properties which are easy to proof. Finally, if preconditions cannot be reasonably weakened any further, putting that burden on the user when invoking the function is justified, as I will show in the next section.
Restricting the domain (2 ≤ n
)
One big difference with my reimplementation is that I restrict the domain of minFac'
to 2 ≤ n
, while the standard implementation minFac
in Lean 4 accepts both n = 0
and n = 1
. This is an additional precondition, which the user of minFac'
has to proof before calling this function. Before I justify this choice, let me first address why Lean 4 usually doesn’t make such restrictions. I believe the blog post “Division by zero in type theory: a FAQ” by Kevin Buzzard perfectly explains why functions in Lean 4 have a very broad domain, even including nonsensical input values, like the prime example 1/0 = 0
. Trying to summarize it in one sentence, I would say: it doesn’t matter that such functions return some value for nonsensical inputs, as long as these inputs don’t show up in the theorems. This is perfectly valid and justified, as long as you are only interested in the theorems. But, if like I, you are interested in Lean as a programming language, this can be problematic. If I write a/b
somewhere in my program, I probably reason, that b
can never be 0
. If my reasoning is flawed, and b
ends up being 0
after all, I would rather have the program throw an error, so I can notice and fix it, rather than silently produce some nonsensical result. The solution within Lean 4 is of course to proof the reasoning to be correct, so I can be sure b
never is 0
. In the standard style this is done in a separate theorem, but as explained above this requires the proof of that theorem to replicate the structure of the function, and I find it much easier to proof the correctness of the code when the function is invoked. If I want to use such a function, but find that proofing its correct use isn’t worth the effort, I can always use sorry
.
Summary and next steps
I think the Subtype
style is very promising for implementing provable contract programming in Lean 4. It’s a possible to use this style to combine code with proofs, and since it doesn’t require to replicate the structure of the function within the proof of the theorem, I find it easier. However there is very little infrastructure in place to make working in this style easier. For example, while Subtype
itself isn’t a monad or even a functor, there exists a related structure which is both a functor and monad, which isn’t defined in Lean 4 yet, and which might make working in this style even easier. So I’ve been investigating the possibility of this and related monads to make proving properties of functions easier.