Testing is widespread in programming, but correctness in programming is mostly assured by construction. Testing prods a "black box"; this is appealing in that libraries are presented to users as black boxes, but, as we shall see, it is limited in what it can accomplish for deep logical reasons.

Girard distinguishes the expansive from the recessive; \( \exists x. P(x) \) is expansive (it assures us something exists) while \( \forall x. P(x) \) is recessive (it constrains what is out there). Gödel says: never the two shall meet.

Suppose we have some property \( P \) that we expect our procedure to satisfy, and suppose that we have written our procedure incorrectly, so that \( \neg\forall x. P(x) \), hence \(\exists x.\neg P(x)\). Then we want the property testing library to give a counterexample, i.e. it should supply a witness \(w\) such that \(\neg P(w)\).

There is no logical reason that this should be possible. This does not mean that property testing is hopeless—procedures that programmers write in practice are far from "functions" in set theory. But the study of such functions in logic remains arcane. And programming in the mainstream is even worse, "unit tests" abound while authors (who use tests to avoid justifying their code in the first place) cannot justify the assurances they putatively provide.

The Mathematical Perspective

Proof assistants do not address the totality of mathematics; mathematicians collect counterexamples.

Shrinking as introduced by QuickCheck was practically motivated. Mathematicians already search for simpler examples, perhaps study of such could benefit both fields.