Testing vs proof in software development

NOTE HXA7241 2013-08-04T09:39Z

Is proof better than testing? On examination it seems they are not quite comparable, and the key issues are other things: inference and judgement. Development needs both. Perhaps ideally fused into ‘inferential construction’.

A dream of mine as a software developer is for someone to demand “How do you know it works, did you test it??” and reply “No, I proved it.”

https://twitter.com/pelotom/status/362386563084189698

This does seem desirable. Or is it? Does it really make sense? What do testing and proof do?

First of all, specification problems still exist – you cannot prove you want the right thing. You want to calculate area, but you code x plus y. One might respond that of course proof depends on trustworthy available specifications, and so do tests. But ‘test’ can reasonably have a larger meaning. Tests can check by human judgement: you can directly show the customer and ask them if it is right. That extra meaning of test is interesting.

The concern is that tests/checks by human judgement are not only needed at the very edge – checking the final product with the customer. Is not human judgement a deep and pervasive ingredient throughout software? Is not each component (hence its developer) the ‘customer’ to another? Is judgement not needed in making every line of code? So how valuable is proof if it is surrounded, and perhaps undermined, everywhere by the uncertainty of human judgement? You can prove some small step, but that only stands upon the shaky ground of some other test by judgement. If each part of programming is not shaped purely by logical decisions, logical approaches can never wholly prove any program.

One might then consider that proofs could also integrate reference to human judgement. This is the way mathematics has proceeded. You have a set of strict rules and you check – by human judgement – if a proof complies. But then what does proof really mean, because it must now seem rather weaker. ‘Proved’ in the original quote is probably read as utterly authoritative, as strong as the force of logic. But if proof is so intermixed throughout with human judgement that can hardly be true. A proof no longer ‘proves’ something correct.

Yet that does not destroy everything. Proof still does something. It is like the Wittgenstein rule-following question. Just because a rule does not secure every detail does not mean it gives you nothing. Proof is still better than testing, because it leverages abstraction. Proof lets you take certain kinds of big strong steps that testing cannot. Our proofs are as strong as our abstractions (although our abstractions must be compromised with other requirements too …).

The key elements here are not proof and test, but inference and judgement. Inference: a special disciplined kind of movement, taking structured logical steps; judgement: appealing to the ultimate authority of human intent, following what is actually wanted. Both are essential in software development – they must always be there. Automatic complete apodictic ‘proof’ of software correctness is as impossible as automatically making software – at least, as long as software is a deliberate, willful, activity.

So testing and proof are really about different things – or at least they are best used as such. Testing is comparison, proof is inference. One is not better than the other: we need both testing and proof. The virtue of proof is the power of inference, its range and solidity. Testing also has a virtue, but it ought to be more closely confined to it. Ideally, the only testing we should do is comparison with human intent, which only testing can do, and everything else should be proved.

———

But why build something and then prove it? The method of construction should be the proof. Perhaps ideally we want that kind of ‘inferential-construction’: every step of building or modification rigidly logically ensures certain properties, but also moves in a direction chosen by judgement. That seems the minimal general form of combining inference and judgement in software development. Each action in the process of building combines attention to both needs.

———

Related: