Skip navigation

I would like to collect a list of some examples where non-constructive reasoning is either essential or makes something significantly more convenient.  I am still digesting work of Russell O’Connor on “Classical Mathematics for a Constructive World” — Russell has many interesting examples there, including using classical reasoning for the constructive reals (for which basic relations like equality are undecidable).  One is for verification of hybrid systems (that paper will be printing here shortly), which is certainly an eminently practical and computational application.  The others are more mathematical in nature, as the title of the paper suggests.  This leads me to refine my quest a little: I’d like to see examples of computational problems where non-constructive reasoning is either essential or very useful.  Here are a few examples I know of so far.  I’d be very grateful for any others you know of or think of that you’d be willing to share.

Termination case.  In the Trellys project (particularly “Iowa Trellys”), we have made use of the ability to case split on whether or not a term terminates under the call-by-value operational semantics of the language.  One great use for this feature is to provide simpler theorems about functions which have not been proven total (either because they are truly partial, or because we just decide not to invest our verification effort to do so).  For example, in Iowa Trellys, quantifiers range over terms of the stated type, not values.  To prove a theorem like

\forall x.\, x+0 = x

(assuming plus recurses on its first argument, so the theorem is not immediate) it is thus very handy to be able to case split on whether or not x has a value.  If it does not, then both sides of the equation diverge and are hence equal (since we equate all diverging terms).  If x does have a value, then we can do induction on it to prove the theorem in a standard way.

Similar examples arise even if the quantifiers of the language range only over values.  Suppose we have defined a tail-recursive reverse-map function (our implementation’s notation is not this nice :-( ):


rev_map f [] l' = l'
rev_map f (x:l) l' = rev_map f l (f x : l')

Suppose we want to prove the following basic theorem about this function:


forall f, l1, l2, l' . rev_map f (l1 ++ l2) l' = rev_map f l2 (rev_map f l1 l')

The proof is by induction on l1.  In the inductive case, where l1 is x:l1′ (say), we need to instantiate the induction hypothesis with f, l1′, l2, and f x: l’.  But the latter is not known to terminate, and in general might not, since f x might very well diverge.  Of course we could choose to prove the theorem only for total functions f, but that is strictly weaker than what we could hope to achieve: the theorem is also true if f is partial.  For if f x diverges, then so will both sides of the above equation.  With termination case, one can prove desirable theorems about higher-order functions in a general-recursive setting, where a purely constructive approach would only support strictly weaker theorems.

Computational induction. For quite some time in our work on Iowa Trellys, we have been vexed by the lack (in our current system) of a principle of induction on the length of a terminating computation.  For example, suppose you have defined an interpreter for SK combinators (where I am writing embedded combinator terms like App(S,K) as just S K):


interp S = S
interp (S x) = S (interp x)
interp (S x y) = S (interp x) (interp y)
interp K = K
interp (K x) = K (interp x)
interp (x y) = app (interp x) (interp y)
app S x = S x
app (S x) y = S x y
app (S x y) z = interp (x z (y z))
app K x = x
app (K x) y = x

Suppose you wish to prove that it is idempotent:

\forall x.\ \textnormal{interp}\ (\textnormal{interp}\ x) = \textnormal{interp}\ x

A natural way to prove this is first to case split on whether or not interp x terminates.  If it does not, then both sides of the above equation diverge under call-by-value semantics, and so we equate them.  If interp x does terminate, then we would like to reason by induction on some measure of the terminating computation; here I will use nesting depth of recursive calls.  For example, if we use the second equation above for interp, by the IH we know that (interp (interp x)) equals interp x, which shows that (interp (interp (S x)) equals interp (S x).  And in the case for the third equation for app, by the IH we know that (interp (interp (x z (y z)))) equals interp (x z (y z), and hence (since interp (x z (y z)) is the value returned by interp in the situation that led to this call to app), we also know interp is idempotent on the original term.

Here is a way (that I just realized in the middle of the night yesterday) that we can use classical reasoning to derive a principle of computational induction for functions like interp above.  First, we have to define a counter-indexed version of the function.  This is in addition to the regular version of the function without counters.  For interp, a counter-indexed version could be (I will just call both this and the original function “interp”):


interp n S = S
interp (n+1) (S x) = S (interp n x)
interp (n+1) (S x y) = S (interp n x) (interp n y)
interp n K = K
interp (n+1) (K x) = K (interp n x)
interp (n+1) (x y) = app n (interp n x) (interp n y)
interp 0 x = abort
app n S x = S x
app n (S x) y = S x y
app n (S x y) z = interp n (x z (y z))
app n K x = x
app n (K x) y = x

Now our goal is to prove \exists n.\textnormal{interp}\ n\ x\ \downarrow, from our assumption \textnormal{interp}\ x\ \downarrow.  Here I am writing t\downarrow to mean that t terminates.  Let us case split on whether or not \exists n.\textnormal{interp}\ n\ x\ \downarrow.  Such a case split is non-constructive, since if it were constructive we could effectively decide halting for SK combinators, which is impossible.  In the case where that formula holds, we are obviously immediately done.  So suppose it does not hold.  Then (intuitionistically) we have \forall n.\neg\textnormal{interp}\ n\ x\ \downarrow.  Let us imagine that we have a coinductive predicate t\uparrow for divergence, and suppose we know (presumably by an axiom) that t\uparrow\Rightarrow \neg t\downarrow.  Let us now try to prove the following formula:

\forall x.\ (\forall n.\neg\textnormal{interp}\ n\ x\ \downarrow)\ \Rightarrow\ \textnormal{interp}\ x\ \uparrow

The proof is by coinduction: we must produce a proof of \textnormal{interp}\ x\ \uparrow, where we use the co-IH only where it is guaranteed to be guarded by constructors for our coinductive type of divergence in the final return value of the co-inductive proof.  How does this proof work?  We first case split on x, to enable some partial evaluation of interp x.  For example, suppose x is (S x’), and so interp x = S (interp x’).  We can prove \forall n.\neg\textnormal{interp}\ n\ x' by contradiction: if there exists n where interp n x’ terminates, then interp (n+1) x terminates, contradicting our assumption \forall n\neg\textnormal{interp}\ n\ x\ \downarrow.  So we can use our co-IH to obtain a proof that interp x’ diverges, which we can then wrap in a constructor for the divergence type to show \textnormal{interp}\ x\ \uparrow in a guarded way.  The other cases proceed similarly (I believe: I did not work through them all for interp).  Since we have derived \textnormal{interp}\ x\uparrow from our assumption \forall n.\neg\textnormal{interp}\ n\ x\downarrow, and since \textnormal{interp}\ x\ \uparrow contradicts (by the principle stated above) \textnormal{interp}\ x\ \downarrow, which we are assuming, we have derived a contradiction and the proof is completed.

I’d be delighted to hear of other uses of classical reasoning for computational problems you know of!

I am driving over this morning to the Computer and Information Sciences department at St. Ambrose University in nearby Davenport, Iowa, to give a brief talk (slides here) and talk to students and faculty about graduate study at The University of Iowa.

[Updated after my visit:] I had a great trip over to St. Ambrose CS.  I was very warmly greeted by professors Kevin Lillis, Mark McGinn, and Gary Monnard, as well as a very nice turnout of undergraduates (30?) interested in learning a bit more about University of Iowa graduate programs in CS.  I also did tell them a bit about some of our group’s research.  My impression was that many students already have great jobs lined up for after college, and the costs of further study are perceived as a major impediment.  Now we have a great situation here at Iowa, where we have enough external grant funding and internal funding through fellowships and teaching assistantships that certainly for doctoral study, and even for Master’s study for strong students, graduate study will not cost anything for the student, and they’ll even get a stipend.  This is standard, of course, at most research departments, but I think the word is not out to undergraduates about this.  It may take some time to change people’s knowledge of the graduate-education system.

Anyhow, I had a great time talking with Kevin, Mark, and Gary, and learning more about St. Ambrose.   The school has around 3300 (full-time equivalent) students, of whom  around 2600 are undergraduates.  I got the impression that most are from Davenport or nearby, with a significant percentage of nontraditional (age) students and commuters who don’t live on campus.  The CS department has something like 80 students in a number of different majors the department offers.  The campus itself is very nice: it occupies something like a 4-block by 4-block square in Davenport.  The building we were in for my talk, the Rogalski Center, is very nice and quite new.  Kevin walked me around campus, including the lovely chapel; there was a beautiful Madonna and Child by Fr. Catich, founder of the art department at St. Ambrose and known for beautiful art works on slate of religious scenes (like this).

Our department has sent faculty out to I think it is 8 or 9 colleges in the region this fall, to try to increase our domestic applicant pool for graduate study.  I very much hope this succeeds — that’s the kind of extra work on grad admissions we want to have!

I have recently become fascinated, not to say obsessively fixated, on what some call computational classical type theory.  This is the kind of type theory one gets by applying the Curry-Howard isomorphism to proof systems for classical logic.  There are many variants, and Harley Eades, who introduced me to this topic, has written a lot about them on his blog.  These systems extend lambda calculus with some control operators (like try and catch for exceptions, or generalizations of these).  For example, a proof of A \vee \neg A works by providing a function from A to \perp, where if that function is ever called with a proof of A, we go back in time (by raising an exception) to the point where the proof of A \vee \neg A was consulted, and at that point return that proof of A, instead of the function from A to \perp.  Sound evil?  It is.  In fact, in this post I want to highlight just how bad the situation is.  But this is not a rant: I am firmly convinced that classical reasoning is needed for reasoning about general-recursive functions (think case-splitting on whether or not a given general-recursive term terminates).  But we are not going to be able to make use of these powerful and exciting ideas if we cannot tame the non-canonicity of these theories, which I will discuss next.

When we program in a pure functional language, say, we usually intend that at some point our program will produce some observable effect.  We are not writing programs just for the sake of knowing they are running.  We want them to compute answers to some computational question.  So we need to take some kind of observation of the results (assuming convergence) of our program.

Now, we know that in classical logic the so-called disjunction property fails.  That is, we can prove, from no assumptions, a formula A \vee B without in general having a proof of A independent of B, or else of B independent of A.  For a “classic” example, we can prove A \vee \neg A without thereby having a proof of A by itself or else \neg A.  So we cannot observe a canonical result of type A \vee \neg A in general, where a canonical result of a  type A \vee B is either \textit{in}_1\, v_1 or \textit{in}_2\, v_2, with v_1 a canonical result of type A and v_2 a canonical result of type B.

What is the big deal with that?  Probably we were not expecting to observe results at \vee-type.  But of course, we could compute some value at a type we do want to observe, like \textit{nat} (for natural numbers), based on a case split on a non-canonical result of type A\vee \neg A.  These type theories for classical logic are quite amazing: they will carefully track and propagate that non-canonicity throughout computations.  So if we perform such a case-split in computing a \textit{nat}, we would expect to end up with another non-canonical result, which would then not be suitable for observation.

Well darn, you might say.  Sometimes we might try to compute a number and end up with a non-canonical result which we cannot observe.  Perhaps we could just report an error to the user in that case: “non-canonical result encountered, cannot observe”.  That would be unworkably bad, in my opinion.  But it gets worse.

It can actually happen (I believe: any experts reading please check me on this) that we can compute an incorrect canonical result, based on case-splitting with a non-canonical term of type A\vee\neg A.  This is really shocking, and renders the idea of computational classical type theory unusable in practice without major modifications (some of which have been undertaken by Herbelin, Krebbers, and others in work to preserve canonicity at various types, while still allowing classical reasoning).  Here is a sketched example.

I am assuming you are familiar with the non-constructive proof of the fact that there are two irrationals a and b such that a^b is rational (you can read this proof on Wikipedia here).  This proof does a case-split on whether \sqrt{2}^{\sqrt{2}} is rational or not.  Let us recast this proof ever so slightly, so that the case-split is on whether that quantity is irrational or not.  As I described at the start of this post, the proof of A\vee\neg A in classical logic returns a proof of \neg A (the second branch of the disjunction) which, if it is ever used to derive a contradiction with a proof of A, will throw that proof of A back to the original point the proof of A \vee\neg A was used, to serve instead as a proof of the first branch of the disjunction.  Ok.  But what this means is that when you case-split on this proof of A \vee\neg A, that proof essentially tells you that the second disjunct is the true one.  So if we case split on whether or not \sqrt{2}^{\sqrt{2}} is irrational, computationally we will enter the second branch of our proof.  And in that branch, we know that \sqrt{2}^{\sqrt{2}} is rational, and that is what (computationally) we will return from our proof, together with the proof that it is rational.

There is only one problem with this.  As stated on the above linked Wikipedia page, it turns out that \sqrt{2}^{\sqrt{2}} is actually irrational.  Now, the proof of rationality that we returned along with this value is such that if we ever contradict it using a proof that the quantity in question is irrational, we will go back in time to our original case-split, and produce instead the answer that it is \sqrt{2}^{\sqrt{2}} and \sqrt{2} that are the irrationals a and b such that a^b is rational.  So again, the logic very carefully tracks the non-canonicity of our proof, and will go back and repair the situation if choosing the second branch of A\vee\neg A turns out to have been the wrong choice.

But this is small consolation to the programmer who is given \sqrt{2} and \sqrt{2} together with the non-canonical proof that \sqrt{2}^{\sqrt{2}} is rational, and takes an observation of those values (prints them out, say).  As soon as we need to observe our answers, we depart from the magical world of classical logic where everything is revisable, and commit ourselves to a particular value.  It is well known that there may be no value to commit to, due to non-canonicity, when taking such a measurement.  What the above example shows (assuming I am right about this) is that one can actually commit to the wrong value.  This is worse than aborting with the “non-canonical result” error message.  The source of the problem is that we have a non-canonical constraint on our value which needs to be maintained, but taking an observation severs the value from that constraint.

To conclude: we absolutely have to be able to carve out a safe fragment of a computational classical type theory, where what you see (via typing) is truly what you get, in order for such systems to be even remotely usable in practice.  Some attempts (briefly mentioned above) are being made, but there is no fully satisfactory solution yet available, in my opinion.  Such a solution would use the type system to track canonicity, so one would know whether it is legal to observe a particular value or not.  Values that cannot be observed can still be used for formal reasoning.  But as we have seen, they lead to disaster if allowed out of the mathematics school room and onto the computational playground.

I am driving what Google Maps tells me are the three hours down to Kirksville, Missouri, from Iowa City, to visit the Math & CS department at Truman State.  I am hoping to meet some more fantastic potential graduate students like Truman alum Ben Delaware (whose paper was just accepted to POPL 2013 — congratulations, Ben!).

The slides for my talk on “Programs, Proofs, and Classical Logic” are here.  I am planning to update this post with a little more about the trip once I get back.

[Updated the day after the trip:]

It was a very interesting trip down to Kirksville, on some quiet rural highways through eastern Iowa.  The Truman State campus is very nice, quite charming actually.  I met up with Jon Beck, a CS faculty member who coincidentally turns out to have ties to Iowa (David Eichmann, of U. Iowa Biomedical Informatics, was his advisor, at another institution).  We had a very engaging conversation about CS, CS education, and CS at Truman State.  The school transformed itself a couple decades back from what Jon described as a “sleepy teachers’ college” (presumably that phrase associates to the right), into a highly selective public liberal arts college.  They actually got the Missouri legislature to agree to let them be as selective in their admissions as they want.  This makes a huge difference, since it means they can focus on admitting just people who are adequately prepared and seriously motivated for college study.  The same cannot be said about U. Iowa, for example.  I also got a little tour of the campus, which again is really idyllic.  I like U. Iowa, but I was a bit jealous of the campus there, truly.

I got to talk informally with a few other folks, and then gave my talk and made my grad-school pitch in the afternoon.  The talk was really well attended — solely by undergrads, and no one would confess to any bribery to get them there.  I was impressed.  I got some great questions about both the research I was discussing, and about grad school in general.  The students seemed sharp and engaged.  I am certainly hoping that we get a couple applications from Truman Math & CS this season!

In a couple weeks I am going to do another of these, at St. Ambrose in Davenport, Iowa.  Other colleagues from U. Iowa CS are making similar trips to nearby colleges.  We are hoping these will bear some fruit in more domestic applications for grad school here.

Last week, a bunch of us from U. Iowa (8 in total, in two rental cars) drove down to Kansas U. for Midwest Verification Days (MVD), hosted by Perry Alexander and Andy Gill, and very efficiently supported by the wonderful staff there at the Kansas Information and Telecommunication Technology Center (ITTC).  As has become the norm for MVD, it was a lively meeting, with around 60 participants, and a very high level of talks.  Most of the talks were from graduate students, though undergraduate Angello Astorga (who worked with me this past summer) came down from Ohio State to talk about his summer work, and there were also great industry talks by Lee Pike of Galois and Ray Richards and David Hardin of Rockwell Collins.  You can see the full schedule on the MVD ’12 page.  Galois and Rockwell were both in recruiting mode, as well as telling us about the amazingly cool applications of formal methods that their companies are pursuing.  I would love to see MVD become a recruiting event, as well as a research event.  We are clearly on our way with the presence of these two important formal-methods teams.

As I said, the student talks were really great.  I personally always love hearing the talks from the U. Colorado Boulder group.  They are always doing some cutting-edge applications of static analysis or formal methods to current development problems.  For example, Arlen Cox and Devin Coughlin were talking about static analysis for Javascript — very relevant for web programming, of course, and posing nasty analysis problems as the language and idioms for it are very dynamic (e.g., method calls by reflection) and hard to deal with statically.  Sam Blackshear  was talking about live-leak detection, which I always thought was an intriguing problem.  Really, all the talks were great, and I don’t have time to tell you about all of them.  So I have to just mention a couple more.  I really liked a talk by Caleb Eggensperger of U. Oklahoma on ProofPad, a nice graphical interface for ACL2.  The talk by Dongjiang You of U. Minnesota Software Engineering Center on Observable MC/DC (a code coverage method I’d heard his advisor Mike Whalen talk about before), was also very interesting to me: the idea I took away (and I’m not too familiar with code-coverage techniques) was that we could execute a piece of code with concrete inputs, while still tracking which internal variables contribute to the output.  This could help us see which internal parts of a system are being exercised by a concrete test case.  That’s a cool mix of static and dynamic semantics (since we have a dynamic execution that tracks some static information, namely which variables contributed to a concrete value), which I had not considered before.

Anyway, it was a really fun event, and I’m very pleased that MVD is alive and well, following what is now the 4th edition (MVD ’09 and ’10 were the first ones, at Iowa; and then MVD ’11 was at Minnesota).  MVD ’13 will be at U. Illinois Chicago, hosted by Lenore Zuck, who was also there at MVD ’12, so that is going to be very exciting as well.  Our definition of the Midwest is pretty broad, and Midwesterners are welcoming people by nature: so swing by for the  next one if you can.

Follow

Get every new post delivered to your Inbox.