
Category Archives: Logic

This post is about double negation in constructive type theory.  Suppose you know that \neg(\neg A) is true; i.e., (A \to \perp)\to \perp (writing \perp for false).  In classical logic, you then know that A is true.  But of course, you are allowed no such conclusion in intuitionistic logic.  So far, so familiar.  But let us look at this now from the perspective of realizability.  Realizability semantics gives the meaning of a type as a set of some kind of realizers.  Let us consider untyped lambda calculus terms (or combinators) as realizers for purposes of this discussion.  Then the crucial clauses of the semantics for our purposes are

  • t \in [\negthinspace[ A \to B ]\negthinspace ]\ \Leftrightarrow\ \forall t' \in[\negthinspace[A]\negthinspace].\ t\ t'\in[\negthinspace[B]\negthinspace], and
  • \neg(t \in [\negthinspace[\perp]\negthinspace]); that is, no term realizes \perp.

Let us work through, then, when t\in[\negthinspace[(A\to\perp)\to\perp]\negthinspace] according to this definition:

\begin{array}{ll} t\in[\negthinspace[(A\to\perp)\to\perp]\negthinspace] & \Leftrightarrow \\ \forall t'\in[\negthinspace[A\to\perp]\negthinspace].\ t\ t'\in[\negthinspace[\perp]\negthinspace] & \Leftrightarrow \\ \forall t'.\ \neg(t'\in[\negthinspace[ A \to \perp]\negthinspace]) & \Leftrightarrow \\ \forall t'.\ \neg(\forall t''\in[\negthinspace[A]\negthinspace].\ t'\ t''\in[\negthinspace[\perp]\negthinspace]) & \Leftrightarrow \\ \forall t'.\ \neg(\forall t''.\ \neg(t''\in[\negthinspace[A]\negthinspace])) & \Leftrightarrow \\ \forall t'.\ \exists t''.\ t''\in[\negthinspace[A]\negthinspace] & \Leftrightarrow \\ \exists t''.\ t''\in[\negthinspace[A]\negthinspace] &\  \end{array}

So this is saying that if [\negthinspace[A]\negthinspace] is nonempty, then [\negthinspace[(A\to\perp)\to\perp]\negthinspace] is universal (contains all terms); and otherwise (if [\negthinspace[A]\negthinspace] is empty), [\negthinspace[(A\to\perp)\to\perp]\negthinspace] is empty.

So if you have a function F, say, realizing (\neg\neg A)\to B for some B, what this means is that F can make use of the fact that A is true, but not of how it is true.  If F were simply given a proof of A, then it could compute with this proof.  But here, with \neg\neg A, the proof of A is hidden behind an existential quantifier (as we deduced above), and so it cannot be used.  Only someone reasoning at the metalevel about F can make use of the fact that A is true, when checking that the body of F realizes B.

So semantically, \neg\neg A is news you can’t use: we know A is true, but we have no access to how it is true.  For example, if A is a disjunction, you cannot case split, within the type theory, on the disjuncts.  Only at the metalevel are you entitled to consider the behavior of a term in light of the fact that one or the other of the disjuncts must hold.
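To make this concrete, here is a minimal Haskell sketch (the names Void, Not, dni, tne are mine, just for illustration), with an empty type standing in for \perp.  Double-negation introduction and triple-negation elimination are programmable; but there is no program of type Not (Not a) -> a, matching the semantic observation above that a realizer of \neg\neg A only tells us the interpretation of A is nonempty.

-- An empty type playing the role of falsity (no constructors, so no closed values).
data Void

type Not a = a -> Void

-- Double-negation introduction is perfectly constructive:
dni :: a -> Not (Not a)
dni x k = k x

-- Triple negation collapses to single negation:
tne :: Not (Not (Not a)) -> Not a
tne f x = f (dni x)

-- But no total program inhabits Not (Not a) -> a: the argument only
-- promises that a value of type a exists, without handing us one.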

Last week I had the great pleasure of visiting the amazing PL group at Indiana University.  They are doing so much inspiring work across many different areas of programming languages research.  It was great.  The collective knowledge there was very impressive, and I got a lot of great references to follow up on now that I am back.

I gave two talks: one in the logic seminar, and one for the PL group.  In the first talk, one issue I was discussing was an argument, originally sketched by Michael Dummett and developed more fully by Charles Parsons in a paper called “The Impredicativity of Induction” (you can find a revised version of this paper in a book called “Proof, Logic, and Formalization”, edited by Michael Detlefsen).  This argument claims to show that induction is impredicative.  Impredicativity, as many QA9 readers well know, is the property a definition D of some mathematical object x has when D refers to some collection or totality of which x is a member.  Philosophical constructivists must reject impredicative definitions, because they believe that definitions construct or create the entities they define, and hence cannot presuppose that those entities already exist in the course of defining them.  This kind of philosophical constructivism must be distinguished from a kind of formal constructivism a type theorist is likely to care about, which I identify with a canonical forms property: every inhabitant of a type A should have a canonical form which is part of the essential nature of type A.  Computational classical type theory violates this formal canonicity property, because types like A \vee \neg A are inhabited by terms which are neither left nor right injections.

Here is Parsons’s argument as I understand (or maybe extrapolate?) it.  A philosophical constructivist must reject impredicative definitions, such as the kind one gives for the natural numbers in second-order logic (see my talk slides for this example definition).  As an alternative, one may try to say that the meaning of the natural number type is determined by introduction and elimination rules (similarly to the meanings of logical connectives).  The introduction rules are about the constructors of the type.  And the elimination rule is an induction principle for the type.  But then, Parsons and Dummett point out, this explanation of the natural number (or other) inductive type is not any better off with respect to impredicativity than the second-order definition, because the induction principle allows us to prove any property P of n, assuming as premises that Nat n holds and that the base and step cases of induction are satisfied for P.  But the predicate P could just as well be Nat itself, or some formula which quantifies over Nat.  The latter seems to provoke worries, which I do not understand.  It seems already bad enough that P could be Nat.  So just moving from a single formula defining Nat to a schematic rule defining Nat cannot save one from the charge of impredicativity.  Hence, for avoiding impredicativity, there is no reason to prefer one approach (specifically, the rule-based approach) over the other.
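For concreteness, the second-order definition being alluded to is, up to notational details, Nat\ n\ :=\ \forall P.\ P\ 0 \to (\forall m.\ P\ m \to P\ (m+1)) \to P\ n.  It is impredicative because the quantified P ranges over all predicates, including Nat itself and predicates defined by quantifying over Nat.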

The alternative to this is to reject the idea that we must give a foundational explanation of the natural-number type (or predicate).  We just accept the idea of numbers and other datatypes as given, and then induction is a consequence — not part of some definition of numbers, which we are refusing to give.  This is a coherent way to avoid impredicativity, but at the cost of having to accept numbers with no deeper analysis — something that a logicist might not like, for example (though I do not know for sure how a logicist would likely view this situation).

In our seminar for the Computational Logic Group at Iowa, we have been studying Logic Programming for the past couple semesters.  We are continuing this semester, and have been reading “The Stable Model Semantics for Logic Programming” by Gelfond and Lifschitz (available on Citeseer here), which is a classic paper describing a semantics for logic programs with negation.  I was interested to read this paper, as I am interested in datatypes in type theory and programming languages with negative occurrences (in the types of the constructors) of the type being defined.  I am hoping that some ideas from LP might carry over to semantics for such datatypes.  This is intellectually a bit of a longshot, but the paper is quite nice reading for anyone interested in computational logic.

Anyway, I want to record here a few notes from the discussion this past Friday, of Alain Mebsout, Nestan Tsiskaridze, Ruoyu Zhang, Baoluo Meng, and myself. First, some background from the paper.  A clause is an implication where there is a conjunction, possibly empty, of literals (either atomic formulas or negated atomic formulas) in the antecedent of the implication, and a single atomic formula as the consequent.  The authors work just with ground (i.e., variable-free) clauses, since a set of nonground clauses can be modeled as the (possibly infinite) set of all its ground instances.  A program Pi is a set of clauses.  The authors define an operation which I will denote here Pi/M, where M is a set of atomic formulas.  The operation drops “not A” from the antecedents of the clauses, for every atom A which is not in M.  And it drops every clause that contains “not A” (in the antecedent) where A is in M.  The intuition is that we can pretend M is exactly the set of atoms we know to be true, and then simplify the clauses of Pi by looking at the negated literals.  If a clause negates an atom which is in M, then it is negating something which we are pretending is true, and hence that negation is definitely false.  Such a clause is vacuously satisfied if we view M as determining a model (namely, the term model where exactly the atoms in M are made true by the interpretations of the predicates used in the atoms).  So those clauses do not add any further information and can be dropped.  From an operational perspective, such clauses would give rise to unsatisfiable (in M) subgoals, and so cannot possibly help to prove anything.  Similarly, if a clause contains “not A” where A is not in M, then the literal “not A” is true in M, and hence cannot contribute anything as an antecedent of an implication.  So that is why it is dropped from the clause.

If after performing this simplification the minimal set of atoms satisfying Pi/M is exactly M, then M is called a stable set for Pi.  In other words: if we suppose the atoms in M are exactly the true ones and simplify away all negated literals accordingly, then the resulting negation-free program’s minimal model is again M.  So our assumed knowledge M is consistent with what we can deduce from the program using it.  The paper gives examples of programs which have no stable sets, and of programs which have more than one minimal stable set (i.e., a stable set none of whose proper subsets is stable).
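Since the definition has a few moving parts, here is a small Haskell sketch of the reduct and the stability check as I understand them, restricted to ground programs; all the names (Atom, Clause, reduct, minimalModel, satisfies, isStable) are mine, not from the paper.

import Data.List (nub, sort)

type Atom = String
data Lit = Pos Atom | Neg Atom deriving (Eq, Show)

-- A clause "A if L1 and ... and Ln": a head atom plus a (possibly empty) body.
data Clause = Clause { hd :: Atom, body :: [Lit] } deriving (Eq, Show)
type Program = [Clause]

-- The reduct Pi/M: delete every clause whose body contains "not a" for some a in M,
-- then delete the remaining (true in M) negative literals from the surviving bodies.
reduct :: [Atom] -> Program -> Program
reduct m = map dropNegs . filter keep
  where
    keep c     = null [ a | Neg a <- body c, a `elem` m ]
    dropNegs c = c { body = [ Pos a | Pos a <- body c ] }

-- Minimal model of a negation-free program, by iterating immediate consequences.
minimalModel :: Program -> [Atom]
minimalModel p = go []
  where
    step m = nub [ hd c | c <- p, and [ a `elem` m | Pos a <- body c ] ]
    go m   = let m' = nub (m ++ step m) in if m' == m then m else go m'

-- Satisfaction of an arbitrary program in the model determined by a set of atoms.
litTrue :: [Atom] -> Lit -> Bool
litTrue m (Pos a) = a `elem` m
litTrue m (Neg a) = a `notElem` m

satisfies :: [Atom] -> Program -> Bool
satisfies m = all (\c -> not (all (litTrue m) (body c)) || hd c `elem` m)

-- M is a stable set of Pi iff the minimal model of Pi/M is exactly M.
isStable :: [Atom] -> Program -> Bool
isStable m p = sort (nub (minimalModel (reduct m p))) == sort (nub m)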

On Friday, we worked through the proof of Theorem 1 of the paper, which states that if M is a stable set of Pi, then it is also a minimal model of Pi.  This led us to consider a pair of propositions (which we did not take from the paper).  Suppose M is a stable set of Pi, and suppose M’ is another set of atoms.

Proposition 1: if M’ satisfies Pi/M, then M’ satisfies Pi.

Proposition 2: if M’ satisfies Pi, then M’ satisfies Pi/M.

We delighted ourselves by managing to refute Proposition 2 and prove Proposition 1.  For the refutation: a counterexample is if M is { p }, M’ is { q }, and Pi is {“p if not q”}.  Pi/M is then just { p }, since we will drop the “not q” from the antecedent of “p if not q”.  So M is stable, since the minimal model of Pi/M is M.  M’ does satisfy Pi, because by making q true, M’ makes “p if not q” vacuously true (since “not q” is false).  But M’ does not satisfy Pi/M, since it does not make “p” true.
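Replaying the counterexample with the sketch from above (again, my names, not the paper’s):

piEx :: Program
piEx = [ Clause "p" [Neg "q"] ]          -- "p if not q"

-- reduct ["p"] piEx                    == [ Clause "p" [] ]   -- just the fact p
-- minimalModel (reduct ["p"] piEx)     == ["p"]               -- so ["p"] is stable
-- satisfies ["q"] piEx                 == True                -- M' satisfies Pi
-- satisfies ["q"] (reduct ["p"] piEx)  == False               -- but M' does not satisfy Pi/M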

For the proof of Proposition 1, we considered an arbitrary clause “A if As and not Bs” in Pi (where I am writing As for a vector of atoms and “not Bs” for a vector of negated atoms Bs).  Suppose first that this clause has no corresponding clause in Pi/M.  This means that some Bi must be in M, since the operation Pi/M drops a clause only if it contains a literal “not B” where B is in M.  Since M is a stable set, M is the minimal model of Pi/M; and for a negation-free program like Pi/M, the minimal model is contained in every model.  So Bi, being in M, belongs to every model of Pi/M.  Since M’ satisfies Pi/M by the assumption of Proposition 1, Bi is also in M’.  But then the literal “not Bi” is false in M’, so the antecedent of the clause is false in M’, and hence M’ satisfies the clause we are considering from Pi.

Now suppose that the clause “A if As and not Bs” from Pi does have a corresponding clause in Pi/M.  That clause must be just “A if As”, since when the transformation from Pi to Pi/M retains a clause, it simply drops all negated literals from it.  Since M’ satisfies Pi/M by assumption, it satisfies this clause “A if As”.  Conjoining further literals into the antecedent only strengthens the antecedent, and hence weakens the clause, so satisfaction is preserved.  So M’ satisfies the original clause “A if As and not Bs” from Pi.  Thus M’ satisfies every clause in Pi, as required by Proposition 1.

Most QA9 readers will know the nonconstructive proof that there are two irrational numbers a and b such that a to the power b is rational.  There is interesting discussion about this here.  I recently came across a 2014 note by Roger Hindley about the history of how this example came to be more broadly known.  It is interesting reading.  Hindley credits a 1953 paper by Dov Jarden in Scripta Mathematica (volume 19, page 229) with the proof.  In case you do not have access to this journal in a local library, I have decided to include a scanned copy of the article below:

dov-jarden-1953

It’s a Math tweet from 1953.
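For readers who have not seen the proof itself, it too fits in a tweet: consider \sqrt{2}^{\sqrt{2}}.  If it is rational, take a = b = \sqrt{2}.  If it is irrational, take a = \sqrt{2}^{\sqrt{2}} and b = \sqrt{2}; then a^b = \sqrt{2}^{\sqrt{2}\cdot\sqrt{2}} = \sqrt{2}^2 = 2, which is rational.  Either way two irrationals with a rational power exist, but the argument does not tell us which pair works.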

Everyone knows classical logic is nonconstructive. If a classical principle like \forall x. \phi(x) \vee \neg \phi(x) were true constructively, it would mean we would have an effective (i.e., computable) way to tell, for any value x, whether \phi(x) is true or false. When \phi(x) expresses a property like “the Turing machine coded by natural number x halts”, then we can see that \forall x.\phi(x) \vee \neg \phi(x) cannot be constructively true in general. Case closed.

But a funny thing happened on the way from Amsterdam: computational interpretations of classical type theory were devised, like the \lambda\mu-calculus of Parigot and quite a few others.  These type theories have term notations for proofs of classical logic, and give a reduction semantics showing how to simplify proofs to normal forms, similar to standard reduction semantics for good old \lambda-calculus.  The reduction semantics is perfectly effective, so one really can compute — or so it would seem — with terms in these type theories.  If the essence of constructivism is computation, then it would seem that classical logic is constructive.  Indeed, titles like that of Girard’s 1991 article “A new constructive logic: classical logic” would suggest this.

But you know, maybe it is intuitionistic logic which is nonconstructive.  Come again?  Well, the past half year or so several of us here at Iowa have been studying so-called bi-intuitionistic logic.  This is a logic that extends standard propositional intuitionistic logic with an operator called subtraction, which is dual to implication.  There are a number of papers about this topic I could recommend.  Maybe the easiest introduction is this one by Goré, although we ultimately followed the approach in this paper by Pinto and Uustalu.   Recall that in the Kripke semantics for propositional intuitionistic logic, the implication A \to B is true iff in all future worlds where A is true, B is also true. The subtraction A - B  has a dual semantics: it is true iff there is an earlier world where A is true and B is false.  In intuitionistic logic, one usually defines \neg A as A \to \perp.  This negation is true iff A is false in all future worlds.  In bi-intuitionistic logic, one can also define a second kind of negation, using subtraction: \sim A means \top - A, and it is true iff there exists an earlier world where A is false.  With this definition, we start to see classical principles re-emerging.  For example, A \vee \sim A holds in every world, because in any world w of any Kripke model, either A is true in w, or there is some earlier world (namely w itself) where A is false.  Valid formulas like A \vee \sim A violate the disjunction property of intuitionistic logic, since neither A nor \sim A is valid on its own, despite the fact that the disjunction is.  The disjunction property, or in programming-languages terms, canonicity, would seem to be as much a hallmark of constructive reasoning as the law of excluded middle is of classical reasoning.  Yet here we have a conservative extension of intuitionistic logic, based on exactly the same semantic structures, where a form of excluded middle holds and canonicity fails.
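To see this semantics in action, here is a small Haskell sketch of a checker for these Kripke models over a finite preorder of worlds; the names (Form, Model, sat, coNeg) are mine, chosen only for illustration, and monotonicity of the valuation is assumed rather than enforced.

type World = Int

data Form = Top | Bot | Atom String
          | And Form Form | Or Form Form
          | Imp Form Form      -- intuitionistic implication
          | Sub Form Form      -- subtraction, dual to implication
  deriving Show

-- A model: a finite preorder of worlds (leq w w' means w' is a future world,
-- including w itself) and a valuation of atoms, assumed monotone.
data Model = Model
  { worlds :: [World]
  , leq    :: World -> World -> Bool
  , val    :: World -> String -> Bool
  }

sat :: Model -> World -> Form -> Bool
sat _ _ Top       = True
sat _ _ Bot       = False
sat m w (Atom p)  = val m w p
sat m w (And a b) = sat m w a && sat m w b
sat m w (Or a b)  = sat m w a || sat m w b
sat m w (Imp a b) = and [ not (sat m w' a) || sat m w' b | w' <- worlds m, leq m w w' ]
sat m w (Sub a b) = or  [ sat m w' a && not (sat m w' b) | w' <- worlds m, leq m w' w ]

-- The second negation: ~A is Top - A.
coNeg :: Form -> Form
coNeg = Sub Top

-- Two worlds 0 <= 1, with atom "a" false at world 0 and true at world 1:
exModel :: Model
exModel = Model [0,1] (<=) (\w p -> p == "a" && w == 1)

-- sat exModel 0 (Or (Atom "a") (coNeg (Atom "a")))  == True
-- sat exModel 1 (Or (Atom "a") (coNeg (Atom "a")))  == True
-- sat exModel 0 (Atom "a")                          == False
-- So A or ~A comes out true at every world here, while A itself does not.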

What are we to make of this?  Let us think more about our Kripke semantics for (bi-)intuitionistic logic.  In this semantics, we have a graph of possible worlds, and we define when a formula is true in a world.  Formulas that are true stay true (the semantics ensures this), but formulas which are false in one world can become true in a later world.  The critical point is that we have, in a sense, two notions of truth: truth in a world, and truth across time (or across possible worlds).  Truth in a world is usually thought of as being defined classically.  Maybe it is better to say that the modality with which it is defined is not made explicit.  A textbook definition of Kripke semantics is making use of some background meta-language, and I think it is fair to say that without further explicit stipulation, we may assume that background meta-language is the language of informal classical set theory.  So for any particular formula \phi, either \phi holds in world w, or else it does not.  This is, in fact, what justifies validity of A \vee \sim A.  Indeed, to construct a version of Kripke semantics where this is not valid would seem to require something like an infinitely iterated Kripke semantics, where truth in a world is constructively defined (the infinite iteration would be to ensure that the notion of constructive definition being used for truth in a world is the same as the one for truth across worlds).  An interesting exercise, perhaps!  But not one that will validate the laws of bi-intuitionistic logic, for example, where A \vee \sim A is indeed derivable.

So what is it that makes a logic constructive?  Let’s come back to the case of computational classical type theory.  The term constructs that were discovered to correspond to classical principles were not so exotic: they were actually forms of familiar control operators like call-cc from functional programming!  So, these constructs were already in use in the wild; only the connection to logic was a (remarkable!) discovery.  So every day, the working Scheme programmer, or Standard ML programmer, or Haskell programmer has at his disposal these constructs which correspond to nonconstructive reasoning.  But they are writing code!  It is computing something!  What is going on?

Perhaps it is in Haskell where this is clearest.  To use call-cc, one must import Control.Monad.Cont.  And there we see it already: the control operator is only available in a monad.  The sequencing of operations which a monad is designed to provide is critical for taming call-cc from the point of view of lazy functional programming.  In the words of Simon Peyton Jones (from these notes):

Indeed it’s a bit cheeky to call input/output “awkward” at all. I/O is the raison d’être of every program — a program that had no observable effect whatsoever (no input, no output) would not be very useful.

And that’s just where the interaction of control operators with pure functional programming seems strange.  Phil Wadler, in his well-known dramatization of the computational content of the law of the excluded middle, has the devil give a man a choice: either the devil pays him one billion dollars, or if the man can pay the devil one billion dollars, the devil will grant him one wish of whatever he wants.  The man accepts the choice, and the devil goes with option 2.  After a life of greed and antisocial accumulation, the man presents the devil with one billion dollars, hoping for his wish.  The devil (using a control operator, not that they are intrinsically evil) goes back in time, so to speak, to change the offer to option 1, returning immediately to the man his one billion dollars.
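Here, for what it is worth, is a toy Haskell rendering of the devil’s trick using callCC from Control.Monad.Cont (mtl); it only shows the basic jump-back behavior of the captured continuation, not any genuine undoing of effects.

import Control.Monad (when)
import Control.Monad.Cont

-- The devil first goes with option 2; if the man ever pays up, the captured
-- continuation lets the devil jump back and deliver option 1 after all.
devilsChoice :: Bool -> Int
devilsChoice manPays = flip runCont id $ callCC $ \changeMind -> do
  when manPays (changeMind 1)   -- revert to option 1, discarding the rest
  return 2                      -- otherwise, stick with option 2

-- devilsChoice True  == 1
-- devilsChoice False == 2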

I am reaching the conclusion that it is not the non-canonicity of the law of excluded middle that is the locus of non-constructivity.  It is the lack of a notion of the progression of time.  The devil (indeed, even God Himself) cannot change the past — despite numerous Hollywood offerings (Back to the Future, anyone?) based on the idea of changing the past — but without some restrictions, control operators can revert to an earlier time, thus potentially undoing some of the effects that have already been carried out.  As Peyton Jones is saying, if the computation just spins its wheels and does not interact at all with the outside world, then there is not really any notion of time or cause and effect which could be violated by the use of a control operator.

Another way to think of it: classical logic is the language of mathematics, where everything is considered as a static entity (notwithstanding the fact that dynamic entities like computations can be statically described).  Constructive logic is supposed to have a dynamic character: it should capture the idea, somehow, that the state of information can change over time.

So where I am thinking to go with all this on the language-design side is, rather than developing a type theory based on bi-intuitionistic logic (which we have been hard at work on doing for many months now, and which is likely a good research project anyway), to develop instead a type theory based on classical modal logic, where classical reasoning, and its corresponding type theory with full control operators, can be used at any given time, but not across time.  Reasoning (or computation) across time will be governed by the rules of a modal logic, likely S4 for reflexivity and transitivity of the modality.  Observable operations like input and output will require stepping into the future, from which no control operator can return the flow of control.  Such a type theory would provide a new approach to pure functional programming with effects: at any given instant of time, one could perform reads and writes (for example), possibly backtracking with control operators to cancel some of them out.  But when one wants to step forward in time, the most recent operations (particularly writes) are committed, and become irrevocable.  Such a type theory would hold out the promise of combining classical reasoning (at a particular time) with constructive computation (across time).  If we think of general recursion as computation over time — with each general recursive call requiring a step into the future — we can even combine Turing-complete computation with logically sound classical reasoning in this way.  That would be almost the holy grail of verified programming I have been working towards for the past 7 years or so: a single language supporting sound classical reasoning and general-purpose programming.

So to conclude: what is so nonconstructive about classical logic?  I believe it is not the failure of canonicity, but rather the lack of a notion of progression of time.  Devising type theories with this perspective in mind may bring new solutions to some old conundrums of both type theory and functional programming.  We will see.