Greetings, QA9 reader(s).  It has been a while since I posted, for which I offer the usual excuses.  I have a backlog of interesting technical things I’m hoping to write about sometime pretty soon: failure of subject reduction in lambda2 with eta-contraction, which I learned about maybe a year ago but did not record in a post; also intriguing things I am reading about deep inference, although there the problem is the stuff is, well, deep and I haven’t understood it well enough yet to have too much to say.  I also had high hopes of posting about the super fun summer I had here at U. Iowa working on Cedille with great new U. Iowa graduate students Chad Reynolds, Ananda Guneratne, and Richard Blair; great visiting Indiana doctoral student Matthew Heimerdinger; and inspiring visitors Stéphane Graham-Lengrand, Olivier Hermant, and Vincent St-Amour.  In a perfect world I would have some pictures to post of all this stimulating interaction, but here I am photo-less.

But what impels me to write today is getting bitten again by performance problems with well-founded recursion in Agda.  Just for fun, I will tell you the story.  We are working on this new interactive theorem prover called Cedille based on dependently typed lambda encodings.  You can read about this in the project report we just revised to submit for the TFP 2016 post-proceedings.  Thanks to the determination of Matthew (Heimerdinger), we have a highlighting mode — actually several — for Cedille in our emacs interface.  One funny wrinkle was that we could not highlight comments, because our parser was just dropping them, and we did not want to store information about them in our abstract syntax trees (which would then become very cluttered).  We hit upon a daring (for us, anyway) solution: parse input files a second time, with a different grammar that is recognizing just whitespace and comments.  It is nice to find the whitespace, too, in addition to the comments, since then we can set highlighting to the default for those spans of the text in the emacs mode.  Otherwise, with the way our interaction between emacs frontend and Agda backend works (see that revised project report), if you type in whitespace you will likely get a non-default highlighting color, from the span containing that whitespace.

Ok, so we (Richard) wrote the grammar for this, and we started producing comment and whitespace spans to send to the frontend, only to see some very sluggish performance on a couple longer files.  We misdiagnosed this as coming from doing lots of string appends (with what correspond to slow strings in Haskell; i.e., lists of characters), and I dutifully changed the code to print the spans to a Haskell Handle, rather than computing one big string for all the spans and then printing that.  Distressingly, this optimization, which took maybe an hour or so to implement, with finding the right Haskell stuff to hook up to, did not improve performance.  Well shucks!  Let’s get out our handy Haskell profiler and see what is going wrong.  Now sadly, I am still on Agda 2.4.2.4, which lacks this nifty agda-ghc-names tool that is part of Agda 2.5.1 now.  That tool translates Haskell names that are generated by the Agda compiler, back to names from the Agda source.  But this tool was not needed to see the following:

COST CENTRE MODULE %time %alloc

d179.\.\.\ MAlonzo.Code.QnatZ45Zthms 35.9 18.8
d179.\ MAlonzo.Code.QnatZ45Zthms 11.6 48.8
d191.\ MAlonzo.Code.QnatZ45Zthms 9.6 7.5
d179 MAlonzo.Code.QnatZ45Zthms 5.7 0.0
d191 MAlonzo.Code.QnatZ45Zthms 5.2 0.0
d179.\.\ MAlonzo.Code.QnatZ45Zthms 4.7 11.3

Oh pondwater.  Over half the time is being spent in functions in nat-thms.agda in the IAL.  That file contains, as the name suggests, just theorems about the natural numbers — not code I expect to be running intensively during execution of the Cedille backend.  But I have seen this before, as I noted.  These are lemmas getting called again and again during a well-founded recursion, to show that certain less-than facts hold.  Which well-founded recursion was it?  It turns out it was for the function that converts a natural number to a string.  We are doing that operation many times, to print character positions in the input Cedille source file.  Just removing the well-founded recursion and marking the nat-to-string function as needing NO_TERMINATION_CHECK reduced the running time to less than half what it was.

The moral of this story is the same as the moral of the previous story: be wary of well-founded recursion in Agda for code you intend to run (as opposed to just reason about).

Have a blessed start of fall!

Last week I had the great pleasure of visiting the amazing PL group at Indiana University.  They are doing so much inspiring research in many different areas of programming languages research.  It was great.  The collective knowledge there was very impressive, and I got a lot of great references to follow up on now that I am back.

I gave two talks: one in the logic seminar, and one for the PL group.  In the first talk, one issue I was discussing was an argument, originally sketched by Michael Dummett and developed more fully by Charles Parsons in a paper called “The Impredicativity of Induction” (you can find a revised version of this paper in a book called “Proof, Logic, and Formalization” edited by Michael Detlefsen) — this argument claims to show that induction is impredicative.  Impredicativity, as many QA9 readers well know, is the property of a definition D of some mathematical object x that D refers to some collection or totality of which x is a member.  Philosophical constructivists must reject impredicative definitions, because they believe that the definition are constructing or creating the entities defined, and hence cannot presuppose that those entities already exist in the course of defining them.  This kind of philosophical constructivism must be distinguished from a kind of formal constructivism a type theorist is likely to care about, which I identify with a canonical forms property: every inhabitant of a type A should have a canonical form which is part of the essential nature of type A.  Computational classical type theory violates this formal canonicity property, because types like $A \vee \neg A$ are inhabited by terms which are neither left nor right injections.

Here is Parsons’s argument as I understand (or maybe extrapolate?) it.  A philosophical constructivist must reject impredicative definitions, such as the kind one gives for the natural numbers in second-order logic (see my talk slides for this example definition).  As an alternative, one may try to say that the meaning of the natural number type is determined by introduction and elimination rules (similarly to the meanings of logical connectives).  The introduction rules are about the constructors of the type.  And the elimination rule is an induction principle for the type.  But then, Parsons and Dummett point out, this explanation of the natural number (or other) inductive type is not any better off with respect to impredicativity than the second-order definition, because the induction principle allows us to prove any property P of n, assuming as premises that Nat n holds and that the base and step cases of induction are satisfied for P.  But the predicate P could just as well be Nat itself, or some formula which quantifies over Nat.  The latter seems to provoke worries, which I do not understand.  It seems already bad enough that P could be Nat.  So just moving from a single formula defining Nat to a schematic rule defining Nat cannot save one from the charge of impredicativity.  Hence, for avoiding impredicativity, there is no reason to prefer one approach (specifically, the rule-based approach) over the other.

The alternative to this is to reject the idea that we must give a foundational explanation of the natural-number type (or predicate).  We just accept the idea of numbers and other datatypes as given, and then induction is a consequence — not part of some definition of numbers, which we are refusing to give.  This is a coherent way to avoid impredicativity, but at the cost of having to accept numbers with no deeper analysis — something that a logicist might not like, for example (though I do not know for sure how a logicist would likely view this situation).

I am giving a talk this afternoon titled “From Logic with Love” as part of a special class organized by U. Iowa English Prof. Judith Pascoe, where professors from different departments give talks from the perspective of their discipline on the topic “What We Talk About When We Talk About Love”.  The slides are here.  I am also including some links and notes below, for students interested in further reading.

The short story Beginners, by Raymond Carver, can be found here.  This is the original version of the story that was published as “What We Talk About When We Talk About Love”.

Ray Monk’s biography of Bertrand Russell, while loathed by admirers of Russell, is extremely interesting and revealing reading.

Constance Reid’s biography of David Hilbert is also very interesting and enjoyable reading.  It includes a reprint of much of the text of his famous 1900 Paris lecture on future problems of mathematics.

Some Computer Science news articles related to the talk:

Other related articles:

A Man For Others by Patricia Treece is an excellent read about Maximilian Kolbe.  It contains first-hand reports and recollections from many different people who knew the saint.

When you hear “Craig’s Theorem”, you probably think of the Craig Interpolation Theorem, and rightly so.  This is the most famous theorem (I suppose) of philosopher and logician William Craig.  So I was surprised to find out there is another “Craig’s Theorem” that is even cooler for those of us interested in lambda calculus.  It is well known that every lambda term can be simulated using a combinator term written using combinators S and K.  Have you ever wondered if there is a way to simulate all lambda terms using just a single proper combinator Q (proper here means that the combinator is defined by a reduction rule showing how to reduce $Q\ x_1\ \ldots\ x_k$ to some term built from applications of the variables $x_1, \ldots, x_k$)?  Craig’s (other) Theorem says no.  There is a very short (2 pages) proof given here (sorry, this is behind a paywall, I think).  I can give a simple variation of this proof here.

Suppose there were such a combinator Q.  Since it is, by assumption, combinatorily complete, there must be some way to write the identity combinator I using just Q.  Consider the reduction sequence $I\ x\ \to^* M\ \to\ x$, where $M\ \to \ x$ is the last step in this reduction sequence (which must be of length at least one since $I\ x$ is not the same as $x$).  Since $M$ reduces, it must be of the form $Q\ c_1\ \ldots\ c_n$ for some combinators $c_1,\ldots,c_n$.  But this tells us that $Q$ must be projecting one of the inputs $c_1,\ldots,c_n$ to the final result (and one of those $c_i$ must be $x$).  So $Q$ is a projection combinator.   This part is from the linked paper by Bellot.  Here is now an alternative conclusion.  If a sequence of reductions consists solely of projections of inputs, then that sequence must be terminating, because at each step, the size of the term is decreased by doing a projection.  But the lambda calculus includes nonterminating terms, so a projection function cannot be a sufficient basis for the lambda calculus.  Hence there is no one-combinator basis for the lambda calculus (and we need at least two, such as S and K).

Fun stuff!

In our seminar for the Computational Logic Group at Iowa, we have been studying Logic Programming for the past couple semesters.  We are continuing this semester, and have been reading “The Stable Model Semantics for Logic Programming” by Gelfond and Lifschitz (available on Citeseer here), which is a classic paper describing a semantics for logic programs with negation.  I was interested to read this paper, as I am interested in datatypes in type theory and programming languages with negative occurrences (in the types of the constructors) of the type being defined.  I am hoping that maybe some ideas from LP could possibly carry over to semantics for such datatypes.  This is intellectually a bit of a longshot, but the paper is quite nice reading for anyone interested in computational logic.

Anyway, I want to record here a few notes from the discussion this past Friday, of Alain Mebsout, Nestan Tsiskaridze, Ruoyu Zhang, Baoluo Meng, and myself. First, some background from the paper.  A clause is an implication where there is a conjunction, possibly empty, of literals (either atomic formulas or negated atomic formulas) in the antecedent of the implication, and a single atomic formula as the consequent.  The authors work just with ground (i.e., variable-free) clauses, since a set of nonground clauses can be modeled as the (possibly infinite) set of all its ground instances.  A program Pi is a set of clauses.  The authors define an operation which I will denote here Pi/M, where M is a set of atomic formulas.  The operation drops “not A” from the formula for every atom A which is not in M.  And it drops every clause that contains “not A” (in the antecedent) where A is in M.  The intuition is that we can pretend M is exactly the set of atoms we know to be true, and then simplify the clauses of Pi by looking at negated literals.  If you negate an atom which is in M, then you are negating something which we are pretending is true, and hence the negation is definitely false.  A clause containing such a literal is satisfied if we view M as determining a model (namely, the term model where exactly the atoms in M are made true by the interpretations of the predicates used in the atoms).  So those clauses do not add any further information and can be dropped.  From an operational perspective, such clauses would give rise to unsatisfiable (in M) subgoals, and so cannot possibly help to prove anything.  Similarly, if a clause contains “not A” where A is not in M, then the literal “not A” is true in M, and hence cannot contribute anything as an antecedent of an implication.  So that is why it is dropped from the clause.

If after performing this simplification the minimal set of atoms satisfying Pi is exactly M, then M is called a stable set for Pi.  If we suppose the atoms in M are exactly the true ones and simplify away all negated literals, then the resulting formula’s minimal model is again M.  So our assumed knowledge M is consistent with what we can deduce from it from the program.  The paper gives examples of programs which have no stable sets, and which have more than one minimal stable set (i.e., a stable set none of whose subsets is stable).

On Friday, we worked through the proof of Theorem 1 of the paper, which states that if M is a stable set of Pi, then it is also a minimal model of Pi.  This led us to consider a pair of propositions (which we did not take from the paper).  Suppose M is a stable set of Pi, and suppose M’ is another set of atoms.

Proposition 1: if M’ satisfies Pi/M, then M’ satisfies Pi.

Proposition 2: if M’ satisfies Pi, then M’ satisfies Pi/M.

We delighted ourselves by managing to refute Proposition 2 and prove Proposition 1.  For the refutation: a counterexample is if M is { p }, M’ is { q }, and Pi is {“p if not q”}.  Pi/M is then just { p }, since we will drop the “not q” from the antecedent of “p if not q”.  So M is stable, since the minimal model of Pi/M is M.  M’ does satisfy Pi, because by making q true, M’ makes “p if not q” vacuously true (since “not q” is false).  But M’ does not satisfy Pi/M, since it does not make “p” true.

For the proof of Proposition 1, we considered an arbitrary clause “A if As and not Bs” in Pi (where I am writing As for a vector of atoms and “not Bs” for a vector of negated atoms Bs).  Suppose that clause does not have a corresponding one in Pi/M.  This means that some Bi must be in M, since the operation Pi/M drops a clause if it contains a literal “not B” where B is in M.  Since M is a stable model, Pi/M must entail Bi.  If not, M would not contain Bi.  Since M’ satisfies Pi/M by assumption for Proposition 2, this means Bi is also in M’, and hence M’ also satisfies the clause we are considering from Pi.

Now suppose that the clause “A if As and not Bs” from Pi does have a corresponding clause in Pi/M.  The clause in Pi/M must be just “A if As”, since the transformation which retains a clause going from Pi to Pi/M just drops all negated literals from the clause.  Since M’ satisfies Pi/M by assumption, it satisfies this clause “A if As”.  Weakening the antecedent by conjoining in some more literals cannot change that fact. So M’ satisfies the original clause “A if As and not Bs” from Pi.  So M’ satisfies every clause in Pi, as required by Proposition 2.