## Category Archives: Great papers

I recently had occasion to read two papers by philosopher and logician Neil Tennant.  These were easily the most thought-provoking papers on philosophy of logic that I have read in the past two years.  One I found absolutely compelling; the other, I regret to say, I found completely and spectacularly wrong.  The writing in both was elegant and often witty, a charming change from the more workmanlike efforts of us less cultured Computer Scientists.  The papers are:

- “The Law of Excluded Middle is Synthetic A Priori, If Valid”
- “Negation, Absurdity, and Contrariety”

I have not read other things by Tennant, having only just discovered that I likely would enjoy doing so.  He seems to be a prolific writer, so I am in luck.  I chose these by title to take with me on a short trip, and am very glad I did so.

The title of the first of these is really inspiring — more so even than the contents.  Despite my college education in Philosophy, I never studied Kant, nor much about the analytic-synthetic distinction.  As Tennant summarizes it, the question is whether the truth or falsity of a sentence is determined solely by its (intrinsic) meaning, with no further contribution from the way the external world is, or whether the way the external world is contributes to determining the truth or falsity of the sentence.  In the first case (no contribution from the world) the sentence is called analytic.  In the second (some contribution from the world), it is synthetic.  Now, little as I know about this distinction, it is easy to see from the concepts in question, not to mention the custom of philosophers of exploring issues like this from a huge number of angles with endlessly nuanced arguments and positions, that this is a delicate and potentially complex matter.  But it was certainly an informative simple summary for me.

The main point of “The Law of Excluded Middle is Synthetic A Priori, If Valid” is that strictly classical principles like the law of excluded middle (LEM, $p \vee \neg p$), which Tennant says presume a meta-principle of bivalence — that every sentence is either true or false — are synthetic formulas.  They are a priori if valid, in Tennant’s view, in the sense that they are known absolutely independently (in principle?) of empirical experience.  The point is that there may be some additional metaphysical reasons for believing in bivalence, which would justify acceptance of strictly classical principles.  But there is nothing in the inherent meaning of the logical connectives that make sentences like LEM valid, if one assumes that the meaning of a logical connective is determined by its introduction and elimination rules.  Tennant seems skeptical that there is a real metaphysical justification for bivalence and hence LEM, but he is willing to allow the classical reasoner to persist in his bivalent thinking.  What he is not willing to allow is that such thinking is analytic: classical inferences should be explicitly acknowledged as synthetic.

Tennant’s position on strictly classical principles seems pretty compelling if one adopts a proof-theoretic view of logic, because at least for natural deduction systems (which Tennant seems to favor), classical principles are an add-on to the system, and do not follow the usual pattern of specifying the meaning of a logical connective via introduction and elimination rules. One simply has to postulate LEM as an axiom, or add a principle of proof by contradiction (to prove $A$, assume $\neg A$ and derive a contradiction). So those principles do not seem to arise as an intrinsic part of the meaning of the logical connectives. Tennant’s position gives us a way to live with this phenomenon: we must just realize that strictly classical principles do not arise intrinsically from the meanings of the logical connectives because those principles really are extra-logical assumptions. The world may be determinate in the sense of bivalence, and it is at least conceivable that it should fail to be so. If we want to proceed under the assumption that it is, we need to make this assumption explicit in our reasoning.
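This add-on character is visible in modern proof assistants too.  A small sketch in Lean 4 (my illustration, not something Tennant discusses): the core logic gives the connectives their introduction and elimination rules, while LEM and proof by contradiction live separately, under the `Classical` namespace.

```lean
-- In Lean 4, the strictly classical principles live apart from the
-- connectives' intro/elim rules, in the `Classical` namespace:
example (p : Prop) : p ∨ ¬p := Classical.em p

-- proof by contradiction: to prove p, assume ¬p and derive falsity
example (p : Prop) (h : ¬p → False) : p := Classical.byContradiction h
```

Everything provable without touching `Classical` is constructively valid; the classical principles must be imported explicitly, much as Tennant would have us acknowledge them explicitly.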

We can perhaps strengthen this line of reasoning if we think of logic not just as a tool for thinking about the real world, which may or may not be determinate, but as a tool for reasoning within theories, which we know from metamathematics can certainly be incomplete and hence fail to satisfy bivalence. Take the fact that the Continuum Hypothesis (CH) is independent of Zermelo-Fraenkel set theory with the axiom of choice (ZFC), a celebrated result proved in two halves by Kurt Goedel and Paul Cohen. This shows that ZFC is indeterminate with respect to CH. A more basic example is Goedel’s theorem that first-order arithmetic is incomplete. For reasoning in such theories, perhaps it is more faithful to avoid LEM as part of the logic. If we think of the theories as describing only so much of the imaginable mathematical universe, then equipping them with LEM seems wrong. ZFC does not tell us whether CH is true or CH is false. Why should we postulate that despite this indeterminacy, one or the other of CH and its negation must be true? This seems to suggest a view of mathematical theories as being about all of some pre-existing mathematical reality, where there really is a truth of the matter about CH, whether the theory tells it to us or not. But given the multiple incompatible mathematical universes one can describe with a logical theory — for example, there are consistent extensions of ZFC where CH is true and other (incompatible) extensions where it is false — it has always seemed strange to me to assume there is a single pre-existing mathematical reality that is being described by different theories. In this sense, I consider myself an anti-realist about mathematics.

But there are a couple challenges we could mount to Tennant’s position that LEM, as an appendage to the usual natural deduction proof system for (say) first-order logic, should be regarded as synthetic instead
of analytic. First, we could question the choice of natural deduction as the proof system to use for deciding what is intrinsic to a logical connective and what is not. For example, it is well-known that if we use an unrestricted multi-succedent sequent calculus, we can derive classical principles like Peirce’s law just from the rule for assumptions and the left and right rules for implication. Here is a derivation:
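(A standard such derivation, in my reconstruction — Peirce’s law is $((A \to B) \to A) \to A$, and besides the assumption rule and the implication rules, right weakening is used.)

```latex
% Peirce's law ((A -> B) -> A) -> A in an unrestricted multi-succedent
% sequent calculus, from assumptions, weakening, and the -> rules:
\[
\dfrac{\dfrac{\dfrac{A \vdash A}{A \vdash B,\ A}\;(\text{weak-R})}
             {\vdash A \to B,\ A}\;(\to R)
       \qquad
       A \vdash A}
      {(A \to B) \to A \vdash A}\;(\to L)
\qquad
\dfrac{(A \to B) \to A \vdash A}{\vdash ((A \to B) \to A) \to A}\;(\to R)
\]
```

The crucial move is the unrestricted $(\to R)$ step, which discharges the assumption $A$ while the extra succedent formula $A$ remains on the right.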

If we are willing to pass from minimal to intuitionistic logic (with a rule of explosion — also called ‘ex falso quodlibet’, or ‘from false, whatever’ — saying that from $\perp$ we can conclude any formula $A$), we can derive $((A \to \perp) \to \perp) \to A$ with a similar derivation. We have used no additional axioms embodying strictly classical principles. The unrestricted right rule for implication has sneakily introduced a subtle form of classical reasoning into the proof system: so subtle it is not so easy to see what its source is! One explanation for what is happening is that that rule is not compatible with the Kripke semantics for intuitionistic (or minimal) logic, where positive knowledge must persist in future worlds, but negative knowledge need not. We can view the presence of $A$ on the right as an assumption that $A$ is false, and hence we should not be allowed to use that assumption when we step to a new world where we are assuming $A$ is true. But this depends on a particular semantics for the logic. If we are taking the logical rules as primitive, what stops us from adopting this perspective? Likely there is some other way to find bivalence hidden in the basic form of sequents and the rules acting on them (though I do not know what it is).
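Spelling out that similar derivation (again my reconstruction, writing $\neg A$ as $A \to \perp$ and using explosion for the right premise of the left implication rule):

```latex
% ((A -> bot) -> bot) -> A, i.e., double-negation elimination, derived
% in the multi-succedent calculus plus explosion:
\[
\dfrac{\dfrac{\dfrac{A \vdash A}{A \vdash \perp,\ A}\;(\text{weak-R})}
             {\vdash A \to \perp,\ A}\;(\to R)
       \qquad
       \dfrac{}{\perp \vdash A}\;(\text{explosion})}
      {(A \to \perp) \to \perp \vdash A}\;(\to L)
\qquad
\dfrac{(A \to \perp) \to \perp \vdash A}{\vdash ((A \to \perp) \to \perp) \to A}\;(\to R)
\]
```

As with Peirce’s law, it is the unrestricted $(\to R)$ step, keeping $A$ in the succedent while discharging the assumption $A$, that smuggles in the classical strength.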

The other challenge I see to Tennant’s view is simply to deny that the proof-theoretic approach to the semantics of the logical connectives is the right one. After all, the usual boolean semantics seems pretty
natural! I assume there are sophisticated answers to be made to this, so I will leave it be for now, and turn briefly to the second paper.

“Negation, Absurdity, and Contrariety” is largely concerned with denying that $\perp$ (falsity) has a legitimate role in logical formulas. Tennant asks “Is $\perp$ a genuine propositional constant, eligible to be a constituent of other sentences?” The succinct answer he offers is “On the face of it, no; unless $\perp$ is identified with a particular absurd sentence of an already interpreted language.” He goes on to offer reasonably convincing arguments against accepting $0 = 1$ as a definition of $\perp$ (largely because now the notion of falsity is tied to some particular theory, in this case arithmetic). Now, I think we can make Tennant’s job more difficult here, by choosing a formula which is not tied to a particular theory, to be our definition of $\perp$. For example, why not take the formula $\forall x. \forall y. x = y$? Even if we do not think that equality should be considered a logical connective, it does not seem tied to a particular theory. Certainly many theories besides just arithmetic rely on a notion of equality.  So perhaps this definition of $\perp$ can better resist a charge of being overly theory-dependent. Or if that is not abstract enough, let us just switch to a higher-order logic. There, it is standard to define $\perp$ as $\forall F. F$. This definition immediately gives rise to the ex falso quodlibet rule of inference mentioned above, since if $\forall F. F$ is true, we can conclude any formula $A$ by instantiating $F$ with $A$. This is a very strong notion of falsity, as Tennant also notes (since it implies that a contradiction in one part of a theory automatically infects all other reasoning one might do).
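The higher-order definition and the ex falso rule it yields can be written out directly.  A sketch in Lean 4, using `Prop` as the impredicative universe of propositions:

```lean
-- falsity as impredicative quantification over all propositions
def Bot : Prop := ∀ F : Prop, F

-- ex falso quodlibet falls out by instantiating F with A
theorem exFalso (A : Prop) (h : Bot) : A := h A
```

The one-line proof is exactly the instantiation described in the text: from a witness of $\forall F. F$, apply it to the formula you want.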

Frank Fu has proposed a different notion of falsity that has better properties when used in type theory. We can define Leibniz equality $x =_T y$ of terms $x$ and $y$ of type $T$ to be $\forall C:T \to \textit{Prop}. C\ x \to C\ y$. So we are saying that $x$ equals $y$ if all truths about $x$ are also truths about $y$. Now define $\perp$ to be $\forall T:\textit{Type}. \forall x : T. \forall y : T. x =_T y$. Using this definition, Frank has shown we can derive $0 \neq 1$ in pure type theory, where $0$ and $1$ are Church-encoded.
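To give the flavor of how such a derivation can go — this is my own sketch in Lean 4, standing in for pure type theory, and not Frank’s actual development — the trick is to apply the Leibniz equality of the two Church numerals to a predicate that evaluates a numeral at a constant function:

```lean
universe u

-- Leibniz equality: x equals y if every property of x holds of y
def LeibEq {T : Sort u} (x y : T) : Prop :=
  ∀ C : T → Prop, C x → C y

-- Fu-style falsity: all elements of all types are Leibniz-equal
def BotEq : Prop := ∀ (T : Type) (x y : T), LeibEq x y

-- Church-encoded numerals
def CNat : Type 1 := ∀ X : Type, (X → X) → X → X
def czero : CNat := fun _ _ x => x
def cone  : CNat := fun _ f x => f x

-- 0 ≠ 1: if czero and cone were Leibniz-equal, falsity would follow,
-- since czero T (fun _ => y) x reduces to x and cone ... reduces to y
theorem zero_ne_one (h : LeibEq czero cone) : BotEq := by
  intro T x y C cx
  exact h (fun n => C (n T (fun _ => y) x)) cx
```

Given arbitrary $x, y : T$, the predicate `fun n => C (n T (fun _ => y) x)` holds of `czero` exactly when $C\,x$ holds, and of `cone` exactly when $C\,y$ holds, so equating the numerals equates everything.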

Now all these points aside, I come to my main complaint about this second paper of Tennant’s. He asks whether $\perp$ is a genuine propositional constant that one should be allowed to use in object-language sentences. He devotes significant space to arguing against defining $\perp$. I’d be willing to grant these arguments. But the real question for me is whether or not it makes sense to take $\perp$ as an undefined logical constant. And here, Tennant offers very skimpy arguments. He just says that whenever someone says something like “Contradiction!” or “That’s absurd!”, she is offering a metalinguistic commentary on whether a contradiction can be derived from some set of assumptions. And from this claim, in just a few sentences of consideration, Tennant confidently dismisses the idea of taking $\perp$ as a
propositional constant.

I find this completely inadequate as an argument against $\perp$ as a logical constant. After all, the role of object-language formulas is to reflect certain metalinguistic realities. This is certainly so from Tennant’s constructivist perspective, which emphasizes a tight agreement between the meanings of logical constants and metalinguistic inference rules. I don’t see why this same objection that Tennant is mounting — that something cannot be a logical connective if all it is doing is reflecting metalinguistic commentary — could not equally be applied to any formula construct! One could just as well say that when someone says $A_1 \wedge A_2$, all he is really doing is making a metalinguistic statement about derivability of both $A_1$ and $A_2$. So we should not use any object language at all, but speak only in our metalanguage!
(And how is its meaning to be determined?)

A variation on my objection to Tennant: truth and falsity play dual roles in many different formulations of logic. How is Tennant’s objection not to be applied, dually, to rule out the use of $\top$? (Maybe he would urge this — but then I am surprised he takes aim just at $\perp$, and neglects $\top$.)

Anyhow, much as I repudiate Tennant’s conclusion in this second paper, I found it as enjoyable and interesting reading as the first, and commend both to you.

I have been meaning to post about this wonderful paper, which you can read here, for quite some time now.  It has secured a spot as one of my two favorite papers of 2010.  In it, Nao Hirokawa and the incredibly productive Aart Middeldorp (I chaired a session at IJCAR 2010 consisting solely of papers co-authored by Aart — wow) give a new criterion for confluence, based on relative termination.  As the abstract to the paper says:

> In this paper we use the decreasing diagrams technique to show that a left-linear term rewrite system R is confluent if all its critical pairs are joinable and the critical pair steps are relatively terminating with respect to R.

I mentioned relative termination in a post earlier this year (which was supposed to provide background for talking a little about this paper, but I didn’t manage to post about it until now).  A term-rewriting system R is relatively terminating with respect to another S iff the relation $\rightarrow_{R/S}$ defined to be $\rightarrow_S^* \cdot \rightarrow_R \cdot \rightarrow_S^*$ is terminating.  This of course implies that R is terminating, since $\rightarrow_R \subseteq \rightarrow_{R/S}$: if we forbid an infinite sequence of R-steps where in between every two R-steps we are allowed also to have a finite number of S-steps, then we are certainly forbidding an infinite sequence of R-steps with no S-steps in between any two R-steps.  But we can easily describe systems R which are terminating but not terminating relative to some other systems S.
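Here is a concrete (hypothetical) instance of that last point: take $R = \{a \rightarrow b\}$ and $S = \{b \rightarrow a\}$.  $R$ alone terminates after a single step, but interleaving $S$-steps lets us take $R$-steps forever, so $R$ is not terminating relative to $S$.  A minimal sketch in Python:

```python
# R = {a -> b} is terminating, but not terminating relative to
# S = {b -> a}: interleaving S-steps lets us take R-steps forever,
# as in a ->R b ->S a ->R b ->S a ...

def step_R(t):
    return "b" if t == "a" else None   # the single R-rule

def step_S(t):
    return "a" if t == "b" else None   # the single S-rule

def rs_sequence(start, n_steps):
    """Take n_steps ->_{R/S} steps (here: one R-step, then one S-step)."""
    seq, t = [start], start
    for _ in range(n_steps):
        t = step_R(t)                  # the mandatory R-step
        seq.append(t)
        t = step_S(t) or t             # a trailing S-step (from S*)
        seq.append(t)
    return seq

print(rs_sequence("a", 3))  # ['a', 'b', 'a', 'b', 'a', 'b', 'a']
```

Since `rs_sequence` succeeds for every `n_steps`, the relation $\rightarrow_{R/S}$ admits arbitrarily long (hence infinite) sequences, even though no $\rightarrow_R$ sequence has length greater than one.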

The criterion (for left-linear rewrite systems — that is, rewrite systems where no variable is allowed to appear twice on the left-hand side of a rule, so a rule like $\textit{eq}(x,x) \rightarrow \textit{true}$ is excluded) that all critical pairs are joinable and the rewrite system consisting of critical pair steps is relatively terminating with respect to R generalizes the two best-known confluence criteria: the Knuth-Bendix criterion, which requires joinability of critical pairs (I’ll explain those in a moment) for terminating rewrite systems; and the orthogonality criterion, which requires the absence of overlaps in left-hand sides of rules, but does not require termination.  A critical pair of two (possibly the same) rewrite rules is the pair of terms you get from overlapping the two rules in a non-trivial way.  For example, consider the single rule $f(f(x,y),z) \to f(x,f(y,z))$, expressing associativity of an operation $f$.  This rule can be overlapped non-trivially with itself by unifying the whole left-hand side with the $f(x,y)$ subterm.  In this case, we get two critical pair steps: $f(f(f(x',x''),y),z) \rightarrow f(f(x',x''),f(y,z))$ and $f(f(f(x',x''),y),z) \rightarrow f(f(x',f(x'',y)),z)$.  The critical pair itself consists of the two right-hand sides of these steps.  If the critical pair steps are terminating relative to the rewrite system R (and if the set of rules is left-linear), then Hirokawa and Middeldorp have shown that we get confluence.
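For the associativity example, the critical pair is in fact joinable: both terms rewrite to the fully right-nested $f(x', f(x'', f(y,z)))$.  A small sketch in Python checking this by normalizing both sides (my own toy implementation; terms are variable names or tuples):

```python
# Checking that the critical pair of f(f(x,y),z) -> f(x,f(y,z)) is
# joinable, by rewriting both pair terms to normal form.
# A term is a variable name (string) or a tuple ("f", left, right).

def rewrite(t):
    """Apply the associativity rule once, outermost-first; None if no redex."""
    if isinstance(t, str):
        return None
    _, l, r = t
    if isinstance(l, tuple):               # root redex f(f(x,y),z)
        _, x, y = l
        return ("f", x, ("f", y, r))
    for i in (1, 2):                       # otherwise recurse into subterms
        s = rewrite(t[i])
        if s is not None:
            return t[:i] + (s,) + t[i+1:]
    return None

def normalize(t):
    while (s := rewrite(t)) is not None:
        t = s
    return t

# the two critical-pair terms from overlapping the rule with itself
p1 = ("f", ("f", "x1", "x2"), ("f", "y", "z"))
p2 = ("f", ("f", "x1", ("f", "x2", "y")), "z")
print(normalize(p1) == normalize(p2))  # True: the pair is joinable
```

Repeated right-rotation is terminating, so `normalize` always halts, and the common normal form witnesses joinability of the pair.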

The reason I am so impressed with this paper is that it generalizes these two well-known confluence criteria.  It generalizes the Knuth-Bendix criterion because it does not require termination (but still requires joinability of critical pairs).  It generalizes the orthogonality criterion because there, since there are no critical pairs, the set of critical pair steps is trivially relatively terminating with respect to the rewrite system (since we have no critical pair steps to take, we cannot possibly diverge taking them).  Also, the proof uses decreasing diagrams, an advanced characterization of confluence in terms of diagrams due to Vincent van Oostrom.  So the technical machinery required is pretty sophisticated.

Below is a picture of Nao which I took right after his presentation, with a diagram he drew illustrating the essential idea of the theorem:

I recently happened upon a paper that provides some great background on a topic relevant for our discussion of termination analysis, namely “The Art of Ordinal Analysis” by Michael Rathjen (abstract and download information below).

I am not the first blog writer (see, for example, Logic Matters of a couple years back) to notice this really helpful paper, but I’ll recommend it a bit more expansively.  For termination analysis, background knowledge of proof theory is quite helpful.  Ordinal analysis is a highly technical branch of proof theory, which can appear completely inscrutable even to readers who are comfortable with intricate theoretical matters.  Very crudely, the idea is to identify the ordinal needed to show (by proof-theoretic means) that a logical system is consistent.  Proof-theoretic consistency proofs work by showing how to transform proofs in the proof system in question into ones in a simpler system, where it can be easily shown that it is impossible to derive a contradiction.  The critical problem is to prove that this transformation of proofs is a total function.  A typical way to prove that any function is total is to define some measure on its inputs, and show that this measure is decreased at every recursive call to the function.  Of course, the measure must be decreased in some well-founded ordering, and that is where ordinals come in.  An ordinal is a canonical set-theoretic representative of an order-isomorphism class of well-founded orderings.  I recommend the sections on ordinals in Wolfram Pohlers’ “Proof Theory: An Introduction”, although they are dense, and require very patient slow reading to get through.  But you will be rewarded with a better understanding of ordinals.
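A toy illustration of the measure idea (mine, and of course nothing like ordinal analysis proper): Ackermann’s function terminates because the argument pair $(m, n)$ strictly decreases, in the lexicographic ordering on pairs (a well-founded ordering of order type $\omega^2$), at every recursive call.

```python
# Ackermann's function terminates because (m, n) strictly decreases in
# the lexicographic order -- order type omega^2 -- at every recursive
# call.  We record (caller args, callee args) pairs and check this.

def lex_less(p, q):
    return p[0] < q[0] or (p[0] == q[0] and p[1] < q[1])

calls = []  # (caller's (m, n), callee's (m, n)) for each recursive call

def ackermann(m, n):
    def rec(m2, n2):
        calls.append(((m, n), (m2, n2)))
        return ackermann(m2, n2)
    if m == 0:
        return n + 1
    if n == 0:
        return rec(m - 1, 1)
    return rec(m - 1, rec(m, n - 1))

print(ackermann(2, 3))                        # 9
print(all(lex_less(c, p) for p, c in calls))  # True: measure decreases
```

The calls $(m-1, 1)$, $(m, n-1)$, and $(m-1, \textit{anything})$ are all lexicographically below $(m, n)$, even though the second component can grow enormously — which is exactly why a simple numeric measure fails and a well-founded ordering beyond $\omega$ is needed.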

To return to the paper by Rathjen: the thing I like about it so much is that it so elegantly and simply surveys the dizzying technical landscape of ordinal analysis.  I have always been baffled when I’ve tried to get into that literature, because there are so many acronyms for different proof systems of interest.  Rathjen succinctly explains what many of these acronyms are, and why the corresponding systems are of interest.  The effort is much appreciated, since the reader can start to get a better grip on what these proof theorists are working on, and what the state of the field is.

How relevant is all this for Computer Science, and particularly type theory?  I don’t really know, because for type theory, we really are mostly interested in strong normalization (all reduction sequences are finite for every term: it is not possible to find a nondeterministic execution that runs forever) — or maybe even just weak normalization (every term has at least one finite reduction sequence to normal form, although there may also be infinite ones for some terms).  For these theorems, ordinal analysis does not appear to be needed (at least in many important cases).  The reducibility method of Tait, extended to reducibility candidates by Girard, suffices to show strong normalization of 2nd-order intuitionistic logic (“System F”), a so-called impredicative system.  We’re getting a little ahead of ourselves with all this — and we will return to these topics later (or see also the discussion in Rathjen’s paper) — but suffice it to say that for ordinal analysis, it seems (from Rathjen’s article) that analyzing impredicative 2nd-order systems is at the limits of what can be done presently.  I do not know how System F relates to those other systems, nor do I know if there is an ordinal analysis of System F: even though Girard’s proof shows strong normalization, it does not — and this is surely a feature, if all one cares about is strong normalization — make use of ordinals.  These are “novice” questions from the point of view of proof theory, I’m sure; yet another indication of the depth of the field.

Anyhow, take a look at the Rathjen article for a glimpse into this proof-theoretic world.

The art of ordinal analysis

Michael Rathjen (1)

(1) Department of Mathematics, Ohio State University, 231 W 18th Avenue, Columbus, OH 43210, United States

Ordinal analysis of theories is a core area of proof theory whose origins can be traced back to Hilbert’s programme – the aim of which was to lay to rest all worries about the foundations of mathematics once and for all by securing mathematics via an absolute proof of consistency. Ordinal-theoretic proof theory came into existence in 1936, springing forth from Gentzen’s head in the course of his consistency proof of arithmetic. The central theme of ordinal analysis is the classification of theories by means of transfinite ordinals that measure their ‘consistency strength’ and ‘computational power’. The so-called proof-theoretic ordinal of a theory also serves to characterize its provably recursive functions and can yield both conservation and combinatorial independence results.

This paper intends to survey the development of “ordinally informative” proof theory from the work of Gentzen up to more recent advances in determining the proof-theoretic ordinals of strong subsystems of second order arithmetic.

Proceedings of the International Congress of Mathematicians