## Category Archives: Termination analysis

While working over the weekend with Chris Casinghino and Vilhelm Sjöberg at U. Penn. and Garrin Kimmell here at Iowa on the Trellys core language design and prototype implementation, I had occasion to recall a very tricky way in which overly liberal datatype definitions can lead to inconsistency, even if we do not assume a terminating recursor for all datatypes.  I believe I learned this trick from Benjamin Wack, although at this point I am not sure.  The code (in our new Trellys core-language syntax) is here:

```
data Empty -> Type 0 where
{}

data Loop -> Type 0 where
  CLoop : (f:((b:Loop) => Empty)) => Loop

prog loop_step : (b : Loop) => Empty
loop_step = \ b . case b [beq] of CLoop f -> f b

prog loop : Empty
loop = loop_step (CLoop loop_step)
```

Notice that this example does not use any explicit recursion (rec or recnat in Trellys), yet it manages to define loop of type Empty.  It can only do so, as indeed it does, by being an infinite loop.  The trick here is that the constructor CLoop for the Loop type takes in a function from Loop to Empty as its argument.  Terminating type theories like Coq do not allow this, and neither does the logical fragment of the Trellys core language.  Constructors that take in arguments whose types mention the type being constructed in a negative position must take those arguments in as programmatic arguments, rather than logical arguments.  Trellys distinguishes programmatic terms from logical ones, while allowing the two to interact.  The above example is programmatic, as shown by the use of the thick arrow in several places, and the “prog” keyword for loop_step and loop.  The thick arrow means that the input may be programmatic.  A thin arrow for function types means that the argument can only be logical, not programmatic.  If we replace the thick arrows above with thin ones and drop “prog”, the example will not type check (at least, it won’t as soon as Chris and/or Vilhelm implement the positivity check :-)).
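For readers more comfortable with Haskell, the same phenomenon is easy to see there: a datatype with a negative occurrence of itself yields a fixed-point combinator, and hence general recursion, without any syntactic recursion at all.  Here is a minimal sketch (the names Rec, fix', and fact are mine; Haskell of course permits this, being a programming language rather than a logic):

```haskell
-- A datatype mentioning itself in a negative position, like Loop above.
newtype Rec a = Rec (Rec a -> a)

-- Curry's fixed-point trick: note there is no explicit recursion below.
fix' :: (a -> a) -> a
fix' f = w (Rec w)
  where
    w r@(Rec g) = f (g r)

-- General recursion recovered purely from the negative datatype:
fact :: Integer -> Integer
fact = fix' (\rec n -> if n == 0 then 1 else n * rec (n - 1))
```

Replacing the body with `fix' id` would give a diverging inhabitant of any type, which is exactly the kind of inconsistency the positivity check in the logical fragment must rule out.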

Anyhow, I thought it was worthwhile to share this example, since it shows that datatypes which are fine for programming can be dangerous for logic, even if we do not assume a terminating recursor for them.

I have been meaning to post about this wonderful paper, which you can read here, for quite some time now.  It has secured a spot as one of my two favorite papers of 2010.  In it, Nao Hirokawa and the incredibly productive Aart Middeldorp (I chaired a session at IJCAR 2010 consisting solely of papers co-authored by Aart — wow) give a new criterion for confluence, based on relative termination.  As the abstract to the paper says:

In this paper we use the decreasing diagrams technique to show that a left-linear term rewrite system R is confluent if all its critical pairs are joinable and the critical pair steps are relatively terminating with respect to R.

I mentioned relative termination in a post earlier this year (which was supposed to provide background for talking a little about this paper, but I didn’t manage to post about it until now).  A term-rewriting system R is relatively terminating with respect to another system S iff the relation $\rightarrow_{R/S}$, defined as $\rightarrow_S^* \cdot \rightarrow_R \cdot \rightarrow_S^*$, is terminating.  This of course implies that R is terminating, since $\rightarrow_R \subseteq \rightarrow_{R/S}$: any infinite sequence of pure R-steps is in particular an infinite sequence of $\rightarrow_{R/S}$-steps (just take zero S-steps between consecutive R-steps).  But we can easily describe systems R which are terminating but not terminating relative to some other system S.
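To make the definition concrete, here is a tiny Haskell sketch (all names mine) of $\rightarrow_{R/S}$ over a finite abstract rewrite system, with R = {a → b} and S = {b → a}: R is terminating on its own, but not relatively terminating with respect to S.

```haskell
import qualified Data.Set as Set

-- An abstract rewrite system: a step relation on a finite carrier.
type Rel a = [(a, a)]

step :: Eq a => Rel a -> a -> [a]
step r x = [y | (x', y) <- r, x' == x]

-- All elements reachable in zero or more steps (reflexive-transitive
-- closure, assuming only finitely many elements are reachable).
reach :: Ord a => Rel a -> a -> [a]
reach r x0 = go Set.empty [x0]
  where
    go seen [] = Set.toList seen
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = go (Set.insert x seen) (step r x ++ xs)

-- One ->_{R/S} step: ->_S^* . ->_R . ->_S^*
relStep :: Ord a => Rel a -> Rel a -> a -> [a]
relStep r s x = [z | y <- reach s x, y' <- step r y, z <- reach s y']

rR, rS :: Rel Char
rR = [('a', 'b')]   -- R terminates on its own: 'b' has no R-step
rS = [('b', 'a')]   -- but S undoes R's work
```

Here `relStep rR rS 'a'` contains 'a' itself (via a →_R b →_S a), so →_{R/S} admits the infinite sequence a →_{R/S} a →_{R/S} …, even though →_R alone is terminating.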

The criterion applies to left-linear rewrite systems — that is, rewrite systems where no variable is allowed to appear twice on the left-hand side of a rule, as in $\textit{eq}(x,x) \rightarrow \textit{true}$.  It requires that all critical pairs are joinable and that the rewrite system consisting of the critical pair steps is relatively terminating with respect to R.  This generalizes the two best-known confluence criteria: the Knuth-Bendix criterion, which requires joinability of critical pairs (I’ll explain those in a moment) for terminating rewrite systems; and the orthogonality criterion, which requires the absence of overlaps in left-hand sides of rules, but does not require termination.  A critical pair of two (possibly identical) rewrite rules is the pair of terms you get from overlapping the two rules in a non-trivial way.  For example, consider the single rule $f(f(x,y),z) \to f(x,f(y,z))$, expressing associativity of an operation $f$.  This rule can be overlapped non-trivially with itself by unifying the whole left-hand side with the $f(x,y)$ subterm.  In this case, we get two critical pair steps: $f(f(f(x',x''),y),z) \rightarrow f(f(x',x''),f(y,z))$ and $f(f(f(x',x''),y),z) \rightarrow f(f(x',f(x'',y)),z)$.  The critical pair itself consists of the two right-hand sides of these steps.  If the critical pair steps are relatively terminating with respect to the rewrite system R (and if the set of rules is left-linear), then Hirokawa and Middeldorp have shown that we get confluence.
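The joinability of this particular critical pair is easy to check mechanically.  The following Haskell sketch (representation and names mine) normalizes terms with respect to the associativity rule and confirms that both sides of the critical pair rewrite to the same normal form:

```haskell
-- First-order terms over a binary symbol f and variables.
data Term = V String | F Term Term deriving (Eq, Show)

-- Normalize with the rule f(f(x,y),z) -> f(x,f(y,z)): normal forms are
-- right-nested, i.e. the left argument of every F is a variable.
norm :: Term -> Term
norm (V v)   = V v
norm (F l r) =
  let l' = norm l
      r' = norm r
  in case l' of
       F a b -> norm (F a (F b r'))   -- apply the associativity rule
       _     -> F l' r'

-- The two sides of the critical pair from the example above:
cp1, cp2 :: Term
cp1 = F (F (V "x'") (V "x''")) (F (V "y") (V "z"))
cp2 = F (F (V "x'") (F (V "x''") (V "y"))) (V "z")
```

Both sides normalize to $f(x',f(x'',f(y,z)))$, so the pair is joinable.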

The reason I am so impressed with this paper is that it generalizes these two well-known confluence criteria.  It generalizes the Knuth-Bendix criterion because it does not require termination (but still requires joinability of critical pairs).  It generalizes the orthogonality criterion because an orthogonal system has no critical pairs, so the set of critical pair steps is trivially relatively terminating with respect to the rewrite system (since we have no critical pair steps to take, we cannot possibly diverge taking them).  Also, the proof uses decreasing diagrams, an advanced characterization of confluence in terms of diagrams due to Vincent van Oostrom.  So the technical machinery required is pretty sophisticated.

Below is a picture of Nao which I took right after his presentation, with a diagram he drew illustrating the essential idea of the theorem:

### Background

[This report is by Duckki Oe.] $\texttt{T}^\texttt{vec}$ is dependently typed (see the paper). For example, the vector type <vec T n> is indexed by a type T (making it polymorphic) and by a term n (making it dependently typed).

In a dependently typed system, two syntactically distinct types may be deemed “equal”. That is not the case in the other systems we discussed in the seminar, in which syntactically distinct types are indeed disjoint. For example, <vec nat 1> and <vec nat 0+1> are supposed to be equal, because the two index terms can be equated by evaluating them. So, equality reasoning is a crucial point of dependently typed systems. At the same time, because proving an equality is generally undecidable (as in FOL/HOL), proofs are necessary as a part of the code, and the equalities the code proves are used to cast one type to another.
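GHC Haskell can approximate this with type-level naturals and a GADT of vectors.  It is only an analogue (the index is a promoted type-level term, not a genuine program term), but it shows the checker equating <vec nat 0+1> with <vec nat 1> by evaluating the index.  All names below are mine:

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, KindSignatures #-}

data Nat = Z | S Nat

-- Addition on type-level naturals, computed by the type checker.
type family Plus (m :: Nat) (n :: Nat) :: Nat where
  Plus 'Z     n = n
  Plus ('S m) n = 'S (Plus m n)

data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

toList :: Vec n a -> [a]
toList VNil         = []
toList (VCons x xs) = x : toList xs

-- The index 0 + 1 evaluates to 1, so this one-element vector is accepted
-- even though its declared index is written differently from 'S 'Z.
v :: Vec (Plus 'Z ('S 'Z)) Int
v = VCons 3 VNil
```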

### Unannotated $\texttt{T}^\texttt{vec}$

The type system of $\texttt{T}^\texttt{vec}$ has two type-level abstractions (Pi and Forall). The motivation for this distinction is that some arguments are only used for equality reasoning or typing, not for computation. A Forall type is the type of a function that takes a specificational argument, and that argument can only be used in specificational positions. So, the deal here is that specificational information can be dropped when we reason about the equality of terms.

Also, $\texttt{T}^\texttt{vec}$ sports equality types, which are atomic formulas of the equality predicate and can be introduced by the “join” term. A term of an equality type is considered a proof of the equality, and that value can be passed to a function that takes (or assumes) that equality. A join term justifies the equality of two terms via evaluation. (That means you can reach the same term by evaluating both.) Join cares only about the semantic equality of terms, not types. However, the equality of types can be derived from the equalities of the terms inside those types.
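A rough GHC analogue of join (again with my own names, and with type-family reduction standing in for evaluation): an equality whose two sides reduce to a common form is proved just by Refl, while an equality that needs induction, such as $n + 0 = n$, requires a genuine proof.

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, TypeOperators, KindSignatures #-}

import Data.Type.Equality ((:~:)(Refl))

data Nat = Z | S Nat

type family Plus (m :: Nat) (n :: Nat) :: Nat where
  Plus 'Z     n = n
  Plus ('S m) n = 'S (Plus m n)

-- "join"-style equality: both sides evaluate to the same type, so the
-- checker accepts Refl with no further argument.
zeroPlus :: Plus 'Z n :~: n
zeroPlus = Refl

-- By contrast, (Plus n 'Z :~: n) is true but NOT accepted by Refl alone:
-- it requires induction on n, the analogue of needing a real equality
-- proof rather than a join.
```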

A good thing about join is that we don’t need to worry about typing information and specificational arguments, because they don’t affect computation (evaluation) at all. And because specificational arguments can get in the way of proofs, it is better not to have them when we do reasoning. So, the paper presents the unannotated $\texttt{T}^\texttt{vec}$, where terms carry no typing information or specificational arguments at all. This idea is very similar to that of the Implicit Calculus of Constructions. The consequence is that type assignment/checking over those terms is undecidable. However, that is OK because there is a fully annotated version of $\texttt{T}^\texttt{vec}$ which is decidable and sound with respect to the unannotated version. The meta-theoretic analysis of unannotated $\texttt{T}^\texttt{vec}$ is important because, even in the annotated $\texttt{T}^\texttt{vec}$, the “join” proofs are based on evaluating unannotated terms.

### Annotated $\texttt{T}^\texttt{vec}$

In the annotated version, terms carry all the type annotations and specificational arguments, the latter being distinguished from other (computational) arguments. Marking arguments as specificational is used in Guru (with the “spec” keyword) and discussed in other papers like Erasure and Polymorphism in Pure Type Systems (and also in ICC*). The annotated version looks obviously sound because dropping annotations wouldn’t do anything wrong for typing, and thus terms with their annotations dropped should still be typable.

### Large Eliminations and Inconsistent Assumptions

Large eliminations are types that are computed by evaluating terms. For example, IfZero n T1 T2 is a type formed by large elimination, where n is a term, and T1 and T2 are types. The whole type is T1 if the term n evaluates to 0, and otherwise T2. Large eliminations can be used for generic programming and proving.
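In GHC, a closed type family gives a rough analogue of such a large elimination (the scrutinee is a promoted type-level number rather than a true term, and the names are mine):

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, KindSignatures #-}

import Data.Kind (Type)

data Nat = Z | S Nat

-- A type computed by inspecting a (type-level) number, like IfZero n T1 T2.
type family IfZero (n :: Nat) (t1 :: Type) (t2 :: Type) :: Type where
  IfZero 'Z     t1 t2 = t1
  IfZero ('S m) t1 t2 = t2

exZero :: IfZero 'Z Bool Int
exZero = True        -- the type reduces to Bool

exSucc :: IfZero ('S 'Z) Bool Int
exSucc = 42          -- the type reduces to Int
```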

In the $\texttt{T}^\texttt{vec}$ above, one cannot type a looping or stuck term. However, with large eliminations added, one can give a type to a looping or stuck term by making inconsistent implicit assumptions. To prevent a term from being evaluated under inconsistent assumptions, we can leave behind, when annotations are dropped, the lambda abstractions that bind those assumptions. Until all implicit assumptions are proved, such terms are not evaluated (they are considered values). This treatment preserves the strong normalization property of $\texttt{T}^\texttt{vec}$ with large eliminations.

The Trellys project, which I’ve mentioned before, is working to design and implement a CBV functional programming language with dependent types, using community-based processes to take advantage of expertise from the dependent types and the functional programming communities.  Our first PIs’ meeting was June 30th through July 2nd, here at The University of Iowa.  In attendance were:

• Tim Sheard and Ki Yung Ahn from Portland State University.
• Stephanie Weirich, Chris Casinghino, and Vilhelm Sjöberg (U. Penn., though Vilhelm is visiting Iowa this summer).
• Harley Eades III, Frank (Peng) Fu, Garrin Kimmell (just joining U. Iowa from Kansas U.), Duckki Oe, and myself.
• Adam Procter from U. Missouri (visiting me at Iowa this summer).

We had a very productive meeting, and have managed now, I think, to integrate several language ideas we had been exploring up to now: compile-time versus run-time arguments to functions, type-level computation (a.k.a. “large eliminations”), general recursive versus terminating recursive functions, equality types $t_1 = t_2$ interpreted as something like joinability in the operational semantics of the erasures of $t_1$ and $t_2$, and bootstrapping terminating recursion for datatypes from terminating recursion on natural numbers.

At a high level, the language is divided into two parts: a proof part and a programming part.  These parts are not distinguished syntactically, as I had done previously in Guru and as is done in other systems like Feferman’s System W (referred to in an earlier post on this blog).  Rather, the typing judgment distinguishes which expressions can soundly be viewed as falling into the proof part, and which can only be viewed as programs.  The main judgment is $\Gamma \vdash t : T\, \theta$, which means that $t$ has type $T$ in context $\Gamma$ (as usual); and $t$ is a proof expression if $\theta$ is “proof”, and a program expression if $\theta$ is “program”.  Some expressions can be safely viewed as both proofs and programs: namely, constructive proofs.  Non-constructive proofs cannot be viewed as programs, since we do not wish to give an operational semantics for them (there are well-known computational interpretations of classical logic that would allow us to do this, but it does not fit our purpose in this language design).  General recursive functions (i.e., ones falling outside the fragment of the language that the typing rules conservatively identify as terminating) cannot be viewed as proofs.  This categorization of expressions as terminating vs. non-terminating and constructive vs. non-constructive leaves open a fourth possibility: general recursive non-constructive expressions.  But I do not see any use for these at the moment.  Indeed, even including non-constructive reasoning in the proof part is already a bit new, since most dependent type theories (with the notable exception of Luo and Adams‘s systems of logic-enriched type theory) are constructive.

To me, the most important feature we nailed down for this language is its support for what I think of as freedom of speech.  Indeed, in an amazing coincidence, the beautiful seminar room in the School of Journalism which we used for our three-day meeting had a large reproduction of the First Amendment of the U.S. Constitution on the wall.  You’ll see it when I get the group pictures from Tim or Stephanie.  Anyhow, the principle of freedom of speech is that it is ok to speak about anything you want; speaking about something is different from actually affirming it.  I can talk about violence without advocating it, for example.  In the Trellys core language, we want to be able to state and prove theorems about general-recursive programs.  So we want to write down a proof $\pi$ about a general-recursive program $P$.  This $\pi$ will, in general, have to mention $P$ — for example, to state that it is non-terminating, or perhaps to prove that it is terminating and enjoys some property.  As I mentioned above, our basic typing judgment in Trellys uses a qualifier $\theta$ to keep track of whether or not the expressions in question are proof expressions or program expressions.  To allow proofs to mention programs and yet still be counted as proofs, some of our typing rules have to be non-strict in this qualifier: some premises of the rule indicate that an expression is a program, and yet the conclusion will still indicate that other expressions are proofs.  An example is one of the application rules (we have several).  If the argument is syntactically a value, then it can be a program, and the application itself can still be a proof (if the function part of the application is a proof).  
Restricting to a value here makes it possible to define the semantics of types which fall into the proof part of the language — types which we can soundly view as formulas — without being forced to define a deep semantics for more general types, which could include some pretty hair-raising features in Trellys (type:type, general-recursive computation of type expressions, etc.).  So it is more correct to say that proofs can talk about program values — but these can be quite arbitrary (including functions which would diverge if applied, or which require really rich typing features in order to type check).

So our next steps will be writing up the typing rules in detail (we have an Ott file that already has a lot of the most important details filled in), and starting to work on a prototype implementation, using some cool libraries that Stephanie and Tim have been working with to provide clean and elegant support for data with name-binders, in Haskell.  It’s going to be fun.

[This post is by Tianyi Liang.] In the last week’s graduate seminar, we discussed Harley Eades’ proof of normalization of a predicative polymorphic type theory called Stratified System F. Harley’s proof uses the idea of hereditary substitutions, due to Watkins and Pfenning.

Before presenting the proof, and also in the previous seminar, we discussed techniques for proving normalization of simply typed and polymorphic lambda calculi, also known as the Tait-Girard reducibility method. In simply typed lambda calculi, the terms are defined as $x \mid \lambda x:\phi . t \mid t\ t$, while in Stratified System F we have an extended term definition with type variable abstraction, written $\Lambda X : K . t$, and type instantiation, written $t\{ \Phi \}$. In the simply typed calculus, types are defined as $X \mid \Phi \rightarrow \Phi$, while in Stratified System F they are extended with $\forall X : K . \Phi$. Also, in Stratified System F, a kinding relation is introduced, which relates a level to a type with respect to some context, using algorithmic type/kind-checking rules.

To better understand quantified types, we discussed their convenience: in the simply typed language, we can assign many similar types to the same term, for example $\lambda x . x : nat \rightarrow nat$, $\lambda x . x : T \rightarrow T$, and so on. With quantification, we may assign the single type $\forall X . X \rightarrow X$ instead. Another example, regarding type variable abstraction: define a term $t = \Lambda X: *_2 . \lambda y : X . y$, of type $\forall X : *_2 . (X \rightarrow X)$. Then $t \{ \Phi \} =_\beta \lambda y : \Phi . y : \Phi \rightarrow \Phi$. We also discussed the well-formedness of contexts. An example: $y:X$ is not well-formed, while $X : *_2 , y : X$ is well-formed.
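The instantiation step $t \{ \Phi \} =_\beta \lambda y : \Phi . y$ can be played out on a small AST. This Haskell sketch (constructor names mine, and capture-naive: it assumes the substituted type variable is not rebound) implements type substitution and one instantiation step:

```haskell
data Kind = Star Int deriving (Eq, Show)

data Ty = TVar String | Arr Ty Ty | ForallT String Kind Ty
  deriving (Eq, Show)

data Tm = Var String | Lam String Ty Tm | App Tm Tm
        | TLam String Kind Tm | TApp Tm Ty
  deriving (Eq, Show)

-- Substitute type p for type variable x in a type (capture-naive).
subTy :: String -> Ty -> Ty -> Ty
subTy x p t = case t of
  TVar y
    | y == x    -> p
    | otherwise -> TVar y
  Arr a b       -> Arr (subTy x p a) (subTy x p b)
  ForallT y k b
    | y == x    -> ForallT y k b
    | otherwise -> ForallT y k (subTy x p b)

-- Substitute type p for type variable x in a term.
subTm :: String -> Ty -> Tm -> Tm
subTm x p t = case t of
  Var y     -> Var y
  Lam y a b -> Lam y (subTy x p a) (subTm x p b)
  App f a   -> App (subTm x p f) (subTm x p a)
  TLam y k b
    | y == x    -> TLam y k b
    | otherwise -> TLam y k (subTm x p b)
  TApp f a  -> TApp (subTm x p f) (subTy x p a)

-- One instantiation step: (\X:K. t){p}  ~~>  [p/X]t
tyInst :: Tm -> Maybe Tm
tyInst (TApp (TLam x _ b) p) = Just (subTm x p b)
tyInst _                     = Nothing

-- t = \X:*2. \y:X. y, the example term from the seminar.
t :: Tm
t = TLam "X" (Star 2) (Lam "y" (TVar "X") (Var "y"))
```

Instantiating t with a type, e.g. `tyInst (TApp t (TVar "Nat"))`, yields `Just (Lam "y" (TVar "Nat") (Var "y"))`, i.e. $\lambda y : \Phi . y$ as in the example above.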

There are two differences between Harley’s kind-checking rules and Leivant’s. One is in the variable rule, where Harley added the assumption $p \leq q$; the other is in the forall-type rule, where Harley changed the kind level from $max(p,q+1)$ to $max(p,q)+1$.

In our discussion, we put more focus on the proof of lemma 12, which is substitution for the interpretation of types. The lemma itself shows the interpretations of types are closed under substitutions. The hereditary substitution function lies at the heart of the substitution lemma. In fact the substitution function, denoted $[t_2/x]^\Phi t_1$ where terms $t_1$ and $t_2$, type $\Phi$, and free variable $x$ are inputs, can be constructively extracted from the proof of the substitution lemma. The hereditary substitution function for Stratified System F is terminating.
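To give a flavor of the function, here is a hedged Haskell sketch of hereditary substitution for just the simply typed fragment (the actual development handles Stratified System F; names are mine, and the code is capture-naive). The key point is that substitution immediately reduces any redex it creates, recursing at the smaller type annotation, which is what drives the termination argument:

```haskell
data Ty = Base | Arr Ty Ty deriving (Eq, Show)

data Tm = Var String | Lam String Ty Tm | App Tm Tm
  deriving (Eq, Show)

-- [t2/x]^phi t: substitute t2 (of type phi) for x in t, hereditarily
-- reducing any redexes created. Capture-naive: assumes bound variables
-- are distinct from x and from the free variables of t2. The phi
-- argument is what indexes the termination measure in the real proof.
hsub :: Tm -> String -> Ty -> Tm -> Tm
hsub t2 x phi t = case t of
  Var y
    | y == x    -> t2
    | otherwise -> Var y
  Lam y a b
    | y == x    -> Lam y a b
    | otherwise -> Lam y a (hsub t2 x phi b)
  App f a ->
    let a' = hsub t2 x phi a
    in case hsub t2 x phi f of
         -- substitution exposed a redex: keep substituting, now at the
         -- lambda's (smaller) domain type b
         Lam y b body -> hsub a' y b body
         f'           -> App f' a'
```

For example, substituting the identity $\lambda y{:}Base.\,y$ for $f$ in $f\ z$ performs the created beta step on the fly: `hsub (Lam "y" Base (Var "y")) "f" (Arr Base Base) (App (Var "f") (Var "z"))` returns `Var "z"`, already in normal form.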

Harley defined the interpretation of types on normal terms and then extended this definition to non-normal terms. Also, Harley mentioned that his definition of the interpretation of types is equivalent to the following: $n \in [| \Phi |]_\Gamma$ iff $\Gamma \vdash n : \Phi$, where $n$ is a normal term, $\Gamma$ is a context, and $\Phi$ is a type. The harder thing to observe is that Harley’s definition of the interpretation of types does not explicitly state that $\Gamma$ is well-formed, but this is inferred from the fact that his definition is restricted to kindable types. Thus, by lemma 1 in his paper, we can infer that $\Gamma$ must be well-formed, which establishes the equivalence.

Then, we discussed the main result, soundness of typing with respect to the interpretation of types: if $\Gamma \vdash t : \Phi$ then $t \in [| \Phi |]_\Gamma$ (theorem 13). We remarked that, given the extension of the interpretation of types to non-normal terms, this soundness result lets us conclude that Stratified System F is in fact normalizing.

Since the definition of the hereditary substitution function is so central to the proof, Vilhelm mentioned a style of formalized proof where the lambda expressions are defined as a datatype indexed by their type, and the type system enforces that all functions on them are type preserving. The hope would be that anything we write would look like the “extracted substitution function” in the paper, but with the types indexed enough that it itself expresses the proof. However, it wouldn’t be quite so easy, because in addition to the fact that the hereditary substitution function is type preserving, we also need to know that it acts like several reduction steps, and that if there is no redex in the argument there will be no redex in the result. Tracking all that through indexed types seems a bit much, so a formalized development may still need to have some external proofs about it.
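The style Vilhelm described is familiar from GHC: with GADTs one can define terms indexed by their context and type, so that every function on them is type-preserving by construction. A small sketch of the idea (all names mine):

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeOperators, KindSignatures #-}

import Data.Kind (Type)

-- De Bruijn indices into a type-level context.
data Ix (env :: [Type]) (t :: Type) where
  Here  :: Ix (t ': env) t
  There :: Ix env t -> Ix (s ': env) t

-- Terms indexed by context and type: ill-typed terms are simply
-- unrepresentable, so any function on Tm is type-preserving.
data Tm (env :: [Type]) (t :: Type) where
  Var :: Ix env t -> Tm env t
  Lam :: Tm (a ': env) b -> Tm env (a -> b)
  App :: Tm env (a -> b) -> Tm env a -> Tm env b
  Lit :: Int -> Tm env Int

data Env (env :: [Type]) where
  Nil  :: Env '[]
  Cons :: t -> Env env -> Env (t ': env)

look :: Env env -> Ix env t -> t
look (Cons x _) Here      = x
look (Cons _ e) (There i) = look e i

-- An evaluator whose type preservation is checked by GHC; a hereditary
-- substitution function would be indexed in the same way.
eval :: Env env -> Tm env t -> t
eval e (Var i)   = look e i
eval e (Lam b)   = \x -> eval (Cons x e) b
eval e (App f a) = eval e f (eval e a)
eval _ (Lit n)   = n
```

As the discussion points out, this indexing captures only type preservation; the facts that hereditary substitution performs reduction steps and produces redex-free output would still need external proofs (or much heavier indexing).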