Why do type theories or some advanced programming languages need a termination analysis? What benefit arises from knowing that certain terms are terminating and certain functions are total? There are two main benefits, which can be summed up rather cryptically as:
1. because we want to run those programs.
2. because we don’t want to run those programs.
Let me try to explain these puzzling statements.
In type theories and in programming languages with dependent types, it is possible for types to contain terms. An example that has been used a lot is that of a list type where the type itself specifies the length of the list. So instead of just have the type <list A>, for lists of elements of type A, we have <list A n> for lists of length n of elements of type A. The type itself tells us the length n of the list. What is the benefit of having the length of the list be part of the type? Now we can express relationships between the lengths of input and output lists in list-manipulating functions. The frequently used example is the type for the append function:
append : <list A n> –> <list A m> –> <list A (n+m)>
This type expresses that the length of the output list is the sum of the lengths of the input list, a nontrivial property of the append function.
Notice that now we have code (namely n+m) inside a type. For many languages with dependent types (but not the Guru language I work on), we must now be very careful. During type checking, it often happens that we need to check whether or not two types are equal. The notion of equality used for type theories like the Calculus of Inductive Constructions (CIC), implemented in the widely used Coq tool, includes evaluation of terms that appear in the types. For example, in CIC, the type <list A (2+2)> will be considered equal to the type <list A (1+3)> and to the type <list A 4>. Testing equality of types thus requires running code during type checking (so, at compile time rather than run time). This leads us finally to the first main reason for using a termination analysis: we want to make sure that the code we run when testing equality of types is always going to terminate. If we allow general-recursive functions to be called in types, testing type equality becomes undecidable. If all terms which appear in types are guaranteed to terminate, then we can retain decidability of type equality. So we need termination analysis for certain terms “because we want to run them”, in particular, at compile time, during type-equality testing.
The second reason dependently typed languages need termination analysis for certain terms is “because we don’t want to run them”. Let me explain now this puzzling phrase. The Curry-Howard isomorphism is a remarkable correspondence between types and formulas in constructive logic. For the most basic example, a constructive proof of the formula “P –> Q” is a function that takes in a proof of P and returns a proof of Q. So the implication “P –> Q” of constructive logic corresponds to the function type “P –> Q” in typed lambda calculus. There is a lot more to be said about the Curry-Howard isomorphism than this basic point, of course, but this is already enough to explain the second main use of termination analysis. We can only view total functions from P to Q as proofs of the implication P –> Q. If we have a function of type P –> Q that might not terminate on some inputs, then we cannot view this function as a proof that P implies Q. Otherwise, we could prove invalid statements like “True –> False” using functions that do not terminate. We need to know that for any input of type P, the function would indeed terminate with an output of type Q. So we need termination analysis because we do not want to run this function — indeed, for the stronger reason that we cannot confirm the function terminates on all inputs just by running it, if the domain type P has infinitely many elements.
So these are the main reasons we need termination analysis: in many type theories we want to know that code which appears in types is guaranteed to terminate, so we can run it during type-equality checking; and we want to view total functions from P to Q as proofs that P implies Q. Of these two reasons, the second is the most pressing. We can always weaken our type equality to make sure that it is decidable. For example, in Guru, type equality does not include any evaluation of terms; evaluation of terms is instead done in a more controlled manner where the user indicates which terms to evaluate (and a global cutoff can be imposed if necessary to ensure decidability). But there is no getting around the fact that if we want to interpret (soundly!) functions as proofs of implications, then we need to know those functions are total.
In my next post on termination analysis, I will start commenting on some relatively recent papers from the literature, on termination analysis.