Datatypes are a conundrum for designers of type theories and advanced programming languages. For doing formal reasoning, we rely heavily on datatype induction, and so datatypes seem to be essential to the language design. And they are a terrible nuisance to describe formally, adding a hefty suitcase to the existing baggage we are already coping with. For example, in the “Sep3″ Trellys design we have been developing for the past 12 months now, we are contending with a programming language with Type:Type and dependent types, plus a distinct predicative higher-order logic, with syntax for proofs, formulas, and predicates. This language is getting a bit big! And then one has to describe datatypes, and you need an Ott magician to make it all work (fortunately we do have one, but still).  What a hassle!

Can’t we just get rid of those darn datatypes?  The Pi-Sigma language of Altenkirch et al. and recent proposals of McBride et al. are out to do this, or at least minimize what one needs to accept as a primitive datatype.  In this post, I want to talk about the problems you encounter if you lambda-encode your datatypes, and why I think we might be able to get around these in Sep3 Trellys.

There are two terrible snarls that we face with lambda-encoding datatypes, and they both have to do with eliminations (that is, with the ways we are allowed to use data that we have or assume have already constructed).  The first snarl is multi-level elimination.   Suppose you want to allow the construction of data by recursion on other data (standard), together with construction of types by recursion on data (exotic: “large eliminations”).  Then you have a big problem right away with using lambda-encoded data, because lambda-encoding with either the Scott or Church encoding requires fixing a level for elimination of your data.  Let me say more about this, looking for the moment just at the well-known Church encoding.  In this encoding, a numeral like 2 is encoded as $\lambda s. \lambda z. s (s z)$.  I am writing this Curry-style (without typing annotations) in the hopes of making it more readable.  The type you want to assign to this term is then $\forall A:\textit{Type}. (A \rightarrow A) \rightarrow A \rightarrow A$.   The numeral 2 has been encoded as its own iterator: given a function $s$ of type $A \rightarrow A$ for some type $A$, and a starting point $z$ of type $A$, the numeral 2 will apply $s$ twice starting with $z$.  The problem with this encoding is that it is forced to pick a level of the language for $A$.  Here, we said that $A$ is a type, so values computed by 2 must be at the term level of the language.  So we cannot use this encoding of 2 if we want to compute at the type level.  For that, we would need the type $\forall A:\textit{Kind}. (A \rightarrow A) \rightarrow A \rightarrow A$, and that is quite bad since if we are going to have impredicative quantification over kinds as well as types, we will lose normalization (I don’t have a precise reference for this, but heard it through the type theory grape vine, attributed to Thorsten Altenkirch, that impredicativity is a trick we  can only play once.)   Anyhow, even if we had such impredicativity, we would need to assign multiple types to this encoded 2, and that sounds very worrisome: users are carefully crafting annotations to show a single typing, and it would become quite wild, a priori, to be crafting annotations to achieve multiple typings.

The second snarl is dependent elimination.   If we want to use Church-encoded 2 for writing dependently typed functions, or (as is the same under the Curry-Howard isomorphism) proving theorems, the simple polymorphic type we just considered is not good enough.  From an encoded natural number $n$, we want to produce data of type $C\ n$, for some type constructor $C$.  So instead of $\forall A. (A \rightarrow A) \rightarrow A \rightarrow A$, we really want something like $\forall C. (\forall n. (C n) \rightarrow (C (n+1)) \rightarrow (C\ 0) \rightarrow \forall n. (C\ n)$.  But this looks problematic, as we are trying to define the $\textit{nat}$ type itself!  What types can we possibly give to $C$ and $n$ in that definition?  Their types need to use $\textit{nat}$ itself, which we are trying to define.  Now we could use recursive types to try to get around that problem, but the situation is actually even worse: the pseudo-type I wrote is supposed to be something like the definition for $\textit{nat}$.  That is, we would expect our encoding of 2 to have that type.  But then we see that even as a pseudo-type, it does not make sense.  We actually want 2 to have the type $\forall C. (\forall n. (C n) \rightarrow (C (n+1)) \rightarrow (C\ 0) \rightarrow (C\ 2)$.  That is, its type should mention 2 itself!  In general, we want a natural number $N$ to have type $\forall C. (\forall n. (C n) \rightarrow (C (n+1)) \rightarrow (C\ 0) \rightarrow (C\ N)$.  How can something’s type mention the thing itself?  This reminds me of very dependent function types (see Section 3 of Formal Objects in Type Theory Using Very Dependent Types by Jason Hickey, for example).  It seems pretty exotic.

Now in the context of our Sep3 Trellys design, there is actually a little light here through all these complex difficulties for lambda encodings.  First of all, we have Type : Type in the programming part of Sep3.  So we could hope that the problem with multi-level elimination would go away.  It actually doesn’t with our current design, because we have adopted a predicative higher-order logic on the logic side, which has actually forced us to add a universe hierarchy on the programming side.  So we have Type 0 : Type 0, and then Type 0 : Type 1 : Type 2 …  Now it occurs to me that all this trouble would just go away, and the language design would get much simpler, if we just bit the bullet and adopted an impredicative higher-order logic.  I immediately lose a lot of confidence that our logical language would be normalizing — but that’s what we do careful metatheoretic study for.  There’s no reason to assume that impredicativity will cost us consistency, so let’s not be pessimistic.

If we do not stratify the logical side of the language, we need not stratify the programming side, and so we won’t have multiple levels at which to eliminate an encoded piece of data.  This would be great, as it would (almost) knock down the first objection to lambda-encoding datatypes.  There’s another issue one would have to worry about, though, and that is eliminating data in proofs (since Sep3 distinguishes proofs and terms).  I am going on a bit too long, and my thinking isn’t settled yet, but there’s one very nice thing we could hope for about eliminating data in proofs.  Instead of using the Church encoding, we could use the Scott encoding.  Here, 2 is encoded as $\lambda s. \lambda z. s\ 1$.  Data are encoded as their own case-statements, rather than as their own iterators.  Why is this relevant?  Well, I have to note first that now we really will need recursive types of some kind for this, because intuitively, we want the (simple) type for Scott encoded 2 to be $\forall A.(\textit{nat} \rightarrow A) \rightarrow A \rightarrow A$.  Note the reference to $\textit{nat}$ in the type assigned for $s$.  We need to use a recursive type to allow $\textit{nat}$ to be defined in terms of itself like this.  But here’s the cool observation: Garrin Kimmell and Nathan Collins proposed for Trellys (both Sep3 and U. Penn. Trellys) that we base induction on an explicit structural subterm ordering in the language.  We have an atomic formula $a < b$ for expressing that the value of terminating $a$ is structurally smaller than the value of terminating $b$.  The obvious but beautiful observation is that with the Scott encoding, we really do have $1 < 2$.  Note that this is not true with the Church encoding, since $\lambda s. \lambda z. s\ z$ is not a subterm of $\lambda s. \lambda z. s\ (s\ z)$.

Ok, I’ve gone on too long and haven’t gotten to address the problem with dependent elimination yet.  But suffice it to say that lambda-encoding datatypes, which was always eminently attractive, is starting to look like a possibility for Sep3.

1. Hello Aaron,

For the fact that impredicativity does not “stack”, this fact is a consequence of the inconsistency of Girards system U- which does exactly that: integrate imprediative polymorphism at the kinding level. A very nice proof of this fact, due to Coquand, involves encoding the proof that impredicativity is not set-theoretical within the type theory itself, see:

T. Coquand, A New Paradox in Type Theory http://www.cse.chalmers.se/~coquand/nyparadox.ps

Of interest might be: Geuvers, Induction is not derivable in second-order dependent type theory, which is a negative result about church encodings of natural numbers similar to the one you mention, but it is preceded by an interesting discussion about how to potentially encode inductive types with an interesting dependent inductive principle in an impredicative type theory. I don’t know what happens with his “clever encoding” in the case with universes (his proof obviously breaks down in that case).

2. Thanks a lot for the references, Cody! I am particularly interested in taking a look at the article by Geuvers you mentioned.

3. I could also point you to this discussion on Oleg’s blog:

About representations of datatypes in lambda-calculus. This might be less interesting for your needs, but i think it bears reminding that the church natural numbers are not adequate computationally: there is no way to compute the predecessor of a church natural number in constant time.

I would also like to expand on the comment above: H. Geuvers explains that if we define Nat to be:

Nat = forall X. X -> (X -> X) -> X

And 0 and successor defined in the usual way

Then the type

forall P: Nat -> Prop, P 0 -> (forall m. P m -> P (S m)) -> forall n, P n

is not inhabited in the CoC.

If we define

Ind n = forall P: Nat -> Prop, P 0 -> (forall m. P m -> P (S m)) -> P n

And then define

Nat_1 = exists x:Nat, Ind x

Then Geuvers says:

forall P: Nat_1 -> Prop, P 0 -> (forall m. P m -> P (S m)) -> forall n, P n

Is uninhabited as well.

In Type : Type, this type is of course inhabited as are all types, but the question becomes, can it be inhabited by something that has the correct computational behaviour? If not is there some variation of the trick which makes this work?

• I’m not sure how well known this is, but it is fairly easy to show for various elementary datatypes that parametricity (and possibly extensionality) implies strong induction principles for Church encodings. For instance:

forall R. R -> R

is provably uniquely inhabited by appeal to parametricity, and one can also define strong elimination of dependent pairs. I don’t think I’ve ever managed to work through the proof for Church numerals, but I expect it’s possible for those, as well (I have done binary sums).

One problem is finding a computational interpretation of parametricity, though. Jean Philippe Bernardy has done some work on this with microAgda, and wrote a paper I believe. But I remember being a little perplexed by the results; parametricity was given some sort of proof irrelevant interpretation, which I didn’t expect since it appears to enable us to do computation of some sort. But I won’t claim my intuition for the area is particularly well developed.

The other issue is that this is still no solution to the problem of how to do large elimination, which you really want to prove things like disjointness of constructors.

• Dan Doel
• Posted April 3, 2012 at 8:39 pm

As an addendum: I suppose the answer to the parametricity thing may be: it doesn’t allow us to do any additional computation, just prove that the computation we can already do has a stronger type. But I don’t really know.

• I must admit that I am incapable of finding the proof of the “strong elimination” principle for the church numerals, given the parametricity theorem (and extensionality). I also am incapable of solving the problem (of proving the strong elimination principle with the right computational properties) when placing myself in the Type:Type theory, even with increasingly complex definitions of Nat.

I would be very grateful if you could send any solutions my way if you find them :)…

The only thing i remember when i perused Bernardy’s papers and blog posts is how much it reminded me of Harper and Licata’s “Canonicity for 2-dimentional Type Theory”. I would also like to know if this is just a coincidence…

• Dan Doel
• Posted April 4, 2012 at 11:27 am