In type theory and logic, predicativity is the property that quantifications do not create objects of the same class as the variable being quantified over. So one cannot quantify over formulas and get a formula. Instead, one gets a formula’ — some class distinct from formulas. In type theory, predicative quantification means that an expression which is quantifying over types is not itself a type (Haskell works like this, which surprised me). Often (for example, in both Coq and Agda), one sees a countable hierarchy of type universes: a quantification over Type(x) lives in universe Type(x+1). Impredicativity allows one to stay at the same level. So in Coq, which has an impredicative universe called Prop at the bottom of the universe hierarchy, quantifications over Prop are still in Prop. From a type theory perspective, impredicative polymorphism is a quantification of types which still gives you a type, of the same kind that can be used for instantiating such quantifications. Indeed, it is the possibility of instantiating a quantifier with the entire quantified formula itself that makes analysis of impredicativity challenging. Challenging, but well understood: Girard discovered the way to prove normalization in the presence of impredicative polymorphism for his System F (see his co-authored book Proofs and Types).
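To make the type-theory side concrete, here is a small Haskell sketch (the names `polyId`, `PolyId`, and `selfApp` are mine). GHC is predicative by default, so the impredicative move — applying the polymorphic identity at its own quantified type — has to be smuggled through a newtype:

```haskell
{-# LANGUAGE RankNTypes #-}

-- The polymorphic identity, a System F term of type  forall a. a -> a.
polyId :: forall a. a -> a
polyId x = x

-- Impredicativity is permission to instantiate such a quantifier at a
-- polymorphic type -- even at the quantified type itself.  GHC is
-- predicative by default, so we hide the polytype behind a newtype:
newtype PolyId = PolyId (forall a. a -> a)

-- Morally, polyId applied to itself: here polyId is instantiated at
-- the (monomorphic) wrapper type PolyId.
selfApp :: PolyId
selfApp = polyId (PolyId polyId)

main :: IO ()
main = case selfApp of PolyId g -> print (g (3 :: Int))
```

Without the newtype wrapper, instantiating `polyId` directly at `forall a. a -> a` is rejected unless `ImpredicativeTypes` is enabled — which is exactly the predicativity restriction described above.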

Forget squabbles over comment characters (I do think I prefer Haskell/Agda’s now over OCaml’s — sorry!): people can really tangle over predicativity versus impredicativity. I have heard second-hand that a well-known type theorist is against impredicativity because he distrusts a trick that can only be used once. This refers to the fact that we cannot have a layered language with impredicativity in two layers: Girard proved this leads to inconsistency, with his analysis of System U; see the discussion by Coquand here. This is why in Luo’s Extended Calculus of Constructions (ECC), which is part of the type theory of Coq, only the base level of the universe hierarchy is impredicative, and the other levels are predicative.

I have to say I find this criticism, that we should distrust tricks that can only be used once, unconvincing. For impredicativity is a heck of a trick. The best part is that it enables lambda encodings. I am on a personal quest to rehabilitate lambda encodings, and I am working on new type theories that support dependently typed programming and proving with lambda encodings. A first step along these lines is my work with Peng Fu on self types, which you can read about here. This is not the place to go into all the details, but impredicativity is absolutely essential. With only predicative quantification, there really is no way to make lambda encodings practical.
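As a taste of what impredicativity buys, here is a minimal Haskell sketch of the Church encoding of the naturals (standard System F material, not the dependently typed encodings of the self-types work; names are mine). The encoding only typechecks because quantifying over types still yields a usable type:

```haskell
{-# LANGUAGE RankNTypes #-}

-- Church-encoded naturals: a number is its own iterator.
type CNat = forall a. (a -> a) -> a -> a

czero :: CNat
czero _ z = z

csucc :: CNat -> CNat
csucc n s z = s (n s z)

-- Note: defining addition as  \m n -> m csucc n  would instantiate
-- m's quantifier at CNat itself -- an impredicative instantiation
-- that GHC rejects by default.  This version stays predicative:
cadd :: CNat -> CNat -> CNat
cadd m n s z = m s (n s z)

toInt :: CNat -> Int
toInt n = n (+ 1) 0

main :: IO ()
main = print (toInt (cadd (csucc czero) (csucc (csucc czero))))
```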

Furthermore, if impredicativity is the trick that can only be used once (and hey, let’s use it), predicativity is the trick you have to keep using again and again and again. To avoid repeating code and datatype definitions at each level of the hierarchy (which one quickly finds will be needed for a lot of practical examples), we have to resort to level polymorphism. Now a level-polymorphic type is not in any level we can access. Why not just extend to higher ordinals? Oh boy. And of course, we have to solve level constraints that the type system imposes. This complicates type checking.

So predicativity is out for me, including ECC-style predicative hierarchies over impredicative base levels. But impredicativity is not getting off so easily. Not for funny philosophical reasons or because we have to use it carefully — and we are treading the line here, as the more expressive we make our logic, the closer we skirt the edge of inconsistency — but because it too has its own peculiar restrictions. I have just been learning about these, and indeed, they are the impetus for writing this post. Did you know that you cannot perform large eliminations on so-called impredicative datatypes in Coq? And that these datatypes are the reason Coq enforces strict positivity for datatype declarations (the datatype cannot be used to the left of an arrow — even to the left of an even number of arrows — in the type of an argument to one of its constructors)? Well, if you are reading this post, you probably did. But I did not. The counterexample is from a 1988 conference paper by Coquand and Paulin-Mohring. Their presentation is, for me anyway, a bit hard to read through, but thankfully Vilhelm Sjöberg transcribed it into Coq, with helpful comments, here.
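For concreteness, here is what a nonstrictly positive declaration looks like, sketched in Haskell (names are mine). `Bad` occurs to the left of two arrows — an even number — in its constructor's argument type, exactly the case Coq's strict positivity check rejects; Haskell, which makes no consistency promises, accepts it:

```haskell
-- A nonstrictly positive datatype: Bad occurs (positively, but not
-- strictly positively) in its own constructor's argument type.
-- Coq rejects the analogous declaration; Haskell does not.
newtype Bad = Bad ((Bad -> Bool) -> Bool)

example :: Bad
example = Bad (\k -> k (Bad (\_ -> True)))

-- We can happily compute with such values in Haskell.
runBad :: Bad -> Bool
runBad (Bad f) = f (\(Bad g) -> g (const False))

main :: IO ()
main = print (runBad example)
```

This is only an illustration of the positivity condition, not the Coquand/Paulin-Mohring counterexample itself, which additionally needs a large elimination.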

Reading through this example has been challenging, not just because it is tricky (I still have not managed to get a great intuitive grasp of how it works), but also because I have been worried it might apply to the type theory I am developing as the next step in the work on dependently typed lambda encodings. That would be very troubling, as I have a 20-page proof of consistency of that system! Did I mess up my proof? Which features of ECC + inductive types are actually needed for Coquand and Paulin-Mohring's example? As Vilhelm wrote it, the example does use the predicative hierarchy of ECC, not just Prop. I am not 100% convinced that it could not be carried out in Prop alone, though.

After much puzzling, I think I understand why this example would not affect the system I am working on, which allows impredicative nonstrictly positive datatypes, and for which I believe I will soon have an extension with large eliminations. This sounds miraculous, because Coq forbids nonstrictly positive datatypes and also forbids large eliminations of impredicative datatypes. I am saying you could have both simultaneously and be consistent. That must be wrong, right? Well, wrong (maybe). The difference is that I am working in a Curry-style type theory, similar to the Implicit Calculus of Constructions (ICC) of Miquel, with several additional features for dependently typed lambda encodings. As an aside: while I have the highest admiration for Miquel's work, his system adds several strange typing rules that prevent a reasonable metatheory and violate a number of aesthetic guidelines for type theory. My system does not follow Miquel on those points — though we do not yet have the definable subtyping which is the most intriguing practical point of ICC, albeit one not yet, to my knowledge, capitalized on in any system or even other papers.

In the system I am working on, terms are simply the terms of pure untyped lambda calculus. All the action is in the types we can assign to such terms, and in the kinding of those types. For practical purposes, one will eventually need to design an annotated version of such a language, since inferring a type for a pure term of lambda calculus, even just in System F (a subsystem of my system), is undecidable. But for metatheoretic analysis, this Curry-style development, where terms are completely unannotated, is fine. In fact, it is more than fine: it is enlightening. ECC is based on a Church-style approach to the syntax of terms. Types are really legitimate parts of terms, which could be computed at run time (though nothing interesting could be done with them at run time). This, by the way, is another point of ICC I do not agree with: there, types can "really" be parts of terms, even though they do not have to be.

I locate the central issue with the Coquand/Paulin-Mohring counterexample to large eliminations with nonstrictly positive inductive types in the fact that the element of the inductive type in question can store a type (actually, something of type (A → Prop) → Prop, where A is the inductive type being defined). This type is really like a piece of data inside the element of the inductive type. It can be retrieved and processed, and this leads to the paradox. Rather than blocking large eliminations with inductive types, I propose to shift the ground completely by using a Curry-style theory. Then types cannot be stored inside elements of datatypes, and thus cannot be retrieved and manipulated. I believe this would block that counterexample, while still allowing both nonstrictly positive inductive types and large eliminations over impredicative inductive types. I have a proof that this works for the first of these features, and I believe I shall have a proof for the second (large eliminations) in the next few months.
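A loose Haskell analogy for the "stored type" point (an analogy only, not the Coq counterexample; names are mine): an existential package stores a type, but Haskell offers no large elimination to inspect or compute with it — all one can do is apply the operations packaged alongside it, which is closer in spirit to the Curry-style situation where types are not data inside terms:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- An existential package: the constructor "stores" a type a, together
-- with a value of that type and one observation on it.
data Package = forall a. Package a (a -> Int)

-- The stored type cannot be recovered or case-analyzed; we can only
-- use the supplied observation.
observe :: Package -> Int
observe (Package x f) = f x

main :: IO ()
main = print (observe (Package "hello" length)
              + observe (Package (21 :: Int) (* 2)))
```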

Hoping to have stirred the pot, I wish anyone reading to this point all the best in their investigations of type theory, or whatever other subjects deprive them of sleep and fire their minds.

## 13 Comments

I think this makes a lot of sense categorically: an actual sigma type (T : Type) * A(T) — a coproduct, from which you can get the T back out — has to live in Type+1; but when you want to restrict yourself to the subcategory that corresponds to Type, it seems you have to take the colimit instead (at least in some models, like weak existentials in parametric models of System F), and that essentially makes the T field unavailable for large elimination.

Thank you for this positive comment, saizan. I wish I knew categorical semantics enough to draw on it for ideas, as you have done!

Don’t know which well known guy said this about not trusting impredicativity because it cannot be repeated but I would partially agree with him/her. Even if you have impredicativity you need still a predicative hierarchy. And then it is at least weird if certain constructions work only at the lowest level. Let’s say you use impredicative encodings to define inductive datatypes like List. Then you can form List Nat : Type(0) but you cannot define List at level 1 to form List Type. 😦

Anyway, this hasn’t much to do with trust, but with weirdness.

The trust issue is more related to the fact that I like to do things I can understand intuitively. Inductive datatypes, predicative universes, etc. all make sense. But I don’t really understand impredicative universes, because they seem circular. It seems that having only one of them is sound, but I haven’t understood why. Michael Rathjen seemed to suggest that we can understand impredicativity as the limit of predicative theories. But he is very clever and I am not, so I rather trust what *I* can understand.

As far as encoding datatypes impredicatively goes, this seems to be the dual of an explanation. I mean, I understand inductive datatypes, I don’t understand impredicative universes, and you tell me inductive datatypes can be reduced to an impredicative universe. Great! 🙂

I propose not to use a predicative hierarchy at all. For type-level data structures (which I think I personally never used), I would rather use a term-level data structure with codes for types, and then use a large elimination to compute a type from this data structure as needed. Maybe this will not work in practice — but while I have done type-level programming, I have so far not ended up needing data structures with types in them. This could certainly be lack of experience in that area on my part.
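A sketch of what I mean, in Haskell terms (an analogy only; names are mine): a term-level datatype of codes for types, with a closed type family over the promoted datatype playing the role of the large elimination that computes a type from a code:

```haskell
{-# LANGUAGE DataKinds, TypeFamilies #-}

-- Term-level codes for (simple) types.
data Code = CNat | CArr Code Code

-- A closed type family computing a type from a code -- Haskell's
-- nearest analogue of a large elimination over the datatype of codes.
type family Interp (c :: Code) where
  Interp 'CNat       = Int
  Interp ('CArr a b) = Interp a -> Interp b

-- A value whose type is computed from a code.
ex :: Interp ('CArr 'CNat 'CNat)
ex = (+ 1)

main :: IO ()
main = print (ex 41)
```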

About explanations, I think the quest to understand deeply how these objects can be conceived is quite worthwhile. But if we have a sound type theory that allows us to use abstractions for which we do not yet have such an understanding, is that a problem? Maybe it can even help us gain the understanding that, as you highlight, should be sought: if we try these abstractions out in a type theory we know is consistent, then maybe we will gain insights. My goal is not to have a really powerful foundation, beyond what we understand. My goal is to have a very simple foundation, which will make things like mechanized metatheory (for that foundation itself) more tractable.

Regarding your work with Peng Fu, have you already worked out a translation into System S for indexed containers? Do you think it would also be possible to encode inductive-recursive types?

Thanks for your question, Alexander. Sadly, I do not know what indexed containers are! Can you enlighten me? About induction-recursion: I read the note by Capretta recently posted to the dependent types Reddit (see http://www.cs.nott.ac.uk/~vxc/publications/induction_recursion.pdf). This is very cool. One could always use his idea to go from induction-recursion to CIC, and then from there to something like System S. It would also be interesting to see if a more direct translation were possible — but I haven’t thought about that yet.

Indexed containers provide a very general and well-behaved notion of generalized W-types that subsumes mutual inductive definitions, “small” induction-recursion, and the like; see strictlypositive.org/indexed-containers.pdf, http://www.cs.nott.ac.uk/~txa/publ/ICont.pdf, and https://personal.cis.strath.ac.uk/conor.mcbride/pub/SmallIR/SmallIR.pdf. I strongly assume that this System S of yours is capable of representing them, but seeing a formal translation recipe would help a lot.

Induction-recursion in its large form (suitable for defining types of type codes closed under Pi-type formation, also known as internally defined Tarski-universes) is not yet completely understood, but this seems to be a major breakthrough in this area: https://pigworker.wordpress.com/2015/01/01/irish-induction-recursion/.

If your system is capable of representing these as well, you’d embrace the whole diversity of sophisticated inductive dependent types known to date.

The only notion of type formation not covered by these that I’m aware of is Hickey’s notion of very dependent types, an explosive mixture of inductive and Pi-types. How about expressing those?

I missed a source: http://www.cs.swan.ac.uk/~csetzer/articles/fibred_lics2013.pdf. It develops a notion of fibered datatypes, a direct extension of indexed containers that also subsumes both induction-recursion and indexed induction-recursion. It contains a very fundamental open question: it is not clear whether fibered types are closed under composition. The breakthrough blog entry I posted above (the one about IRish codes by Conor McBride) strongly suggests that they are, and additionally gives a notation for them which is a lot easier to understand and use.

Having nonstrictly positive datatypes in impredicative Prop is no problem. But I guess you want to have them also in the whole predicative Type hierarchy above Prop? That could be challenging; I’d be interested to see your proof.

As I was saying, I plan not to have a predicative hierarchy above the impredicative universe (shocking, I know). For those applications — and I would love a pointer to a paper that describes some, as I am ignorant of what these are — that use data structures containing types, I propose to use large eliminations with term-level data structures containing codes for types.

Note that the tricky part for Coq is not the nonstrictly positive datatypes, but the large eliminations for impredicative datatypes. This issue shows up in practice, it seems. For example, Capretta runs into it in his note posted recently to the dependent types Reddit, on implementing induction-recursion in CIC. I am saying I believe such eliminations could be allowed if one shifted to a Curry-style approach to typing, rather than Coq’s Church style. Likely (if this indeed works), the result could be mapped back to a different restriction for Coq (something like: large inputs to constructors are not accessible via pattern matching on the constructor application), rather than the disappointingly artificial prohibition on large eliminations for impredicative datatypes.

Hi, Alexander. Thanks for the links and explanation about indexed containers. They sound quite powerful — I will have to take a further look. I have not been aiming to subsume all the intriguing forms of advanced datatypes that are out there. It is true that since the systems I am considering are impredicative, there is a lot of expressive power there; perhaps it can be used for some of these examples. That has not been my focus, though. About very dependent function types: I conjecture these can be done in the follow-on system to System S that I have been working on. In this successor system, self types are replaced by dependent intersection types (introduced by Kopylov), along with a new form of recursive type. Kopylov’s paper talks about very dependent function types, but does not claim they are reducible to dependent intersections. To me they seem quite close: the very dependent function type { f | x : A -> B[f,x] } could be rendered as iota f : T1 -> T2 . A -> B[f,x], where I write iota x : A . B for the dependent intersection type; this translation does require there to be some other typing T1 -> T2 for f, such that the uses of f in B make sense.

LMAO, what can you do when your “base theory” is hopelessly hobbled?

You might be interested in reading Jouko Väänänen about the “proper” foundations of mathematics:

http://www.math.helsinki.fi/logic/people/jouko.vaananen/jvaaSO.html

It seems the unbridled power set axiom is the root of all evil (and of the Kleene & Levy hierarchies 🙂 )

Thanks for the reference!