I recently watched a pretty inspiring talk by Avi Press about his experience using Haskell at his startup, Scarf. I enjoyed the whole talk, but something that particularly got me excited was his point that although Haskell is pure, in practice, almost all the code his company was writing ended up being monadic. One simple reason was the need for logging through the codebase. And this really struck a chord with me, because in my experience as a maybe late beginner Haskeller (I would love to discover I am early intermediate, but I doubt it), one of the most frustrating things has been how difficult it can be to debug some pretty large piece of code. Yes, because everything is pure, I could just call functions in ghci, for example, to see if I am getting what I expect. But creating the environment for such calls is a problem — and a well-known one at that, in the world of testing. And unless you want to put every function that you might need logging for into the IO monad (or ok, maybe some more particular monad), you are not going to have access to this very standard debugging practice of printing messages from functions to see what is going on. This is a pretty huge problem for software engineering with Haskell, and it is surprising that there is relatively little said about it.  

So Haskell’s promise of purity is getting badly undercut by this simple need to log. It is great that the language is pure, but if you have to use the IO monad everywhere for logging, then you are greatly reducing the number of pure functions in your codebase, and that is sad.  

From a research perspective, I am working on DCS, which has an even stricter core language than Haskell’s: pure and also uniformly terminating (all functions are statically checked for termination at compile-time using a novel type-based approach). And there, I am also concerned that if it is too hard to fit one’s functions in that core, then again, DCS programs will get overtaken by monadic code (in a Div monad for possibly diverging computations).

Now in both DCS and Haskell, a pretty simple generalization of the notion of purity would seem sufficient to solve this problem. It should be fine to let a program perform computation in the IO monad, as long as pure code is not allowed to use the value produced by such a computation. So I imagine having a safePerformIO function, like this (in Haskell):

import System.IO.Unsafe (unsafePerformIO)

safePerformIO :: IO () -> a -> a
safePerformIO m x = unsafePerformIO m `seq` x

The idea is that we use unsafePerformIO under the hood, and seq to make sure the IO actually happens. But a pure computation x cannot possibly depend on what the IO does, since we just use a computation of type IO (), which does not produce a value within the language. It could happen that the IO computation fails somehow, for example by raising an exception. But even pure code in Haskell can do that, so it does not seem to compromise the purity of pure functions. In DCS, the situation would be different, and one would need to have a library of IO functions that cannot fail (or diverge) to use this idea for DCS code. Otherwise, the core language would not be able to guarantee that functions in the core terminate (because some code used with safePerformIO could fail with an exception or diverge).

For a very simple approach to logging (no logging levels, for example), we can have some code like this:

import System.IO

logInit :: FilePath -> IO Handle
logInit f = do
  h <- openFile f WriteMode
  hSetBuffering h NoBuffering
  return h

-- to call this unqualified, also import Prelude hiding (log)
log :: Handle -> String -> a -> a
log h msg x = safePerformIO (hPutStr h msg) x

We have a function to get a handle for logging, and then another that uses safePerformIO to write a message to the log.
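
For example, here is a small usage sketch (fib and the file name debug.log are mine, just for illustration; as the comment above notes, one also needs import Prelude hiding (log) to call log unqualified):

-- a hypothetical pure function that logs as it runs
fib :: Handle -> Integer -> Integer
fib h n
  | n < 2     = n
  | otherwise = log h ("fib called on " ++ show n ++ "\n") $
                fib h (n - 1) + fib h (n - 2)

main :: IO ()
main = do
  h <- logInit "debug.log"   -- example file name
  print (fib h 20)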

I am planning to use this approach to logging in my Haskell codebase for DCS, because it is just so useful to be able to log some information while trying to debug a complex piece of code. And I do not want to put everything into a monad just to do that.

Thanks for reading! Drop me a line at aaron.stump@gmail.com or comment here, if you have feedback or thoughts.

Strong functional programming was proposed by David Turner, a founding father of modern functional programming (FP), as a better version of the functional programming paradigm realized today in Haskell. Where Haskell insists on purity, strong FP further insists that all functions terminate on all inputs. So strong FP goes beyond pure FP, to guarantee an even stronger property than purity, namely termination. In this post I will suggest a practical approach to Turner’s vision, based on monads.

Purity for a programming language means that there are no implicit side effects: whenever a function is called with the same arguments, it must return the same result. By an amazingly clever trick, we can represent impure functions like getTimeOfDay() — this function is so impure that it is a bug if it ever behaves purely — in the pure language Haskell. We view such functions as implicitly taking an extra input representing the entire state of the computer, and implicitly producing an updated version of that state as an extra output. And then we hide this fiction by using a monad, called IO in Haskell, so you never actually have access to the state. You just have various primitives like getChar :: IO Char. This computation produces a character, but getting at that character requires you to enter the IO monad, from which there is no way to escape. So in practice, Haskell programs have an impure layer within the IO monad, and then invoke various pure functions that are not in the IO monad. This works because pure code can be injected into the IO monad using the monad's return function (which does exactly that: bring pure values into IO).
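
As a tiny illustration of that layering (the function shout and the strings here are mine, purely for the example):

-- a pure function: same argument, same result, no side effects
shout :: String -> String
shout s = s ++ "!"

-- the impure layer: IO actions sequenced in the IO monad
main :: IO ()
main = do
  c <- getChar                 -- a primitive like getChar :: IO Char
  s <- return (shout [c])      -- return injects the pure result into IO
  putStrLn s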

Now, in Turner’s lovely paper linked above, which I quite recommend, there are arguments for the value of strong FP. But Turner does not explain how to make this practical, and maybe is not even thinking about that, since he worries quite a bit about the fact that the restriction to only terminating programs means that a strong FP language will not be Turing complete. But I would like to point out that for practicality, we need to take the same path that Haskell took for purity. Haskell is pure, but uses monads to allow impure computations. What we need for a practical strong FP language is a language that is pure and enforces termination, but can also accommodate — and here I definitely propose monads as the way to go — impure and possibly non-terminating computation.

So I imagine a world where there is a monad, maybe called GenRec, for possibly nonterminating computation, and code from the terminating part of the language can be injected into that monad using its return function. And GenRec computations could be lifted into IO, too. So we can play the same game that Haskell does, but with two special monads (IO and GenRec) instead of one (just IO).
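
To make the shape of this concrete, here is a minimal sketch (my own illustration, not a worked-out design) of what such a GenRec monad could look like in Haskell, modeled on the well-known partiality ("delay") monad: return (pure) is Now, which is how terminating code gets injected, and a simple runner lifts GenRec into IO:

-- a possibly nonterminating computation: either a value now,
-- or one more step of delay
data GenRec a = Now a | Later (GenRec a)

instance Functor GenRec where
  fmap f (Now a)   = Now (f a)
  fmap f (Later d) = Later (fmap f d)

instance Applicative GenRec where
  pure = Now
  Now f   <*> d = fmap f d
  Later f <*> d = Later (f <*> d)

instance Monad GenRec where
  Now a   >>= k = k a
  Later d >>= k = Later (d >>= k)

-- lifting a GenRec computation into IO by running it step by step
-- (this can of course loop forever, which is the point)
runGenRec :: GenRec a -> IO a
runGenRec (Now a)   = return a
runGenRec (Later d) = runGenRec d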

And while we are thinking this way, about teasing apart the structure of the language using monads, I am also wondering about a monad for possibly type-unsound computation. In OCaml, for example, there is the somewhat infamous Obj.magic, which lets you change the type of an expression to anything you want. This could lead to segmentation faults and other disasters. And yet it is there, and apparently is useful in some cases, for example when compiling languages with a more powerful type system, like Coq, where it tells OCaml’s compiler to just trust us that something is ok.

So what about an ultimate monad called something like Unrestricted (name too long, ok), that allows unsound casts, general recursion, and IO?
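
Purely as a hypothetical sketch (the names Unrestricted and unsoundCast are mine, not an existing API), in Haskell one could imagine wrapping IO and exposing an Obj.magic-style cast via unsafeCoerce:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Unsafe.Coerce (unsafeCoerce)

-- hypothetical "anything goes" layer: IO, general recursion,
-- and unsound casts all live here
newtype Unrestricted a = Unrestricted { runUnrestricted :: IO a }
  deriving (Functor, Applicative, Monad)

-- an unsound cast, in the spirit of OCaml's Obj.magic
unsoundCast :: a -> Unrestricted b
unsoundCast = Unrestricted . return . unsafeCoerce

Nothing here enforces anything, of course; the point is only that the unsound operations are quarantined behind a type.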

Now your language is accommodating all kinds of programs, and providing a range of guarantees. This is just taking the core awesome idea of Haskell, and expanding it both towards more safety (i.e., the terminating part of the language) and towards less (i.e., unsound casts).

A final parting shot on this: with layers of monads, one thinks immediately of monad transformers, and then maybe thinks the idea will be too complicated. But I am currently exploring a language design, which I intend to head in the direction I have just outlined, where we use subtyping to keep some coercions implicit. This could mean that the super-annoying liftings that are needed with monad transformers could be elided completely using subtyping. My language design is already committed to subtyping for one of its central features (a type-based form of terminating recursion), and so investigating subtyping as a way to make programming with layers of monads more workable seems very natural.

Thanks for reading!

Everyone hates to define the predecessor function on Church-encoded natural numbers, right? The Wikipedia page has a couple of painful options, including Kleene’s solution. Now, I don’t know how original this is, but here is a simpler one. The code below typechecks in Haskell (with ghc), using the language extensions TypeApplications, ExplicitForAll, and ImpredicativeTypes. (Thanks to ghc for all those goodies!) Also, one needs to hide Maybe, succ, and pred from the Prelude. Full source code with all this is here.

The idea is to define pred to have type

Nat -> Maybe Nat

where these types are defined (in Haskell) like this:


type Nat = forall (x :: *) . (x -> x) -> x -> x

type Maybe a = forall (x :: *) . (a -> x) -> x -> x

They are structurally pretty similar anyway, right? Well, assuming zero, succ, just, and nothing (definitions reproduced just below), we can define pred this way, where the @ is for explicit type application in ghc:

pred :: Nat -> Maybe Nat
pred n = 
  n @(Maybe Nat) (\ m -> just $ m succ zero) nothing

We start out with nothing, and then for each iteration (n is a Church Nat, so it iterates its first argument n times starting with its second), we apply the Maybe Nat we are given, to succ and zero. If that Maybe Nat is nothing, then this becomes zero. If that Maybe Nat is just x for some x, then applying to succ and zero gives us succ x. We then wrap this in just, to preserve the Maybe Nat type.
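
As a worked example (my own informal reduction), take n to be the Church numeral 2, which iterates its first argument twice, and write f for \ m -> just $ m succ zero:

pred (succ (succ zero))
  = f (f nothing)                 -- a Church 2 iterates f twice, starting from nothing
  = f (just (nothing succ zero))
  = f (just zero)                 -- nothing j n = n, so nothing succ zero is zero
  = just (just zero succ zero)
  = just (succ zero)              -- just zero j n = j zero, so just zero succ zero is succ zero

which is just 1, as expected.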

Full source includes a function showPred for testing. Helper functions are:

zero :: Nat
zero s z = z

succ :: Nat -> Nat
succ n s z = s (n s z)

just :: a -> Maybe a
just a j n = j a

nothing :: Maybe a
nothing j n = n
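
For testing in ghci, something along these lines works (a sketch with names of my own choosing; not necessarily the showPred from the full source):

toInt :: Nat -> Int
toInt n = n @Int (+ 1) 0

showMaybeNat :: Maybe Nat -> String
showMaybeNat m = m @String (\ k -> "just " ++ show (toInt k)) "nothing"

showPred :: Nat -> String
showPred n = showMaybeNat (pred n)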

This is the easiest version of predecessor I have seen.

The omega-rule of lambda calculus is the principle that if for all closed terms t we have t1 t beta-eta-equal to t2 t, then t1 is beta-eta-equal to t2. This principle is not valid for all t1 and t2, due to the existence of so-called universal generators, which Gordon Plotkin used in a 1974 paper to construct a counter-example to the omega-rule.
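
Stated as an inference rule (just my LaTeX rendering of the principle as described above, nothing more):

\[
\frac{\forall t\ \text{closed}:\ t_1\, t \,=_{\beta\eta}\, t_2\, t}{t_1 \,=_{\beta\eta}\, t_2}
\]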

Working on Relational Type Theory (RelTT), I found I would really like a form of the rule, though, which happily is derivable. Suppose that t1 and t2 are closed normalizing terms, and that t1 t =beta t2 t for all closed t. Then indeed t1 =beta t2. The proof is easy. By the assumption about t1 and t2, we have that they are beta-equal to some terms λx.t1′ and λx.t2′, respectively, with t1′ and t2′ normal. So we have that for any closed t, [t/x]t1′ =beta [t/x]t2′. Instantiate this with Omega, which is (λx. x x) (λx. x x), and write c1 for [Omega/x]t1′, and c2 for [Omega/x]t2′. Suppose that x is not free in t1′. Then we have t1′ =beta c2. By the Church-Rosser property of untyped lambda calculus, this implies that c2 beta-reduces to t1′, since t1′ is normal. But this can only happen if x is not free in t2′ and t1′ is identical to t2′. Otherwise, c2 would lack any normal form.

Now suppose that x is free in both t1′ and t2′ (the case where x is not free in t2′ is symmetric to the one above). Again appealing to Church-Rosser, c1 and c2 must both beta-reduce to some common term q (since they are beta-equal). But the only term c1 reduces to is itself, since t1′ is normal and the only redexes in c1 are inside the copies of Omega, which just reduce to themselves; similarly for c2. So c1 and c2 must be identical. By a simple induction on t1′, the fact that c1 is identical to c2 must imply that t1′ is identical to t2′. The crucial point is that in the case where t1′ is x, then t2′ must also be x, because otherwise t2′ would have to be Omega, but this is ruled out because t2′ is normal. This is why the restriction to normalizing t1 and t2 is critical: it allows us to choose Omega as a term that behaves like a fresh variable when substituted into t1′ and t2′. It cannot equal any term that is already in t1′ and t2′. This is a hint of why universal generators — which as near as I can understand are diverging terms that can reach a term containing t for any term t — can be a counterexample to the omega-rule: we cannot find any term that we can safely use as a stand-in for a variable, because all terms are generated.

I wanted to record this proof here, because the proof in Barendregt is quite involved, running for page after difficult page. It covers a much more general (and apparently much harder) situation, where t1 and t2 need not be normalizing. But for many applications in type theory, we have normalizing terms, and there this simple justification for the omega rule can be used.

Well, when you are in one, you cannot get out — and you cannot even tell there is any out to get to. Take our physical universe, for example. We cannot leave it (physically, certainly), of course. And we cannot even perceive there is an end to it (as space is expanding faster than the speed of light). So, it just is all there is, as far as we can physically observe. Amazingly, it is through reason that we can deduce that there is something beyond space, not through observation (although personally I think the big bang is pretty strong observational evidence for extra-universal reality).

In mathematics, we can think of sets closed under operations as universes. Take the set of natural numbers. There is no way to get out of this set just using basic operations like zero, successor, arithmetic operations, etc. Or take a bigger ordinal, like epsilon-0 (which is the first fixed-point of the ordinal exponentiation function raising omega to a power). Within epsilon-0 (and ordinals are sets, so we can imagine being in it), you can add ordinals and exponentiate omega by them, and you will not leave the universe of epsilon-0. One needs disruptive technology — like forming the set of all unary functions on the naturals, or taking the fixed-point of a continuous function on the ordinals — to blow out of the universe.

Being in a universe means it is hard to imagine you are missing anything. You are like a fish in the ocean. Land, what’s land? Everything is just water. Keep your tall tales!

Now, I raise all this because I have recently had occasion to program comparatively in Cedille (the proof assistant my group and I have developed over the past few years), and Coq. Coq is an amazing tool, with many extraordinary accomplishments and results to its name. Cedille is just a baby tool in comparison. But Cedille’s type theory provides something that to me is closer to a universe. There are natural things you can do, and you just do them uninhibitedly. Ok, it is true that you cannot form negative datatypes. That is a hitch, and it makes it feel quite a bit less like a universe. In fact, I am working on getting rid of that restriction in Relational Type Theory (a new system I am working on). But in Coq, there are just so many more pointy edges. Make sure that datatype is strictly positive, not just positive! Be careful you do not perform large eliminations over large-indexed types! (Huh?) The latter caused me a bunch of headaches recently. Don’t recurse on anything unless it is a structural subexpression of a pattern. Picky, picky. Just check out the error list and see if it feels a bit spiny.

But, you will say (putting on your defender-of-Coq hat): those restrictions are all there to prevent things like logical inconsistencies that can arise without them. To which I respond: just the same way traffic lights and stop signs are there to prevent accidents. We need those. But there are no one-way streets in the ocean. You just swim. There are currents, ok. But nothing stops you.

My point is not to put down any tool, especially one as awesome as Coq. My point is to draw attention to a goal I think we should have as type-theory designers and implementers. The type theory must of necessity have some limitations. This is fundamental; Goedel’s incompleteness theorem is one reason why. But working within the theory, it should be hard to imagine its limitations. It should feel like you can do anything. You can use the tools the theory provides in any way you want. This is not true in Coq (or Cedille either, given the positivity restriction for datatypes). It feels more like you are a citizen in a modern democracy. You can do a lot of things. But not just anything. I can take a walk. That is ok. I cannot take a walk through my neighbor’s house uninvited. That is not allowed. There are, for very good reasons, loads and loads of rules and restrictions on what you can do. In the human world, we need rules and laws. But in the formal world, we would love to have total freedom.

So let’s strive to design type theories that feel like universes. If we do not, we will suffer the fate we deserve: to mistake our tangle of ordinances for the ocean.