The functor of infinite lists is representable by natural numbers

Recently I found this blog post about representable functors in Haskell. The example section lacks on detailed explanation on the infinite list functor. Thus, I decided to prove that infinite lists are representable by the type of natural numbers.

Basic definitions

We begin with basic definitions of natural numbers

> data Nat = Zero | Succ Nat

and infinite lists (also called streams)

> data Stream a = Cons { head :: a , tail :: Stream a }

The instance for Functor is now straightforward:

> instance Functor Stream where
>   fmap f (Cons x xs) = f x `Cons` fmap f xs

The stream of all natural numbers is defined as follows.

> nats :: Stream Nat
> nats = Zero `Cons` fmap Succ nats

The isomorphism between Stream a and Nat -> a

To prove that the stream functor is representable by natural numbers we need to establish an isomorphism between Nat -> a and Stream a, which is defined by alpha and beta as follows.

> alpha :: (Nat -> a) -> Stream a
> alpha f = fmap f nats
> beta :: Stream a -> (Nat -> a)
> beta xs = \n -> xs!!n

The function beta uses the usual index-based accessor:

> (!!) :: Stream a -> Nat -> a
> stream !! Zero     = head stream
> stream !! (Succ m) = tail stream !! m

We need to show the isomorphism law for alpha and beta:

beta . alpha = id = alpha . beta

beta . alpha = id

  (beta . alpha) f
= beta $ fmap f nats
= \x -> fmap f nats !! x 

Two functions are semantically equal if their behaviour on every input equals. Thus we prove by induction that

(\x -> fmap f nats !! x) n  

equals f n for every natural number n and an arbitrary function f :: Nat -> a.

Base case: n = Zero

  (\x -> fmap f nats !! x) Zero 
= fmap f nats !! Zero
= fmap f (Zero `Cons` fmap Succ nats) !! Zero
= (f Zero `Cons` fmap f (fmap Succ) nats) !! Zero
= f Zero

Induction step: n = Succ m

  (\x -> fmap f nats !! x) (Succ m)
= fmap f nats !! Succ m
= tail (fmap f nats) !! m     
= tail (fmap f (Zero `Cons` fmap Succ nats)) !! m
= tail (h Zero `Cons` fmap f (fmap Succ nats)) !! m
= fmap f (fmap Succ nats) !! m
= fmap (f . Succ) nats !! m     -- induction hypothesis
= (f . Succ) m
= f (Succ m)

alpha . beta = id

  (alpha . beta) xs
= alpha $ \n -> xs !! n
= fmap (\n -> xs !! n) nats

Two streams are equal if their values on every position equal. Hence, we show that

xs!!i = (fmap (\n -> xs !! n) nats) !! i

holds for every natural number i.

Base case: i = Zero

  fmap (\n -> xs !! n) nats !! Zero   
= fmap (\n -> xs !! n) (Zero `Cons` fmap Succ nats) !! Zero
= ((\n -> xs !! n) Zero `Cons` fmap (\n -> xs !! n) (fmap Succ Nats)) !! Zero
= (\n -> xs !! n) Zero
= xs !! Zero

Induction step: i = Succ j

  fmap (\n -> xs !! n) nats !! Succ j
= fmap (\n -> xs !! n) (Zero `Cons` fmap Succ nats) !! Succ j
= ((\n -> xs !! n) Zero  `Cons` fmap (\n -> xs !! n) (fmap Succ nats)) !! Succ j 
= fmap (\n -> xs !! n) (fmap Succ nats) !! j					
= fmap ((\n -> xs !! n) . Succ) nats !! j
= fmap (\n -> xs !! Succ n) nats !! j	
= fmap (\n -> tail xs !! n) nats !! j -- induction hypothesis
= tail xs !! j
= xs !! (Succ j)							 

How lazy evaluation works in Haskell (by example)

This blog post is an answer to a question asked by Pat Shaughnessy in his excellent talk about Functional Programming and Ruby. His question circa at 24:00 in the video was about how Haskell internally implements lazy evaluation. I will answer this question by explaining what internally happens, when one evaluates the expression take 3 [5..] in Haskell. To avoid confusion with the integer numbers I changed the numbers compared to the example in the talk a bit.

First of all we need to clear up what [x..] really means. Due to the documentation of typeclass Enum the term [x..] is just syntactic sugoar for enumFrom x, which is defined as follows:

> enumFrom :: Int -> [Int]
> enumFrom x = x : enumFrom (x+1)

In Haskell we reason about computation as starting with a term and applying function definitions, until we reach a term we cannot simplify anymore. The process of simplifying the term is called reduction, whereas every single step is a reduction step, which I denote with the symbol =>.

Let us try to reduce the term enumFrom 5 by applying the function definition of enumFrom:

   enumFrom 5
=> 5 : enumFrom 6
=> 5 : 6 : enumFrom 7
=> 5 : 6 : 7 : enumFrom 8
=> ...

Obviously the reduction of enumFrom 5 will not terminate. So why does take 3 (enumFrom 5) terminate? Since reduction is about applying function definitions we need to take the definition of take into account:

> take :: Int -> [a] -> [a]
> take 0 _      = []
> take n []     = []
> take n (x:xs) = x : take (n-1) xs

The function take is defined by three equations, which handle different cases of parameter values and structure.

Now let us reduce the term take 3 (enumFrom 5). In an eager language we are evaluating the function arguments before applying the function definition, that is why eager evaluation is also called innermost evaluation. Considering Haskell to be eager our reduction would look like this:

   take 3 (enumFrom 5)
=> take 3 (5 : enumFrom 6)
=> take 3 (5 : 6 : enumFrom 7)
=> take 3 (5 : 6 : 7: enumFrom 8)
=> ...

Since the reduction of enumFrom 5 diverges, the reduction of take 3 (enumFrom 5) using innermost evaluation will not terminate. So there must be a neat trick preventing Haskell to run into that infinite loop. The trick is that Haskell is a lazy language, which means it always applies the outermost reduction.

What is the outermost reduction step we can apply in the term take 3 (enumFrom 5)? We can apply the function definition of take. So, let us try to find the right pattern to match in our function definition of take. In order to to this we are trying to match each equation of the definition with the given parameters in the function call. Since we handed in the paramter 3 the first equation will not match. What is the difference between equation two and three? In equation two the list is empty whereas in equation three it is not. Thus to decide which equation of take to apply, we need to find out whether the argument list (here: enumFrom 5) is empty or not. So we take a look at it:

   enumFrom 5
=> 5 : enumFrom 6

Recall that we are just looking whether the list is empty or not, we are not trying to compute the whole resulting list.

We just discovered that our argument list is build of a value 5 and a rest list enumFrom 6. This leads us to the fact to apply the third equation of take, since the argument list is not empty. We have to match the pattern take n (x:xs) to our term take 3 (enumFrom 5). The n matches 3, x matches 5 whereas xs matches enumFrom 1. Our equation says that take n (x:xs) will be reduced to x : take (n-1) xs, thus:

   take 3 (enumFrom 5)
=> 5 : take 2 (enumFrom 6)

This yields to the following reduction of take 3 (enumFrom 5), since the definition of take is always the outermost reduction to apply:

   take 3 (enumFrom 5)
=> 5 : take 2 (enumFrom 6)
=> 5 : 6 : take 1 (enumFrom 7)
=> 5 : 6 : 7 : take 0 (enumFrom 8)

We now reached a point, when applying the definition of take terminates our reduction, because take 0 of any list reduces to the empty list []. Thus the last reduction step is:

   5 :6 : 7 : take 0 (enumFrom 8)
=> 5 : 6 : 7 : []

Note that the term 5:6:7:[] is just the list [5,6,7]. This yields us to the whole reduction of take 3 [5..]:

   take 3 (enumFrom 5)
=> 5 : take 2 (enumFrom 6)
=> 5 : 6 : take 1 (enumFrom 7)
=> 5 : 6 : 7 : take 0 (enumFrom 8)
=> 5 : 6 : 7 : []

It turned out that Haskell can cope with infinite lists by default, because of the outermost (or lazy) evaluation. Hopefully this post cleared up things a bit. Do not hesitate to comment.

Fixing git author data in retrospect

Usually if one gets started with git not everything is set up correctly and thus the repository data may be malformed, i.e. the author name or email isn’t correct. Changing this data in retrospect is possbile as the following short bash script shows:


git filter-branch --env-filter '

if [ "$GIT_COMMITTER_NAME" = "pascal_old" ]
if [ "$GIT_AUTHOR_NAME" = "pascal_old" ]

export GIT_AUTHOR_NAME=$an


M2E “Plugin execution not covered by lifecycle configuration”

If you want to use the maven plugin M2E for eclipse, one usually gets an error message of the following kind:

  Plugin execution not covered by lifecycle configuration:org.scala-tools:maven-scala-plugin:2.15.2:compile (execution: default, phase: compile)

A fast resoluton is to add a plugin execution filter to section org.eclipse.m2e. Thus my error was removed by adding the following lines to my pom.xml:

                        <ignore />

One has to add one plugin execution filter for every error message.