Date: Fri, 13 Jan 2012 23:24:23 -0500
Subject: (>>=) associativity, short-circuiting, and lazy computation

Hallo all,

First, sorry for the confusion and errors in class today on $subject.  Not
quite sure how I botched that, but I did.  In any case:

This mail contains a lot of by-hand lambda calculus in hopes of shedding
light on details in apology for said flubbing of basics in class today.
People with only limited exposure to lambda calculus reasoning are
encouraged to read the first half, but they may prefer, rather than the
second half, instead to play Alligator Eggs
http://worrydream.com/AlligatorEggs/ and save this mail for a rainy day.
People eager to understand the material in detail are encouraged to try
variants of the examples presented.

Let's consider the Maybe monad, with three functions
> foo, bar :: Int -> Maybe Int
> foo = const Nothing
> bar = Just

> test :: Maybe Int
> test = Just 5 >>= foo >>= bar >>= baz

First, we should note that >>= is declared to be "infixl 1", meaning that it
is _left_ associative and binds tightly.  So mistake number 1: the AST is in
fact left branching; indeed, as Jason rightly pointed out, attempting to
right-associate it is a type error.

We can study the behavior of this expression with some equational reasoning.
Recall that, for Maybe,
> Nothing  >>= _ = Nothing
> (Just a) >>= f = f a
which is the same as
> ma >>= f = case ma of
>                 Nothing -> Nothing
>                 Just a -> f a
which means that demanding the result of >>= is going to demand the left so
that it can case analyse it.  Thus, demanding the whole expression
> ((Just 5) >>= const Nothing) >>= Just
and thus
> (Just 5) >>= const Nothing
and thus
> Just 5
even though we are going to throw away the result in
> const Nothing 5

If you don't like equational reasoning, observe that
> leftEval = error "Yes" >>= const Nothing >>= Just
throws the exception, meaning that the first step of computation runs even
though the later steps are certain to fail.  This raises an important point
which I should have remembered in lecture: just because you might be able to
prove that the result is independent of some computation doesn't mean that
that computation never runs; the particular proof search strategy employed
by lazy reduction may take a different route.

Note, however, that the right-hand sides of >>= do not evaluate once the
computation has failed (as we expect from the definition of >>=).  While not
short-circuiting per-se, it has some superficial similarities (and is, I
think, what tripped me up):
> Just 5 >>= const Nothing >>= error "Bang" >>= Just
reduces to "Nothing" without raising an error.  It is not, however,
short-circuited: redefining >>= to be
> Nothing  >>= _ = trace "no" Nothing
> (Just a) >>= f = trace "yes" $ f a
yields the sequence of traces "yes", "no", "no" before returning Nothing.
We therefore see that the Nothing is repeatedly analysed rather than being
properly short-circuited.

Now, all of that said, if you _want_ a short-circuited Maybe monad, never
fear.  wren ng thornton, a JHU alum and past co-worker (who does indeed
prefer his name rendered in lower case), has shown us the way with
Control.Monad.MaybeK (available as part of unification-fd on Hackage; the
docs are available at
http://hackage.haskell.org/packages/archive/unification-fd/0.5.0/doc/html/Control-Monad-MaybeK.html
).  (For those in the know, this is the "continuation-passing" form of the
computation.)

It involves a higher-rank type, which we have not covered yet (sorry;
fortunately, here, it need not concern us excessively):
> newtype MaybeK a = MK (forall r. (a -> Maybe r) -> Maybe r)

The equivalent of Just is "return", and the equivalent of Nothing is
> MK (const Nothing)
(the definition of mzero for MaybeK, just as Nothing is the defintion of
mzero for Maybe; we have not yet covered the MonadPlus class.)

The eliminator of a MaybeK, (i.e. the analog of runReader and runState from
lecture today), which makes a Maybe out of it, is:
> runMaybeK :: MaybeK a -> Maybe a
> runMaybeK (MK m) = m return
Note that that "return" is the _Maybe_ monad's return: since m has type
"forall r . (a -> Maybe r) -> Maybe r", the return there must be
instantiated at "a -> Maybe a", making the whole right-hand side have type
"Maybe a" as required.

And the Monad instance is
> instance Monad MaybeK where
>    return a   = MK (\k -> k a)
>    MK m >>= f = MK (\k -> m (\a -> case f a of MK n -> n k))
This looks more promising: >>= does not analyse m and the only case analysis
happening here is to evaluate the _rest_ of the computation, f.

So, reconstructing test to use MaybeK:
> test  = ((Just 5) >>= const Nothing) >>= Just
> testK = ((return 5) >>= const (MK $ const Nothing)) >>= return
and indeed
> runMaybeK $ testK
yields Nothing as we expect.  Now, let's chew on testK for a moment:
> return 5 = MK (\k -> k 5)

> (return 5) >>= const (MK $ const Nothing)
>
> MK (\k -> k 5) >>= const (MK $ const Nothing)
>
> MK (\k -> (\k' -> k' 5) (\a -> case (const (MK $ const Nothing) a) of MK n -> n k))

> (return 5) >>= const (MK $ const Nothing) >>= return
>
> MK (\k -> (\k' -> k' 5) (\a -> case (const (MK $ const Nothing) a) of MK n -> n k))
>    >>= \x -> MK (\k -> k x)
>
> MK (\k -> (\k' -> (\k'' -> k'' 5) (\a -> case (const (MK $ const Nothing) a) of MK n -> n k'))
>           (\a -> case (\x -> MK (\k' -> k' x)) a of MK n -> n k))

OK, everybody got that?  Note the structure of the lambdas; in schematic
form, we started with
> A >>= B >>= C
and we now have
> (\k -> (\k' -> A (B k')) (C k))
or
> (\k -> A (B (C k)))
which is right-branching as desired.  If B ignores the rest of the
computation (and indeed, it does), then only A and B are evaluated and the
rest of the computation is truly short-circuited.

In detail, then, expanding defintions and reducing:
> runMaybeK testK
>
> (\k -> (\k' -> (\k'' -> k'' 5) (\a -> case (const (MK $ const Nothing) a) of MK n -> n k'))
>        (\a -> case (\x -> MK (\k' -> k' x)) a of MK n -> n k))
>   return
>
> (\k' -> (\k'' -> k'' 5) (\a -> case (const (MK $ const Nothing) a) of MK n -> n k'))
> (\a -> case (\x -> MK (\k' -> k' x)) a of MK n -> n return)
>
> (\k'' -> k'' 5) (\a -> case (const (MK $ const Nothing) a) of MK n ->
>                          n (\a -> case (\x -> MK (\k' -> k' x)) a of MK n -> n return))
>
> (\a -> case (const (MK $ const Nothing) a) of MK n ->
>          n ((\a -> case (\x -> MK (\k' -> k' x)) a of MK n -> n return))) 5
>
> case (const (MK $ const Nothing) 5) of MK n ->
>   n (\a -> case (\x -> MK (\k' -> k' x)) a of MK n -> n return)
>
> case MK $ const Nothing of MK n ->
>   n (\a -> case (\x -> MK (\k' -> k' x)) a of MK n -> n return)
>
> (const Nothing) (\a -> case (\x -> MK (\k' -> k' x)) a of MK n -> n return)
>
> Nothing

There.  As you can see (phfew!), we took the left-associative >>= and made
the computation short-circuiting:  the "\x -> MK (\k' -> k' x)" return at
the right of the computation was never considered, because, more
importantly, the bind before it (the "case ... a of MK n -> n return") was
also never considered.  Note that the computation really does work its way
from the left; leftEval rewritten to use MaybeK still throws the exception.

As another check, redefining >>= to trace:
> showmk mk = case runMaybeK mk of { Nothing -> "Nothing" ; Just _ -> "Just" }
and
>   x@(MK m) >>= f = trace (showmk x ++ " >>=") $ MK (\k -> m (\a -> case f a of MK n -> n k))
(Sorry for not mentioning @-patterns in class either; it just lets me get a
variable for the whole thing, x, at the same time as case analysing to get
m;  I'll make a slide about it when it comes up in class.)
yields "Just >>=" and "Nothing >>=" before terminating with the answer
Nothing.

Hope that helps and that there aren't more mistakes in this email.
--nwf;