=====================[ Warm-up ]========================

Explain why this works:

ghci> let sq x = x*x
ghci> [1,2,3,4,6] >>= return . sq
[1,4,9,16,36]

Hint: List is a Monad, i.e., return and >>= are defined for it. See
  https://www.haskell.org/onlinereport/standard-prelude.html (look for "-- Lists")

=====================[ Desugaring ]=====================

For general desugaring of Haskell, see the very useful post
http://www.haskellforall.com/2014/10/how-to-desugar-haskell-code.html
and the more fundamental https://serokell.io/blog/haskell-to-core 

----------------[ Desugaring warm-up ]------------------

Explain how this works:

ghci> let seqq = foldr mcons (return []) where mcons p q = (p >>= (\x -> q >>= \y -> return (x:y)))
ghci> seqq [Just 1,Just 2,Just 3,Just 4]
Just [1,2,3,4]
ghci> seqq [Right 1,Right 2,Right 3,Right 4]
Right [1,2,3,4]

Explain how this works (this one is a bit harder; work through the types with :t):

% cat seqq.hs
-- Adapted from http://learnyouahaskell.com/input-and-output
---  and using https://en.wikipedia.org/wiki/Mary_Had_a_Little_Lamb

seqq = foldr mcons (return []) where mcons p q = p >>= \x -> q >>= \y -> return (x:y)

main = do
    rs <- seqq [getLine, getLine, getLine]
    print rs

% ghc seqq.hs  
[1 of 2] Compiling Main             ( seqq.hs, seqq.o )
[2 of 2] Linking seqq

% ./seqq
Mary had a little lamb,
Its fleece was white as snow
And everywhere that Mary went,

["Mary had a little lamb,","Its fleece was white as snow","And everywhere that Mary went,"]

Note the syntax of seqq:
  seqq = foldr mcons (return []) where mcons p q = (p >>= (\x -> q >>= \y -> return (x:y)))

This also works:

seqq = foldr
       (\p q -> p >>= (\x -> q >>= \y -> return (x:y)))
       (return [])

seqq is the standard library functions 'sequence', so defined in the classic Prelude
  (https://www.haskell.org/onlinereport/standard-prelude.html). Its modern definition
  is somewhat more complex but equivalent for the base cases.

--------------------[ end warm-up ]--------------------

The above links do not cover all syntactic sugar---for example, they don't
cover the "record syntax", which occurs below, but it's good to get into the
desugaring state of mind before you start looking at more monads, such as
the all-important State monad.

Speaking personally, it is essential for me to break down a language
to a lower, simpler layer to understand it. Just as with breaking down C
to assembly and Lisp and Ruby to their bytecode, I feel that Haskell
needs to be broken down to a simpler model to make sense to me. Core
is not the simplest layer, but it serves---and it's closer to lambda
calculus than any of the above.

=====================[ The State Monad ]=====================

The Real World Haskell (RWH) book leads up to this monad through a lengthy
series of examples such as a parser (in Ch. 10) and a Logger for a
regexp example from Ch. 8 (the Logger can stand on its own, as it
actually has little to do with that example; Logger is a variant of
Haskell's generic Writer). 

We chose different examples of monadic programming in class, so you
can read these chapters in a different order, skimming over these
examples (but review them later in detail, as they make important
points in favor of monadic programming).

The Maybe monad nicely illustrates one can chain operations that
depend on the results of previous operations, all of which can
fail---without peppering the code with checks for the special failure
value. That code is supplied automatically by the generic definition
of >>= , the chaining operator.  

Note that if you use the "do" syntactic-sugar notation, that operator
may not even appear in the code, but it's still there, and still
supplies the "just pass Nothing through" logic.

The Maybe monad is helpfully reviewed here:
http://book.realworldhaskell.org/read/monads.html#id641078

The IO Monad also chains its actions, but demonstrates a different
idea: separation between pure functions that don't depend on the
state of the OS (from and to which IO happens) and IO-bound
functions (which depend on the state of the world from which they
take input, or change it by writing output, or both). You can
always tell this from the type of the function.

-----------[ An aside: can this be important in practice? ]-----------

The latter may seem trivial---after all, aren't IO functions few and
obviously findable by tools like grep, let along CodeQL or Joern?---but
it really isn't. You see, the bits derived from potentially hostile
external inputs may occur far down the line of function calls that deal
with modern protocols, without a clear understanding on what assumptions
about their validity have been checked and are yet to be checked, at
any given point in the code. This leads to trouble even in the cases
of complete memory safety, e.g., https://portswigger.net/research/http2
in the modern clouds that are all garbage-collected Java and similar.
So tracking the specific dependencies on input turns out to be
super-important for security even in the absence of memory corruption.

------------------------[ end side note ]------------------------

The State monad abstracts these ideas further, to pass not just
a single result value but _any state_ (i.e., some collection of
values, separate from the result) along the chain, also providing
a type discipline to flag the functions that depend on this state. 

Depending on your programming application, you may care more for
for the former or the latter aspect of the State monad, but it
abstracts them both naturally.

---------------[ Obstacles to understanding ]---------------
 
You can start reading about the State Monad here:

http://book.realworldhaskell.org/read/monads.html#monads.state

but there are a few syntax complications that may get in the way
(well, they did get in my way as I was reading it, while
looking at the GHC's implementation). Here they are:

Unfortunately, the syntax of GHC's State monad's definition 
is loaded with complexity and syntactic sugaring. As a result,
even though the State monad is not much more more complex than 
Maybe, reading its actual definitions is hard.

You will need to read the following first to get to the bottom of it:

1) Record syntax: what it means, how it is used. 

http://stackoverflow.com/questions/21725734/haskell-record-syntax-desugared
https://www.haskell.org/onlinereport/haskell2010/haskellch3.html#x8-540003.15.3

It's important to remember that when you use the record syntax to
define a type you actually get a _function_ named as the "record
member". It acts on the objects of the type, and returns whatever type
you declare for it (which can be thought of as extracting a named
record member from a record). But it's still a function, and you may
NOT, in fact, use the same name in a different record in your
scope---because the function definitions will collide. This is very
much unlike the behavior of actual records in other languages, and
there are proposals to change Haskell to fix this. 

This part of Haskell is definitely constructed rather than discovered.
For a few hints about the history of record syntax---which is something
of a historical accident, though useful---see
http://stackoverflow.com/questions/5367167/haskell-record-syntax

2) Newtype: why is it used?  

https://wiki.haskell.org/Newtype pay special attention to the
pattern-matching of the _undefined_ value (of the "bottom" type _|_)
in the Examples (section 4) by newtype vs data.  The whole point of
newtype is to be able to match the bottom (undefined) with the 
pattern including the data constructor (like "Foo3 _" with in the y3
example). A type defined with _data_ will not be able to do it.

You can think of this from the implementation angle as follows.  When
you define types with _data_, their constructors will create an extra
wrapping data structure around their arguments (if there are several
constructors, a tagged union would be a likely C implementation). 
For _newtype_, no such wrapping occurs, and pattern-matching thus
works differently (and is able to match _undefined_ with a pattern
like "Foo _", which works like a bare "_" pattern for them). 

This becomes important if you want to use a data constructor in
pattern matches but don't want the additional wrapping---as is the
case with State. 

---------------[ State monad, built from scratch ]---------------

Having seen all that, you are ready to read about the State monad:

http://book.realworldhaskell.org/read/monads.html#monads.state

This section first builds its own SimpleState implementation without
much of the syntactic sugar. I found this implementation much easier
to understand.

Then, in "Will the real state monad please stand up?", the chapter
comes back to the actual implementation, which uses newtype to create
a pattern-match-able wrapper around the function type inside the
state, and record syntax for actually defining that function
(runState).

These naming choices complicate the matter, and seem to be driven by
stylistic preferences. Just as with the choice of name for "return"
(in fact, "place an object into the monad" rather than the traditional
procedural meaning), one might wish for different names, but it's
important to understand existing code.

Note that this constructor, State, wraps not what you'd think of
as "state" (a collection of values somehow affected by successive
operations, to be passed along a chain of such operations), but
a *function* that acts on such a collection (of any arbitrary type,
represented in the definitions by a mere type variable "s" with
no typeclass qualifications) and somehow produces a result (again,
of whatever kind, represented a mere type variable "a") and a new
state (of the *same* type as "s", this is the only real restriction.

In other words, "State" here represents a "stateful action", with
a "result" and "affected state" considered separately. The >>= and >>
operators chain these actions. This makes sense: one would want
to chain actions rather than states themselves (and what would
that mean?) 

What the chaining means is _entirely_ dependent on the implementation
of >>= for any particular type. For the Maybe monad and RWH parser and
Logger examples, as well as for the random number generator example,
it means just passing the state along, with appropriate modifications
in the latter cases; for the List monad example
(http://book.realworldhaskell.org/read/monads.html#id641620), it means
nested loops, and looks very similar to list comprehensions---which,
indeed, desugar to monads! (see http://www.haskellforall.com/2014/10/how-to-desugar-haskell-code.html)
For the IO monad, lazy reads make it even more interesting.

So you don't know what the chaining of of actions will do---on
the contrary, you get to redefine it as you like. The "programmable
semicolon" (http://book.realworldhaskell.org/read/monads.html#id642960)
section highlights this point, too.

Note also that the State type constructor has _two_ type parameters,
"s" and "a", whereas a Monad has only one. So, what is really a
Monad---as the text explains---is (State s), for any "s", with the "s" 
fixed. But notice how much more natural "returnAlt" looks compared to 
"returnSt", perhaps proving again that a partial application is a much
preferred and intuitive way to create lambdas (and named functions),
once the concept of the lambda gets internalized.

---------------[ Monads and the Continuation Passing Style ]---------------

CPS is a hard style to write imperative-style code in; the CPS
transformation is hard to do by hand. The textbook stresses this,
and the Chicken Scheme compiler you saw illustrates this rather well.

Monadic programming makes writing in CPS natural, since the 
second argument to >>= is, in fact, a continuation, which definitions 
of >>= typically call just before they are done. Look through the
definitions on >>= for various types (including State) to see this.

---------------[ Other tutorials ]---------------

I found the RWH's approach the best. Still, there are some
tutorials that make other fine point:
  
http://brandon.si/code/the-state-monad-a-tutorial-for-the-confused/

(https://acm.wustl.edu/functional/state-monad.php seems to be dead;
  let me know if you find a saved copy)

if you find any others helpful, please let me know!

---------------[ GHC code ]---------------

https://hackage.haskell.org/package/mtl-2.2.1/docs/src/Control-Monad-State-Class.html#MonadState

https://hackage.haskell.org/package/transformers-0.5.2.0/docs/src/Control.Monad.Trans.State.Lazy.html#State

https://hackage.haskell.org/package/base-4.9.0.0/docs/src/Data.Functor.Identity.html#Identity

If you wonder if this is still somehow theoretically clean, the answer seems
to be "not really": http://www.cse.chalmers.se/~nicsma/no-state-monad.html