=====================[ Warm-up ]======================== Explain why this works: ghci> let sq x = x*x ghci> [1,2,3,4,6] >>= return . sq [1,4,9,16,36] Hint: List is a Monad, i.e., return and >>= are defined for it. See https://www.haskell.org/onlinereport/standard-prelude.html (look for "-- Lists") =====================[ Desugaring ]===================== For general desugaring of Haskell, see the very useful post http://www.haskellforall.com/2014/10/how-to-desugar-haskell-code.html and the more fundamental https://serokell.io/blog/haskell-to-core ----------------[ Desugaring warm-up ]------------------ Explain how this works: ghci> let seqq = foldr mcons (return []) where mcons p q = (p >>= (\x -> q >>= \y -> return (x:y))) ghci> seqq [Just 1,Just 2,Just 3,Just 4] Just [1,2,3,4] ghci> seqq [Right 1,Right 2,Right 3,Right 4] Right [1,2,3,4] Explain how this works (this one is a bit harder; work through the types with :t): % cat seqq.hs -- Adapted from http://learnyouahaskell.com/input-and-output --- and using https://en.wikipedia.org/wiki/Mary_Had_a_Little_Lamb seqq = foldr mcons (return []) where mcons p q = p >>= \x -> q >>= \y -> return (x:y) main = do rs <- seqq [getLine, getLine, getLine] print rs % ghc seqq.hs [1 of 2] Compiling Main ( seqq.hs, seqq.o ) [2 of 2] Linking seqq % ./seqq Mary had a little lamb, Its fleece was white as snow And everywhere that Mary went, ["Mary had a little lamb,","Its fleece was white as snow","And everywhere that Mary went,"] Note the syntax of seqq: seqq = foldr mcons (return []) where mcons p q = (p >>= (\x -> q >>= \y -> return (x:y))) This also works: seqq = foldr (\p q -> p >>= (\x -> q >>= \y -> return (x:y))) (return []) seqq is the standard library functions 'sequence', so defined in the classic Prelude (https://www.haskell.org/onlinereport/standard-prelude.html). Its modern definition is somewhat more complex but equivalent for the base cases. --------------------[ end warm-up ]-------------------- The above links do not cover all syntactic sugar---for example, they don't cover the "record syntax", which occurs below, but it's good to get into the desugaring state of mind before you start looking at more monads, such as the all-important State monad. Speaking personally, it is essential for me to break down a language to a lower, simpler layer to understand it. Just as with breaking down C to assembly and Lisp and Ruby to their bytecode, I feel that Haskell needs to be broken down to a simpler model to make sense to me. Core is not the simplest layer, but it serves---and it's closer to lambda calculus than any of the above. =====================[ The State Monad ]===================== The Real World Haskell (RWH) book leads up to this monad through a lengthy series of examples such as a parser (in Ch. 10) and a Logger for a regexp example from Ch. 8 (the Logger can stand on its own, as it actually has little to do with that example; Logger is a variant of Haskell's generic Writer). We chose different examples of monadic programming in class, so you can read these chapters in a different order, skimming over these examples (but review them later in detail, as they make important points in favor of monadic programming). The Maybe monad nicely illustrates one can chain operations that depend on the results of previous operations, all of which can fail---without peppering the code with checks for the special failure value. That code is supplied automatically by the generic definition of >>= , the chaining operator. Note that if you use the "do" syntactic-sugar notation, that operator may not even appear in the code, but it's still there, and still supplies the "just pass Nothing through" logic. The Maybe monad is helpfully reviewed here: http://book.realworldhaskell.org/read/monads.html#id641078 The IO Monad also chains its actions, but demonstrates a different idea: separation between pure functions that don't depend on the state of the OS (from and to which IO happens) and IO-bound functions (which depend on the state of the world from which they take input, or change it by writing output, or both). You can always tell this from the type of the function. -----------[ An aside: can this be important in practice? ]----------- The latter may seem trivial---after all, aren't IO functions few and obviously findable by tools like grep, let along CodeQL or Joern?---but it really isn't. You see, the bits derived from potentially hostile external inputs may occur far down the line of function calls that deal with modern protocols, without a clear understanding on what assumptions about their validity have been checked and are yet to be checked, at any given point in the code. This leads to trouble even in the cases of complete memory safety, e.g., https://portswigger.net/research/http2 in the modern clouds that are all garbage-collected Java and similar. So tracking the specific dependencies on input turns out to be super-important for security even in the absence of memory corruption. ------------------------[ end side note ]------------------------ The State monad abstracts these ideas further, to pass not just a single result value but _any state_ (i.e., some collection of values, separate from the result) along the chain, also providing a type discipline to flag the functions that depend on this state. Depending on your programming application, you may care more for for the former or the latter aspect of the State monad, but it abstracts them both naturally. ---------------[ Obstacles to understanding ]--------------- You can start reading about the State Monad here: http://book.realworldhaskell.org/read/monads.html#monads.state but there are a few syntax complications that may get in the way (well, they did get in my way as I was reading it, while looking at the GHC's implementation). Here they are: Unfortunately, the syntax of GHC's State monad's definition is loaded with complexity and syntactic sugaring. As a result, even though the State monad is not much more more complex than Maybe, reading its actual definitions is hard. You will need to read the following first to get to the bottom of it: 1) Record syntax: what it means, how it is used. http://stackoverflow.com/questions/21725734/haskell-record-syntax-desugared https://www.haskell.org/onlinereport/haskell2010/haskellch3.html#x8-540003.15.3 It's important to remember that when you use the record syntax to define a type you actually get a _function_ named as the "record member". It acts on the objects of the type, and returns whatever type you declare for it (which can be thought of as extracting a named record member from a record). But it's still a function, and you may NOT, in fact, use the same name in a different record in your scope---because the function definitions will collide. This is very much unlike the behavior of actual records in other languages, and there are proposals to change Haskell to fix this. This part of Haskell is definitely constructed rather than discovered. For a few hints about the history of record syntax---which is something of a historical accident, though useful---see http://stackoverflow.com/questions/5367167/haskell-record-syntax 2) Newtype: why is it used? https://wiki.haskell.org/Newtype pay special attention to the pattern-matching of the _undefined_ value (of the "bottom" type _|_) in the Examples (section 4) by newtype vs data. The whole point of newtype is to be able to match the bottom (undefined) with the pattern including the data constructor (like "Foo3 _" with in the y3 example). A type defined with _data_ will not be able to do it. You can think of this from the implementation angle as follows. When you define types with _data_, their constructors will create an extra wrapping data structure around their arguments (if there are several constructors, a tagged union would be a likely C implementation). For _newtype_, no such wrapping occurs, and pattern-matching thus works differently (and is able to match _undefined_ with a pattern like "Foo _", which works like a bare "_" pattern for them). This becomes important if you want to use a data constructor in pattern matches but don't want the additional wrapping---as is the case with State. ---------------[ State monad, built from scratch ]--------------- Having seen all that, you are ready to read about the State monad: http://book.realworldhaskell.org/read/monads.html#monads.state This section first builds its own SimpleState implementation without much of the syntactic sugar. I found this implementation much easier to understand. Then, in "Will the real state monad please stand up?", the chapter comes back to the actual implementation, which uses newtype to create a pattern-match-able wrapper around the function type inside the state, and record syntax for actually defining that function (runState). These naming choices complicate the matter, and seem to be driven by stylistic preferences. Just as with the choice of name for "return" (in fact, "place an object into the monad" rather than the traditional procedural meaning), one might wish for different names, but it's important to understand existing code. Note that this constructor, State, wraps not what you'd think of as "state" (a collection of values somehow affected by successive operations, to be passed along a chain of such operations), but a *function* that acts on such a collection (of any arbitrary type, represented in the definitions by a mere type variable "s" with no typeclass qualifications) and somehow produces a result (again, of whatever kind, represented a mere type variable "a") and a new state (of the *same* type as "s", this is the only real restriction. In other words, "State" here represents a "stateful action", with a "result" and "affected state" considered separately. The >>= and >> operators chain these actions. This makes sense: one would want to chain actions rather than states themselves (and what would that mean?) What the chaining means is _entirely_ dependent on the implementation of >>= for any particular type. For the Maybe monad and RWH parser and Logger examples, as well as for the random number generator example, it means just passing the state along, with appropriate modifications in the latter cases; for the List monad example (http://book.realworldhaskell.org/read/monads.html#id641620), it means nested loops, and looks very similar to list comprehensions---which, indeed, desugar to monads! (see http://www.haskellforall.com/2014/10/how-to-desugar-haskell-code.html) For the IO monad, lazy reads make it even more interesting. So you don't know what the chaining of of actions will do---on the contrary, you get to redefine it as you like. The "programmable semicolon" (http://book.realworldhaskell.org/read/monads.html#id642960) section highlights this point, too. Note also that the State type constructor has _two_ type parameters, "s" and "a", whereas a Monad has only one. So, what is really a Monad---as the text explains---is (State s), for any "s", with the "s" fixed. But notice how much more natural "returnAlt" looks compared to "returnSt", perhaps proving again that a partial application is a much preferred and intuitive way to create lambdas (and named functions), once the concept of the lambda gets internalized. ---------------[ Monads and the Continuation Passing Style ]--------------- CPS is a hard style to write imperative-style code in; the CPS transformation is hard to do by hand. The textbook stresses this, and the Chicken Scheme compiler you saw illustrates this rather well. Monadic programming makes writing in CPS natural, since the second argument to >>= is, in fact, a continuation, which definitions of >>= typically call just before they are done. Look through the definitions on >>= for various types (including State) to see this. ---------------[ Other tutorials ]--------------- I found the RWH's approach the best. Still, there are some tutorials that make other fine point: http://brandon.si/code/the-state-monad-a-tutorial-for-the-confused/ (https://acm.wustl.edu/functional/state-monad.php seems to be dead; let me know if you find a saved copy) if you find any others helpful, please let me know! ---------------[ GHC code ]--------------- https://hackage.haskell.org/package/mtl-2.2.1/docs/src/Control-Monad-State-Class.html#MonadState https://hackage.haskell.org/package/transformers-0.5.2.0/docs/src/Control.Monad.Trans.State.Lazy.html#State https://hackage.haskell.org/package/base-4.9.0.0/docs/src/Data.Functor.Identity.html#Identity If you wonder if this is still somehow theoretically clean, the answer seems to be "not really": http://www.cse.chalmers.se/~nicsma/no-state-monad.html