Monadic programming

Introduction

Monadic programming is about computation on values of the form 'a monad which support at the minimum two functions:

val return : 'a -> 'a monad
val bind : 'a monad -> ('a -> 'b monad) -> 'b monad

This kind of computation is heavily used in the Haskell language, but OCaml supports some extensions designed to make the monads use easier. The use of monads is required in some environment (lwt based applications), and be extremly convenient to deal with error handling (see option and result).

Some monadic types may also have some added functions : both and map and sometimes many more.

Many monad can be defined. We will present in this page the following monads:

Note: this page provides multiple OCaml interactions using utop. The Jane street's Base or Core library is used. This library has some slightly different conventions than Stdlib.

The list monad

In order to use the list monad, we have to open the Base or Core module from Jane Street.

The first monadic function is the List.return which returns a list with the single element given as an argument to the function:

# #require "base";;
# open Base;;
# List.return 42;;
- : int list = [42]

The second monadic function is the List.bind. It takes a list and a function, calls the function for each item of the list and returns the concatenation of all the lists returned by the function. It is a synonymous of List.concat_map.

# List.bind [1;2;3] ~f:(fun x -> [x; x*10; x*100]) ;;
- : int list =  [1; 10; 100; 2; 20; 200; 3; 30; 300]

bind can also be written (>>=):

# List.([1;2;3] >>= (fun x -> [x; x*10; x*100]));;
- : int list = [1; 10; 100; 2; 20; 200; 3; 30; 300]

This form makes it easier to chain multiple transformations of the list items.

We can also define a let* operator with the same semantic as bind but a different syntax.

# let (let*) l f = List.bind l ~f ;;
val ( let* ) : 'a list -> ('a -> 'b list) -> 'b list = <fun>
# let* x = [1; 2; 3] in [x; x*10; x*100] ;;
- : int list = [1; 10; 100; 2; 20; 200; 3; 30; 300]

If we want to process two lists, we can use the both function which returns the list of all pairs made by taking an item from the first list and an item from the second list.

# List.(Let_syntax.Let_syntax.both [1; 2; 3] [1; 10; 100] >>= fun (a,b) -> return (a*b));;
- : int list = [1; 10; 100; 2; 20; 200; 3; 30; 300]

Like with (let*), we can have an advantage to define a (and*) token (note: we can use two nested let* to provide the and* behavior):

# let (let*) l f = List.bind l ~f ;;
val ( let* ) : 'a list -> ('a -> 'b list) -> 'b list = <fun>
# let (and*) l f = List.Let_syntax.Let_syntax.both ;;
val ( and* ) : 'a list -> 'b list -> ('a * 'b) list = <fun>
# let* a=[1; 2; 3] and* b=[1;10;100] in List.return (a*b);;
- : int list = [1; 10; 100; 2; 20; 200; 3; 30; 300]
# let* a=[1; 2; 3] in let* b=[1;10;100] in List.return (a*b);;
- : int list = [1; 10; 100; 2; 20; 200; 3; 30; 300]

We can see that the List.map function can be created with bind and return, but the (>>|) operator as a List.map synonymous is also available:

# List.([1; 2; 3] >>= fun x -> return (10*x)) ;;
- : int list = [10; 20; 30]
# List.(let* x=[1; 2; 3] in return (10*x)) ;;
- : int list = [10; 20; 30]
# List.([1; 2; 3] >>| fun x -> 10*x) ;;
- : int list = [10; 20; 30]

Note: we can open List.Monad_infix which brings (>>=) and (>>|).

The option monad

The option monad is similar with the list monad:

# #require "base";;
# open Base;;
# Option.return 42;;
- : int option = Base.Option.Some 42
# Option.(Some 42 >>= fun x -> return (x*2));;
- : int option = Base.Option.Some 84
# Option.(Some 42 >>| fun x -> x*2);;
- : int option = Base.Option.Some 84

But it can be handy because there are many functions which return an option value and we want to get its embedded x value if it matchs Some x.

Then we can imagine:

# let (let*?) opt f = Option.bind opt ~f ;;
# let*? a = Hashtbl.find my_hash my_key in
  let*? b = Hashtbl.find my_hash my_key' in
  Option.return (a*b) ;;

Here, without having to extract explicitly embedded values from the results of both Hashtbl.find functions, we can use them. And the bind operator makes the result equal to None if any of the function returns None.

The result monad

The result monad has the same behavior as the option monad. The differences are that Some is replaced by Ok and the None value is replaced by an Error err value which carry the error cause.

Using this monad is interesting when using multiple functions which return a result type. This monad dispense to check each return and automatically abort a call sequence when an Error is returned.

The function Option.to_result (Stdlib) or Result.of_option (Base or Core) can also be used to convert an option result to option. Then the following program can be used.

# let (let**) opt f = Result.bind opt ~f ;;
# let** a = Result.of_option ~error:"first key not found"
    (Hashtbl.find my_hash my_key) in
  let** b = Result.of_option ~error:"second key not found"
    (Hashtbl.find my_hash my_key') in
  Result.return (a*b) ;;

This convertion is handy to tag an error with some information. It may be required to mix functions which returns option and result.

The lwt monad

Values of type 'a Lwt.t represent computations which can take some time (especially when input/output interactions are involved) and returns a result of type 'a. Like the list monad from the Jane Street Base or Core module, we have the return, map, bind, and both functions. We have also many functions which return a such computations.

The idea is to build a computation which can involve sequences and concurrent sets of atomic computations and let a scheduler executes all of them. (The lwt scheduler can't create threads, the parallelism is limited to IO queries).

The smaller example of a computation is Lwt.return:

# #require "lwt";;
# #require "lwt.unix";;
# Lwt_main.run @@ Lwt.return 42;;
- : int = 42

Note: When utop detects a Lwt.t type, it schedules the corresponding computation with Lwt_lwt.run. Then in a utop session, we could also simply type Lwt.return 42. If this behavior can be handy, it can be misleading since the actual type of Lwt.return 42 would not be printed.

We can also chain multiple lwt computations with the bind operator (here, reading a line, then printing it):

# Lwt.( Lwt_main.run
    (Lwt_io.read_line Lwt_io.stdin >>= Lwt_io.write Lwt_io.stdout));;
let's try
let's try- : unit = ()

This can be rewritten:

# let (let*) = Lwt.bind;;
val ( let* ) : 'a Lwt.t -> ('a -> 'b Lwt.t) -> 'b Lwt.t = <fun>
# Lwt_main.run (let* line = Lwt_io.read_line Lwt_io.stdin in
                Lwt_io.write Lwt_io.stdout line);;
let's try
let's try- : unit = ()

However, lwt provides some extensions with the lwt_ppx preprocessor and provides a similar syntax:

Lwt_main.run (let%lwt line = Lwt_io.read_line Lwt_io.stdin in
              Lwt_io.write Lwt_io.stdout line)

The principle is the same with the (let*) operator, but ppx extensions are more powerful, and here, lwt_ppx provides also try%lwt, for%lwt, while%lwt, assert%lwt, match%lwt, if%lwt, and [%lwt raise ...] which are shortcuts of let%lwt followed by some OCaml statements. The usage of the let%lwt syntax is also preferred since it provides a better behavior with respect to exception backtraces.

Note: The lwt_ppx preprocessor can't be loaded by utop. We have to compile the program with the following dune stanzas:

(libraries lwt lwt.unix)
(preprocess (pps lwt_ppx))

The lwt library provides also a Lwt.both function wich permits the concurrent execution of two computations. The scheduling of the two virtual threads depends of explicit (Lwt.pause ()) or implicit (IO waits) pauses. The following program shows how the lwt scheduler can interleave the execution of two threads:

Lwt_main.run @@ Lwt.both
  begin
     let%lwt () = Lwt_io.write Lwt_io.stdout "line #1\n"
     and () = Lwt.pause () in
     Lwt_io.write Lwt_io.stdout "line #2\n"
  end
  begin
     let%lwt () = Lwt_io.write Lwt_io.stdout "line #3\n"
     and () = Lwt.pause () in
     Lwt_io.write Lwt_io.stdout "line #4\n"
  end

A custom monad: the state monad

It can be possible to create a new monad if the monads provided by the library doesn't fit a need. Here, we will implement a state monad. This monad carries a state which can be read by get and written by put.

With such a monad and the following program,

let (r,s) = run 42 @@
  let** a = get in
  let** () = put (a+1) in
  return 12
 in Printf.printf "result = %d state = %d\n" r s

we would have the result:

result = 12 state = 43

Here, run initializes the state and extract the result and the state from the evaluated monads.

Contrary to the option and list monads, the get monad must obtain the state from somewhere. Then the idea is to make run and bind to feed them with a state which will be passed through the sequence of monad. In other words, get is a function. It has an argument which contains the state, and returns a pair with the result and the new state.

With some monads, it can be adequate to hide the actual state and only permits accesses through the functions provided by the monad. A t type is provided to permits this. An abstract type should be declared by a corresponding .mli file.

An implementation of the state monad can be given by:

type 'state t = State of 'state
let return x (State s) = (x,State s)
let get (State s) = (s,State s)
let put x (State s) = ((),State x)

let bind f g (State s) =
  let (r, State s') = f (State s)
  in g r (State s')
let (>>=) = bind
let (let**) = bind

let run x f =
  let (r,State state) = f (State x)
  in (r,state)

The ppx_let preprocessor

We have seen that lwt comes with a preprocessor which defines some operators like let%lwt, match%lwt. We can acheive an analog declarations with the ppx_let preprocessor.

It declare the following token let%bind, let%map, match%bind, match%map, if%bind, if%map, while%bind. They are designed to work with a open Monad.Let_syntax statement where Monad is one of the monad from the Base/Core library from Jane Street.

With this notation, we can write:

open Core

let () =
  match 
    let open Option.Let_syntax in
    let%bind a = Some 42 in
    let%bind b = Some 6 in
    return (Printf.printf "%d\n" (a*b))
  with
  | Some () -> ()
  | None -> ()

A local open Option.Let_syntax can be handy if the program use different monads.

More information is available here ppx_let page. This includes the convention which must be applied on a custom monad if we want it to be compatible with ppx_let.

Appendix: monad rules

A monad can be created easily by defining:

However, to qualify as a monad, some rules must be verified:

Appendix: supported let operator characters

The following characters are supported when defining a let_ operator:

$ & * + - / = > @ ^ | <

It may be followed by multiple characters from this list:

$ & * + - / = > @ ^ | < ! ? % :

OCaml gives you some liberty about the let_ operator. The usual bind operator is (let*), and the map operator is (let+). The Lwt.Syntax module contains these operators.