Patching gnome-flashback 3.20 to work with GNOME 3.21

I’m running Debian testing, with gnome-flashback. At present, it has installed gnome-flashback 3.20.2-1 alongside gnome-settings-daemon 3.21.90-2.

Symptoms

By default, there are some problems:

  • The “Displays” section of the config tool says only “Could not get screen information”. This prevents GUI access to a lot of functionality, including arrangement of multiple monitors.

  • The brightness adjustment keys do not work.

The root of the problem is that the DBus interface org.gnome.Mutter.DisplayConfig changed between GNOME 3.20 and GNOME 3.21; the GetResources method produces an additional result in 3.21 that is not present in 3.20’s version of the same interface.1

If you see messages similar to the following from stderr of gnome-settings-daemon at startup, you are suffering from this problem too:

(gnome-settings-daemon:7149): power-plugin-WARNING **: Could not create GnomeRRScreen: Method 'GetResources' returned type '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuud)ii)', but expected '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuudu)ii)'
(gnome-settings-daemon:7149): wacom-plugin-WARNING **: Failed to create GnomeRRScreen: Method 'GetResources' returned type '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuud)ii)', but expected '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuudu)ii)'
(gnome-settings-daemon:7149): color-plugin-WARNING **: failed to get screens: Method 'GetResources' returned type '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuud)ii)', but expected '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuudu)ii)'
(gnome-settings-daemon:7149): common-plugin-WARNING **: Failed to construct RR screen: Method 'GetResources' returned type '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuud)ii)', but expected '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuudu)ii)'

Fixing the problem

To fix the problem, I patched and rebuilt gnome-flashback with these commands:

sudo apt-get build-dep gnome-flashback
apt-get source gnome-flashback
wget "https://eighty-twenty.org/files/gnome-flashback-hack-20160918.patch"
patch -p0 < gnome-flashback-hack-20160918.patch
(cd gnome-flashback-3.20.2; fakeroot ./debian/rules binary)

After these steps, you should have a new gnome-flashback binary, gnome-flashback-3.20.2/gnome-flashback/gnome-flashback. You can now move the existing one out of the way and install it:

sudo mv /usr/bin/gnome-flashback /usr/bin/gnome-flashback-AS-INSTALLED
sudo cp gnome-flashback-3.20.2/gnome-flashback/gnome-flashback /usr/bin
sudo chown root:root /usr/bin/gnome-flashback

Now restart both gnome-flashback and gnome-settings-daemon. (You should arrange for gnome-flashback to be started after gnome-settings-daemon.)

You should no longer have the stderr output complaining about the RR screen DBus signatures, and both the “Displays” section of the config tool and the brightness keys should work.

  1. It’s a shame that there’s no apparent protocol versioning in the GNOME system, or at least none that applies here. If DBus had a serialization language a little more like protobufs, it’d be possible to smoothly add fields in a backwards-compatible way. 

Extensible Double Dispatch for Racket

Both Racket’s object system and its (separate!) generic interface system offer single-dispatch object-oriented programming: the choice of method body to execute depends on the type of just one of the arguments given to the method, usually the first one.

In some cases, the first thing that a method will do is to decide what to do next based on the type of a second argument. This is called double dispatch, and it has a long history in object-oriented programming languages—at least as far back as the original Smalltalk.

As an example, consider implementing addition for classes representing numbers. A different method body would be needed for each pair of representations of numbers.

I stumbled across the need for something like this when implementing Operational Transformation (OT) for Racket. The macro operation-transformer in that code base is almost the double-dispatch macro from this post; the difference is that for operational transformation, the method concerned yields two results, and if the arguments are switched on the way in, they must be switched on the way out.

Basic Double Dispatch

Here’s a basic double-dispatch macro:

(define-syntax-rule (double-dispatch op (arg1 arg2) [pred? body ...] ...)
  (cond
    [(pred? arg2) body ...] ...
    [else (error 'op "Unimplemented for ~v and ~v" arg1 arg2)]))

It assumes that it will be used in a method where dispatch has already been done on arg1, and that the next step is to inspect arg2. It applies the pred?s in sequence until one of them answers true, and then evaluates the corresponding body. If none of the pred?s hold, it signals an error.

It’s often convenient to use it inside a class definition or generic interface implementation with the following macros, which simply define op to delegate immediately to double-dispatch. The first is to be used with Racket’s object system, where the first argument is bound implicitly to this and where predicates should use Racket’s is-a? function. The second is to be used with Racket’s generic interface system, where both arguments are explicitly specified and predicates are more general.

(define-syntax-rule (define/public/double-dispatch (op arg2) [class body ...] ...)
  (define/public (op arg2)
    (double-dispatch (lambda (a b) (send a op b)) (this arg2)
      [(lambda (v) (is-a? v class)) body ...] ...)))

(define-syntax-rule (define/double-dispatch (op arg1 arg2) [pred? body ...] ...)
  (define (op arg1 arg2)
    (double-dispatch op (arg1 arg2) [pred? body ...] ...)))

Commutative Double Dispatch

For commutative operations like addition, it’s common to see the same code appear for adding an A to a B as for adding a B to an A.

The next macro automatically flips its arguments and tries again to see if B’s method has support for A, if it can’t find support for B within A’s method. That way, code for combining B with A need only be supplied in one place. It uses a parameter to keep track of whether it’s currently trying out a flipped pair of arguments.

(define trying-flipped? (make-parameter #f))

(define-syntax-rule (commutative-double-dispatch op (arg1 arg2) [pred? body ...] ...)
  (cond
    [(pred? arg2) (parameterize ((trying-flipped? #f)) body ...)] ...
    [(trying-flipped?) (error 'op "Unimplemented for ~v and ~v" arg2 arg1)]
    [else (parameterize ((trying-flipped? #t)) (op arg2 arg1))]))

Writing a simple wrapper works well for using commutative-double-dispatch in a class definition:

(define-syntax-rule (define/public/commutative-double-dispatch (op arg2) [class body ...] ...)
  (define/public (op arg2)
    (commutative-double-dispatch (lambda (a b) (send a op b)) (this arg2)
      [(lambda (v) (is-a? v class)) body ...] ...)))

but a wrapper for use with the generic interface system needs to take care not to accidentally shadow the outer dispatch mechanism. This macro uses define/generic to make op* an alias of op that always does a full dispatch on its arguments:

(define-syntax-rule (define/commutative-double-dispatch (op arg1 arg2) [pred? body ...] ...)
  (begin (define/generic op* op)
         (define (op arg1 arg2)
           (commutative-double-dispatch op* (arg1 arg2) [pred? body ...] ...))))

Examples

Let’s see the system in operation! First, using Racket’s object system, and then using Racket’s generic interfaces.

Example Scenario

We will first define two types of value foo and bar, each responding to a single doubly-dispatched method, operator which produces results according to the following table:

     | foo | bar |
-----|-----|-----|
 foo | foo | bar |
 bar | bar | foo |
-----|-----|-----|

Then, we’ll extend the system to include a third type, zot, which yields a zot when combined with any of the three types.

Double Dispatch with Classes

(define foo%
  (class object%
    (super-new)
    (define/public/commutative-double-dispatch (operator other)
      [foo% (new foo%)]
      [bar% (new bar%)])))

(define bar%
  (class object%
    (super-new)
    (define/public/commutative-double-dispatch (operator other)
      [bar% (new foo%)])))

Some tests show that this is doing what we expect. Notice that we get the right result when the first operand is a bar% and the second a foo%, even though bar% only explicitly specified the case for when the second operand is also a bar%. This shows the automatic argument-flipping in operation.

(module+ test
  (require rackunit)
  (check-true (is-a? (send (new foo%) operator (new foo%)) foo%))
  (check-true (is-a? (send (new foo%) operator (new bar%)) bar%))
  (check-true (is-a? (send (new bar%) operator (new foo%)) bar%))
  (check-true (is-a? (send (new bar%) operator (new bar%)) foo%)))

Double Dispatch with Generic Interfaces

(define-generics operand
  (operator operand other))

(struct foo ()
  #:methods gen:operand
  [(define/commutative-double-dispatch (operator this other)
     [foo? (foo)]
     [bar? (bar)])])

(struct bar ()
  #:methods gen:operand
  [(define/commutative-double-dispatch (operator this other)
     [bar? (foo)])])

The tests show the same argument-flipping behavior as for the object system above.

(module+ test
  (require rackunit)
  (check-true (foo? (operator (foo) (foo))))
  (check-true (bar? (operator (foo) (bar))))
  (check-true (bar? (operator (bar) (foo))))
  (check-true (foo? (operator (bar) (bar)))))

Extending The Example

First, we implement and test class zot%

(define zot%
  (class object%
    (super-new)
    (define/public/commutative-double-dispatch (operator other)
      [foo% (new zot%)]
      [bar% (new zot%)]
      [zot% (new zot%)])))

(module+ test
  (require rackunit)
  (check-true (is-a? (send (new foo%) operator (new zot%)) zot%))
  (check-true (is-a? (send (new bar%) operator (new zot%)) zot%))
  (check-true (is-a? (send (new zot%) operator (new foo%)) zot%))
  (check-true (is-a? (send (new zot%) operator (new bar%)) zot%))
  (check-true (is-a? (send (new zot%) operator (new zot%)) zot%)))

… and then implement and test struct zot.

(struct zot ()
  #:methods gen:operand
  [(define/commutative-double-dispatch (operator this other)
     [foo? (zot)]
     [bar? (zot)]
     [zot? (zot)])])

(module+ test
  (require rackunit)
  (check-true (zot? (operator (foo) (zot))))
  (check-true (zot? (operator (bar) (zot))))
  (check-true (zot? (operator (zot) (foo))))
  (check-true (zot? (operator (zot) (bar))))
  (check-true (zot? (operator (zot) (zot)))))

Conclusion

Double dispatch is a useful addition to the object-oriented programmer’s toolkit, and can be straightforwardly added to both of Racket’s object systems using its macro facility.


This post was written as executable, literate Racket. You can download the program from here.

Our operating systems are incorrectly factored

Unix famously represents all content as byte sequences. This was a great step forward, offering a way of representing arbitrary information without forcing an interpretation on it.

However, it is not enough. Unix is an incomplete design. Supporting only byte sequences, and nothing else, has caused wasted effort, code duplication, and bugs.

Text is an obvious example of the problem

Consider just one data type: text. It has a zillion character sets and encoding schemes. Each application must decide, on its own, which encoding of which character set is being used for a given file.

When applications get this wrong, both obvious bugs like Mojibake and subtler flaws like the IDN homograph attack result.

Massive duplication of code and effort

Lack of system support for text yields massive code duplication. Rather than having a system-wide, comprehensive model of text representation, encoding, display, input, collation, and comparison, each programming language and application must fend for itself.

Because it is difficult and time consuming to properly handle text, developers tend to skimp on text support. Where a weakness is identified, it must be repaired in each application individually rather than at the system level. This is itself difficult and time consuming.

Inconsistent treatment

Finally, dealing only with byte sequences precludes consistent user interface design.

Consider a recent enhancement to Thunderbird, landing in version 45.0. Previously, when exporting an address book as CSV, only the “system character set” was supported. Now, the user must specify which character set and encoding is to be used:

Illustration from the Thunderbird 45.0 release notes

The user cannot simply work with a file containing text; they must make a decision about which encoding to use. Woe betide them if they choose incorrectly.

A consistent approach would separate the question of text encodings entirely from application-specific UIs. System UI for transcoding would exist in one place, common to all applications.

User frustration

A tiny fraction of the frustration this kind of thing causes is recorded in Thunderbird’s bug 117236.

Notice that it took fourteen years to be fixed.

Ubiquitous problem

This Thunderbird change is just one example. Each and every application suffers the same problems, and must have its text support repaired, upgraded, and enhanced independently.

It’s not only a Unix problem. Windows and OS X are just as bad. They, too, offer no higher-level model than byte sequences to their applications. Even Android is a missed opportunity.

Learn from the past

Systems like Smalltalk, for all their flaws, offer a higher-level model to programmers and users alike. In many cases, the user never need learn about text encoding variations.

Instead, the system can separate text from its encoding.

Where encoding is relevant to the user, there can be a single place to work with it. Contrast this with the many places encoding leaks into application UIs today, just one of which is shown in the Thunderbird example above.

It’s not just text

Text is just one example. Pictures are another. You can probably think of more.

Our operating systems do not support sharing of high-level abstractions of data between documents or applications.

An operating system with a mechanism for doing so would take a great burden off both programmers and users.

Let’s start thinking about what a better modern operating system would look like.

Javascript syntax extensions using Ohm

Programming language designers often need to experiment with syntax for their new language features. When it comes to Javascript, we rely on language preprocessors, since altering a Javascript engine directly is out of the question if we want our experiments to escape the lab.

Ohm is “a library and domain-specific language for parsing and pattern matching.” In this post, I’m going to use it as a Javascript language preprocessor. I’ll build a simple compiler for ES5 extended with a new kind of for loop, using Ohm and the ES5 grammar included with it.

All the code in this post is available in a Github repo.

Our toy extension: “for five”

We will add a “for five” statement to ES5, which will let us write programs like this:

for five as x { console.log("We have had", x, "iterations so far"); }

The new construct simply runs its body five times in a row, binding a loop variable in the body. Running the program above through our compiler produces:

for (var x = 0; x < 5; x++) { console.log("We have had", x, "iterations so far"); }

Extending the ES5 grammar

We write our extension to the ES5 grammar in a new file for5.ohm as follows:

For5 <: ES5 {
  IterationStatement += for five as identifier Statement  -- for5_named

  five = "five" ~identifierPart
  as = "as" ~identifierPart

  keyword += five
           | as
}

Let’s take this a piece at a time. First of all, the declaration For5 <: ES5 tells Ohm that the new grammar should be called For5, and that it inherits from a grammar called ES5. Next,

  IterationStatement += for five as identifier Statement  -- for5_named

extends the existing ES5 grammar’s IterationStatement nonterminal with a new production that will be called IterationStatement_for5_named.

Finally, we define two new nonterminals as convenient shorthands for parsing the two new keywords, and augment the existing keyword definition:

five = "five" ~identifierPart
as = "as" ~identifierPart

keyword += five
         | as

There are three interesting points to be made about keywords:

  • First of all, making something a keyword rules it out as an identifier. In our extended language, writing var five = 5 is a syntax error. Define new keywords with care!

  • We make sure to reject input tokens that have our new keywords as a prefix by defining them as their literal text followed by anything that cannot be parsed as a part of an identifier, ~identifierPart. That way, the compiler doesn’t get confused by, say, fivetimes or five_more, which remain valid identifiers.

  • By making sure to extend keyword, tooling such as syntax highlighters can automatically take advantage of our extension, if they are given our extended grammar.

Translating source code using the new grammar

First, require the ohm-js NPM module and its included ES5 grammar:

var ohm = require('ohm-js');
var ES5 = require('ohm-js/examples/ecmascript/es5.js');

Next, load our extended grammar from its definition in for5.ohm, and compile it. When we compile the grammar, we pass in a namespace that makes the ES5 grammar available under the name our grammar expects, ES5:

var grammarSource = fs.readFileSync(path.join(__dirname, 'for5.ohm')).toString();
var grammar = ohm.grammar(grammarSource, { ES5: ES5.grammar });

Finally, we define the translation from our extended language to plain ES5. To do this, we extend a semantic function, modifiedSource, adding a method for each new production rule. Ohm automatically uses defaults for rules not mentioned in our extension.

var semantics = grammar.extendSemantics(ES5.semantics);
semantics.extendAttribute('modifiedSource', {
  IterationStatement_for5_named: function(_for, _five, _as, id, body) {
    var c = id.asES5;
    return 'for (var '+c+' = 0; '+c+' < 5; '+c+'++) ' + body.asES5;
  }
});

Each parameter to the IterationStatement_for5_named method is a syntax tree node corresponding positionally to one of the tokens in the definition of the parsing rule. Accessing the asES5 attribute of a syntax tree node computes its translated source code. This is done with recursive calls to the modifiedSource attribute where required.

Our compiler is, at this point, complete. To use it, we need code to feed it input and print the results:

function compileExtendedSource(inputSource) {
  var parseResult = grammar.match(inputSource);
  if (parseResult.failed()) console.error(parseResult.message);
  return parseResult.succeeded() && semantics(parseResult).asES5;
}

That’s it!

> compileExtendedSource("for five as x { console.log(x); }");
'for (var x = 0; x < 5; x++) { console.log(x); }'

Discussion

This style of syntactic extension is quite coarse-grained: we must translate whole compilation units at once, and must specify our extensions separately from the code making use of them. There is no way of adding a local syntax extension scoped precisely to a block of code that needs it (known to Schemers as let-syntax). For Javascript, sweet.js offers a more Schemely style of syntax extension than the one explored in this post.

Mention of sweet.js leads me to the thorny topic of hygiene. Ohm is a parsing toolkit. It lets you define new concrete syntax, but doesn’t know anything about scope, or about how you intend to use identifiers. After all, it can be used for languages that don’t necessarily even have identifiers. So when we write extensions in the style I’ve presented here, we must write our translations carefully to avoid unwanted capture of identifiers. This is a tradeoff: the broad generality of Ohm’s parsing in exchange for less automation in identifier handling.

Ohm’s extensible grammars let us extend any part of the language, not just statements or expressions. We can specify new comment syntax, new string syntax, new formal argument list syntax, and so on. Because Ohm is based on parsing expression grammars, it offers scannerless parsing. Altering or extending a language’s lexical syntax is just as easy as altering its grammar.

Conclusion

We have defined an Ohm-based compiler for an extension to ES5 syntax, using only a few lines of code. Each new production rule requires, roughly, one line of grammar definition, and a short method defining its translation into simpler constructs.

You can try out this little compiler, and maybe experiment with your own extensions, by cloning its Github repo.

Racket alists vs. hashtables: which is faster when?

Joe Marshall recently measured alist vs hashtable lookup time for MIT/GNU Scheme. I thought I’d do the same for Racket.

I used a 64-bit build of Racket, version 6.3.0.3.

I ran some very quick-and-dirty informal experiments on my Acer C720 laptop, which is running a 64-bit Debian system and has 2GB RAM and a two-core Celeron 2955U at 1.40GHz.

I measured approximate alist and hash table performance for

  • fixnum keys, using eq? as the lookup predicate (so assq, hasheq) (fixnum program)
  • length-64 byte vector keys, using equal? as the lookup predicate (so assoc, hash) (byte-vector program)

Each chart below has four data series:

  • probe/alist, average time taken to search for a key that is present in an alist
  • probe/alist/missing, average time taken to search for a key that is not present in an alist
  • probe/hasheq or probe/hash, average time taken to search for a key that is present in a hash table
  • probe/hasheq/missing or probe/hasheq/missing, average time taken to search for a key that is not present in a hash table

Fixnum keys

Here are average timings for fixnum keys:

Results for fixnums and assq/hasheq

Things to note:

  • Alists are here always faster than hasheq tables for 7 keys or fewer, whether the key is present or not.

  • When the key is present in the lookup table, alists are on average faster up to around 14 keys or so.

Length-64 byte vector keys

Here are average timings for length-64 random byte vector keys:

Results for length-64 byte vectors and assoc/hash

Things to note:

  • Alists are here always faster than hasheq tables for 4 keys or fewer, whether the key is present or not.

  • When the key is present in the lookup table, alists are on average faster up to around 16 keys or so.

Conclusions

Alists will be faster when you have very few keys - for eq?, around seven or fewer, or for equal?, perhaps only as many as four, depending on the size of each key.

If you expect with high probability that a given key will be present in the table, the picture changes slightly: then, alists may be faster on average up to around perhaps fifteen keys. Specifics of the insertion order of your keys will naturally be very important in this case.

Resources

The programs I wrote:

The data I collected:

gRPC.io is interestingly different from CORBA

gRPC looks very interesting. From a quick browse of the site, it looks like it differs from CORBA primarily in that

  • It is first-order.
  • It eschews exceptions.
  • It supports streaming requests and/or responses.

(That’s setting aside differences between protobufs and GIOP.)

It’s the first point that I think is likely to be the big win. Much of the complexity I saw with CORBA was to do with trying to pass object (i.e. service endpoint) references back and forth in a transparent way. Drop that misfeature, and everything from the IDL to the protocol to the frameworks to the error handling to the implementations of services themselves will be much simpler.

The way streaming is integrated is interesting too. There’s a clear separation between (finite) data, including lists/arrays, in the protobuf message-definition language, and (possibly non-finite) behavior in the gRPC service-definition language. Streams, being coinductive, fit naturally in the service-definition part.

Saving Images Despite Unfriendly Websites

From time to time, one stumbles across a website that has gone out of its way to make it difficult to save images displayed on the page. A common tactic is to disable right-click for the elements concerned.

The following bookmarklet is a simple workaround.

To use it,

  1. drag it to your bookmarks bar
  2. when you’re on a page that won’t let you right-click to save images, click the bookmarklet.
  3. now, click on any image in the page to simply make your browser go directly to that image.
  4. from here, you can use the browser’s “save” functionality directly.

Downloadable Prebuilt Binary RabbitMQ Plugins

From time to time, I make binary snapshot downloads of the RabbitMQ plugins that I have developed. You will be able to find the most recent downloads from this page. I sign the downloadable files with my public key.

Older builds for previous versions of RabbitMQ may be available but not linked above; check the directory itself for a full list.

Google is Balkanising the Global Chat Network

Google will split their portion of the global chat network off from everyone else on the 16th of February.

This balkanising move means that you won’t be able to use a Google-based chat account to contact people using non-Google chat account addresses anymore.

If you’re still using a Google-based account for chat, I recommend getting a chat account with a provider committed to open interoperability.

How to get a global, non-Google chat account

Any of the providers listed in the directory at xmpp.net should work well. I’ve personally had good recommendations for jabber.at, jabber.cz, and jabber.meta.net.nz.

You can use almost any chat client to connect; some web-based ones do exist but it’s probably simpler and less bother to download a chat client to run on your computer or cellphone directly.

I recommend:

It’s also easy to connect using your own DNS domain

For those of you who have DNS domains of your own, and would like a yourname@yourdomain.com chat address, I heartily recommend http://hosted.im/. All you do is add a couple of DNS records, register at the site, and you have your own personal chat domain. You can create accounts for yourself and your friends there, and have full interoperability and federation with the global chat network.

Resist Balkanisation; support independent providers

The XMPP global chat network is the only open, federated chat network still standing. As such, it deserves our support against the hostile moves of large companies like Google, Apple, and Microsoft. Supporting smaller chat account providers and/or running our own servers is a great way to help keep the internet commons healthy.

Monads in Dynamically-Typed Languages

In this post, I’ll build a monad library for Racket, using its generic interfaces feature in place of Haskell’s type classes.

There have been lots of demonstrations of monads in dynamically-typed languages before; notably, Oleg’s article on monads in Scheme. However, I haven’t yet (Update: see below) seen one that smoothly allows for return-type polymorphism.

This post introduces (1) coercions between monad representations and (2) “undecided” quasi-monadic placeholder values. Taken together, these give a Haskell-like feel for monads in a dynamically-typed language.

The techniques are illustrated using Racket, but aren’t specific to Racket.

The Challenge

How can we write monadic programs like this, where monad types are implicit, just as they are in comparable Haskell programs?

(run-io (do (mdisplay "Enter a number: ")
            n <- mread
            all-n <- (return (for/list [(i n)] i))
            evens <- (return (do i <- all-n
                                 #:guard (even? i)
                                 (return i)))
            (return evens)))

The overall computation uses do-notation at IO type, while the evens list uses do-notation at List type.

The problem

Simple monomorphic monads are easy. Here’s the List monad in Racket:

(define (fail)      (list))
(define (return x)  (list x))
(define (bind xs f) (append-map f xs))

Let’s add some simple do-notation sugar:

(define-syntax do
  (syntax-rules (<-)
    [(_ mexp)                 mexp]
    [(_ var <- mexp rest ...) (bind mexp (λ (var)   (do rest ...)))]
    [(_ mexp rest ...)        (bind mexp (λ (dummy) (do rest ...)))]
    [(_ #:guard exp rest ...) (if exp (do rest ...) (fail))]))

Now we can write list comprehensions. For example, let’s select the odd numbers from a list, and multiply each by two:

>>> (do i <- '(1 2 3 4 5)
        #:guard (odd? i)
        (return (* i 2)))
'(2 6 10)

Let’s define another monad! How about for Racket’s lazy streams?

(define (fail)     empty-stream)
(define (return x) (stream-cons x empty-stream))
(define (bind s f) (if (stream-empty? s)
                       empty-stream
                       (let walk ((items (f (stream-first s))))
                         (if (stream-empty? items)
                             (bind (stream-rest s) f)
                             (stream-cons (stream-first items)
                                          (walk (stream-rest items)))))))

This is a fine definition, as far as it goes, but we’ve just run into the first problem:

Monads behave differently at different types

Simply defining fail, return and bind as functions won’t work. We need some kind of dynamic method dispatch. Haskell does this by dispatching using dictionary passing.

But there’s a deeper problem, too:

Monads don’t learn their types until it’s too late

How should we choose which dictionary to use? It’s easy with bind, because the specific monad being bound is always given as bind’s first argument. The bind implementation can be a method on the monad’s class.

This doesn’t work for return or fail. When they’re called, they don’t know what kind of monad they should produce. That only becomes clear later, when their results flow into a bind.

Haskell uses the type system. We have to do it at runtime, on a case-by-case basis.

A solution

To address these problems, we explicitly represent the dictionary we need:

(struct monad-class (failer    ;; -> (M a)
                     returner  ;; a -> (M a)
                     binder    ;; (M a) (a -> (M b)) -> (M b)
                     coercer)) ;; N (M a) -> (N a)

The failer, returner and binder methods correspond to the monad operations we met above, but coercer is new. It lets us dynamically reinterpret a monad at a different monadic type. We’ll see its use below.

Now we can define our List monad class:

(define List
  (monad-class (λ () '())
               (λ (x) (list x))
               (λ (xs f) (append-map (λ (x) (coerce List (f x))) xs))
               not-coercable))

The not-coercable function is for use when it’s not possible to reinterpret the type of a monad, which is the overwhelmingly common case. It checks to make sure we’re already at the expected type, and raises an exception if we’re not.

(define (not-coercable N m)
  (if (eq? (monad->monad-class m) N)
      m
      (error 'coerce)))

We need some way of learning which dictionary to use for a given monad instance. It’s here that Racket’s generics come in:

(define-generics monad
  (monad->monad-class monad)
  #:defaults ([null? (define (monad->monad-class m) List)]
              [pair? (define (monad->monad-class m) List)]))

This declares a new generic interface, gen:monad, with a single method, monad->monad-class. It supplies implementations of gen:monad for the builtin types null? and pair?, in each case returning the List monad-class when asked.

Now we can implement bind and coerce, not just for a single monad, but for all monads:

(define (bind ma a->mb)
  (define binder (monad-class-binder (monad->monad-class ma)))
  (binder ma a->mb))

(define (coerce N ma)
  (define coercer (monad-class-coercer (monad->monad-class ma)))
  (coercer N ma))

They use monad->monad-class to find the relevant dictionary, extract the right method from the dictionary, and delegate to it.

Note that bind dispatches on its first argument, while coerce dispatches on its second argument: monad instances get to decide how they are bound and how (or if!) they are to be interpreted at other monad types.

Next, we introduce neutral quasi-monad instances for when return and fail don’t know which monad type to use. Because we might bind such a quasi-monad before it collapses into a particular type, we need to define a quasi-monad to represent binds in superposition, too.

(struct return (value)
        #:methods gen:monad [(define (monad->monad-class m) Return)])
(struct fail ()
        #:methods gen:monad [(define (monad->monad-class m) Fail)])
(struct suspended-bind (ma a->mb)
        #:methods gen:monad [(define (monad->monad-class m) Bind)])

The Return method dictionary looks very similar to the implementation of the identity monad, with the addition of a return-struct box around the carried value:

(define Return
  (monad-class (λ () (error 'fail))
               (λ (x) (return x))
               (λ (m f) (f (return-value m)))
               (λ (N m) ((monad-class-returner N) (return-value m)))))

There are two interesting things to note here:

  1. The bind implementation is a direct statement of one of the monad laws,

    return x >>= f ≡ f x
    
  2. The coerce implementation rebuilds the monad using the return implementation from the target monad.

The Return coerce implementation gives us the equivalent of return-type polymorphism, letting us avoid sprinkling monad type annotations throughout our code.

Fail and Bind are implemented similarly: 1

(define Fail
  (monad-class 'invalid
               'invalid
               (λ (ma a->mb) (suspended-bind ma a->mb))
               (λ (N m) ((monad-class-failer N)))))

The coerce method for an undetermined failure calls the new monad-class’s failer method to produce the correct representation for the final monad.

(define Bind
  (monad-class 'invalid
               'invalid
               (λ (ma a->mb) (suspended-bind ma a->mb))
               (λ (N m) (bind (coerce N (suspended-bind-ma m))
                              (suspended-bind-a->mb m)))))

Likewise, coerce for a suspended-bind re-binds its parameters, after first coercing them into the now-known monad.

You might have noticed a call to coerce in the bind implementation for List above:

(λ (xs f) (append-map (λ (x) (coerce List (f x))) xs))

This is necessary because f might simply (return 1), which is one of these neutral quasi-monad instances that needs to be told what kind of thing it should be before it can be used. On the other hand, f might have returned (list 1), in which case List’s coercer method will notice that no conversion needs to be done.

The IO Monad

Racket is an eager, imperative language, and so doesn’t need an IO monad the way Haskell does. But we can define one anyway, to see what it would be like.

It’s also useful as a template for implementing unusual control structures for monadic embedded domain-specific languages.

Values in the IO type denote sequences of delayed actions. We’ll represent these using two structure types, one for return and one for bind:

(struct io-return (thunk)
        #:methods gen:monad [(define (monad->monad-class m) IO)])
(struct io-chain (io k)
        #:methods gen:monad [(define (monad->monad-class m) IO)])

(define IO
  (monad-class (λ () (error 'fail))
               (λ (x) (io-return (λ () x)))
               (λ (ma a->mb) (io-chain ma a->mb))
               not-coercable))

Lifting Racket’s primitive input/output procedures into our IO type is straightforward:

(define (mdisplay x) (io-return (λ () (display x))))
(define mnewline     (io-return newline))
(define mread        (io-return read))

Finally, we need a way to actually execute a built-up chain of actions:

(define (run-io io)
  (match (coerce IO io)
    [(io-return thunk) (thunk)]
    [(io-chain io k) (run-io (k (run-io io)))]))

Revisiting the Challenge Problem

We now have all the pieces we need to run the challenge program given above.

The challenge comes from the end of Oleg’s article on monadic programming in Scheme. His example uses many different types of monad together, using the monomorphic technique he developed:

(IO::run
 (IO::>> (put-line "Enter a number: ")
         (IO::>>= (read-int)
                  (λ (n)
                    (IO::>>= (IO::return (for/list [(i n)] i))
                             (λ (all-n)
                               (IO::>>=
                                (IO::return
                                 (List:>>= all-n
                                           (λ (i)
                                             (if (even? i)
                                                 (List::return i)
                                                 (List::fail "odd")))))
                                (λ (evens) (IO::return evens)))))))))

With our polymorphic monads, we are able to use the generic do-notation macro defined above, and write instead

(run-io (do (mdisplay "Enter a number: ")
            n <- mread
            all-n <- (return (for/list [(i n)] i))
            evens <- (return (do i <- all-n
                                 #:guard (even? i)
                                 (return i)))
            (return evens)))

which compares favourably to the Haskell

main = do putStr "Enter a number: "
          n <- getInt
          allN <- return [0 .. n-1]
          evens <- return $ do i <- allN
                               guard $ i `mod` 2 == 0
                               return i
          return evens

Implementation

A somewhat fleshed-out implementation of these ideas is available here:

  • It uses prop:procedure, making monad-classes directly callable and allowing (coerce List foo) to be written as (List foo).

  • It uses the double-dispatch pattern to let monad instances flexibly adapt themselves to each other in coercions. For example, lists, streams and sets can all be considered interconvertible.

  • It uses Racket’s built-in exception type to represent pending failures instead of the simple structure sketched above. This way, failures may carry stack traces and an informative error message.

  • It includes a simple State monad, with sget and sput operations.

I based this work on some experiments I ran back in 2006 in writing a monadic library for Scheme. Racket offers much better support for object-oriented/generic programming (among many other things!) than vanilla Scheme does, which makes for smoother integration between my monad library and the rest of the language.

Ben Wolfson pointed me to his monad library for clojure in a reddit thread about this post. It seems to be using a very similar approach, with “neutral” carriers for monadic values that stay neutral until a later point in the program. The main difference seems to be that with my approach, no “run” step is needed for some monads, since monadic values carry their bind implementation with them. Ben’s library supplies the bind implementation at “run” time instead.

Update, 20150127: Paul Khuong pointed me to his Scheme monad implementation based around delimited continuations in this twitter conversation. It doesn’t have dynamic-dispatch for automatically determining the appropriate bind implementation, but I suspect it could be added; perhaps in his join operator, which makes an interesting contrast to the technique explored in this post. The really nice thing about the delimited continuation approach to monads is that the return is implicit. You never mention return explicitly in monadic code. That makes a big difference, because return-type polymorphism is only needed to handle explicit returns! By placing the return right next to the appropriate run operation, the delimited continuation technique neatly sidesteps the problem.

Conclusions

Haskell’s return is generic: it injects a value into any monad at all. Which specific monad is used depends on the context. This is impossible to directly implement in general without static analysis.

Lacking such static analyses in dynamically-typed languages, achieving the equivalent of Haskell’s return is challenging, even though implementing particular monomorphic variations is straightforward.

The trick is to make return produce a kind of superposition which doesn’t collapse to a specific monad type until it touches a monad that already knows what type it is.

In this post, I’ve shown how this kind of monad “coercion” works well to give the kind of return-type polymorphism we need to make monads feel in Racket similar to the way they do in Haskell.

  1. No implementations are given for their failer and returner methods, since they are only quasi-monad classes, and it’s not possible for user code to ever be in the Fail or Bind “monads”.