Google is Balkanising the Global Chat Network

Google will split their portion of the global chat network off from everyone else on the 16th of February.

This balkanising move means that you won’t be able to use a Google-based chat account to contact people using non-Google chat account addresses anymore.

If you’re still using a Google-based account for chat, I recommend getting a chat account with a provider committed to open interoperability.

How to get a global, non-Google chat account

Any of the providers listed in the directory at xmpp.net should work well. I’ve personally had good recommendations for jabber.at, jabber.cz, and jabber.meta.net.nz.

You can use almost any chat client to connect; some web-based ones do exist but it’s probably simpler and less bother to download a chat client to run on your computer or cellphone directly.

I recommend:

It’s also easy to connect using your own DNS domain

For those of you who have DNS domains of your own, and would like a yourname@yourdomain.com chat address, I heartily recommend http://hosted.im/. All you do is add a couple of DNS records, register at the site, and you have your own personal chat domain. You can create accounts for yourself and your friends there, and have full interoperability and federation with the global chat network.

Resist Balkanisation; support independent providers

The XMPP global chat network is the only open, federated chat network still standing. As such, it deserves our support against the hostile moves of large companies like Google, Apple, and Microsoft. Supporting smaller chat account providers and/or running our own servers is a great way to help keep the internet commons healthy.

Monads in Dynamically-Typed Languages

In this post, I’ll build a monad library for Racket, using its generic interfaces feature in place of Haskell’s type classes.

There have been lots of demonstrations of monads in dynamically-typed languages before; notably, Oleg’s article on monads in Scheme. However, I haven’t yet (Update: see below) seen one that smoothly allows for return-type polymorphism.

This post introduces (1) coercions between monad representations and (2) “undecided” quasi-monadic placeholder values. Taken together, these give a Haskell-like feel for monads in a dynamically-typed language.

The techniques are illustrated using Racket, but aren’t specific to Racket.

The Challenge

How can we write monadic programs like this, where monad types are implicit, just as they are in comparable Haskell programs?

(run-io (do (mdisplay "Enter a number: ")
            n <- mread
            all-n <- (return (for/list [(i n)] i))
            evens <- (return (do i <- all-n
                                 #:guard (even? i)
                                 (return i)))
            (return evens)))

The overall computation uses do-notation at IO type, while the evens list uses do-notation at List type.

The problem

Simple monomorphic monads are easy. Here’s the List monad in Racket:

(define (fail)      (list))
(define (return x)  (list x))
(define (bind xs f) (append-map f xs))

Let’s add some simple do-notation sugar:

(define-syntax do
  (syntax-rules (<-)
    [(_ mexp)                 mexp]
    [(_ var <- mexp rest ...) (bind mexp (λ (var)   (do rest ...)))]
    [(_ mexp rest ...)        (bind mexp (λ (dummy) (do rest ...)))]
    [(_ #:guard exp rest ...) (if exp (do rest ...) (fail))]))

Now we can write list comprehensions. For example, let’s select the odd numbers from a list, and multiply each by two:

>>> (do i <- '(1 2 3 4 5)
        #:guard (odd? i)
        (return (* i 2)))
'(2 6 10)

Let’s define another monad! How about for Racket’s lazy streams?

(define (fail)     empty-stream)
(define (return x) (stream-cons x empty-stream))
(define (bind s f) (if (stream-empty? s)
                       empty-stream
                       (let walk ((items (f (stream-first s))))
                         (if (stream-empty? items)
                             (bind (stream-rest s) f)
                             (stream-cons (stream-first items)
                                          (walk (stream-rest items)))))))

This is a fine definition, as far as it goes, but we’ve just run into the first problem:

Monads behave differently at different types

Simply defining fail, return and bind as functions won’t work. We need some kind of dynamic method dispatch. Haskell does this by dispatching using dictionary passing.

But there’s a deeper problem, too:

Monads don’t learn their types until it’s too late

How should we choose which dictionary to use? It’s easy with bind, because the specific monad being bound is always given as bind’s first argument. The bind implementation can be a method on the monad’s class.

This doesn’t work for return or fail. When they’re called, they don’t know what kind of monad they should produce. That only becomes clear later, when their results flow into a bind.

Haskell uses the type system. We have to do it at runtime, on a case-by-case basis.

A solution

To address these problems, we explicitly represent the dictionary we need:

(struct monad-class (failer    ;; -> (M a)
                     returner  ;; a -> (M a)
                     binder    ;; (M a) (a -> (M b)) -> (M b)
                     coercer)) ;; N (M a) -> (N a)

The failer, returner and binder methods correspond to the monad operations we met above, but coercer is new. It lets us dynamically reinterpret a monad at a different monadic type. We’ll see its use below.

Now we can define our List monad class:

(define List
  (monad-class (λ () '())
               (λ (x) (list x))
               (λ (xs f) (append-map (λ (x) (coerce List (f x))) xs))
               not-coercable))

The not-coercable function is for use when it’s not possible to reinterpret the type of a monad, which is the overwhelmingly common case. It checks to make sure we’re already at the expected type, and raises an exception if we’re not.

(define (not-coercable N m)
  (if (eq? (monad->monad-class m) N)
      m
      (error 'coerce)))

We need some way of learning which dictionary to use for a given monad instance. It’s here that Racket’s generics come in:

(define-generics monad
  (monad->monad-class monad)
  #:defaults ([null? (define (monad->monad-class m) List)]
              [pair? (define (monad->monad-class m) List)]))

This declares a new generic interface, gen:monad, with a single method, monad->monad-class. It supplies implementations of gen:monad for the builtin types null? and pair?, in each case returning the List monad-class when asked.

Now we can implement bind and coerce, not just for a single monad, but for all monads:

(define (bind ma a->mb)
  (define binder (monad-class-binder (monad->monad-class ma)))
  (binder ma a->mb))

(define (coerce N ma)
  (define coercer (monad-class-coercer (monad->monad-class ma)))
  (coerce N ma))

They use monad->monad-class to find the relevant dictionary, extract the right method from the dictionary, and delegate to it.

Note that bind dispatches on its first argument, while coerce dispatches on its second argument: monad instances get to decide how they are bound and how (or if!) they are to be interpreted at other monad types.

Next, we introduce neutral quasi-monad instances for when return and fail don’t know which monad type to use. Because we might bind such a quasi-monad before it collapses into a particular type, we need to define a quasi-monad to represent binds in superposition, too.

(struct return (value)
        #:methods gen:monad [(define (monad->monad-class m) Return)])
(struct fail ()
        #:methods gen:monad [(define (monad->monad-class m) Fail)])
(struct suspended-bind (ma a->mb)
        #:methods gen:monad [(define (monad->monad-class m) Bind)])

The Return method dictionary looks very similar to the implementation of the identity monad, with the addition of a return-struct box around the carried value:

(define Return
  (monad-class (λ () (error 'fail))
               (λ (x) (return x))
               (λ (m f) (f (return-value m)))
               (λ (N m) ((monad-class-returner N) (return-value m)))))

There are two interesting things to note here:

  1. The bind implementation is a direct statement of one of the monad laws,

    return x >>= f ≡ f x

  2. The coerce implementation rebuilds the monad using the return implementation from the target monad.

The Return coerce implementation gives us the equivalent of return-type polymorphism, letting us avoid sprinkling monad type annotations throughout our code.

Fail and Bind are implemented similarly: 1

(define Fail
  (monad-class 'invalid
               'invalid
               (λ (ma a->mb) (suspended-bind ma a->mb))
               (λ (N m) ((monad-class-failer N)))))

The coerce method for an undetermined failure calls the new monad-class’s failer method to produce the correct representation for the final monad.

(define Bind
  (monad-class 'invalid
               'invalid
               (λ (ma a->mb) (suspended-bind ma a->mb))
               (λ (N m) (bind (coerce N (suspended-bind-ma m))
                              (suspended-bind-a->mb m)))))

Likewise, coerce for a suspended-bind re-binds its parameters, after first coercing them into the now-known monad.

You might have noticed a call to coerce in the bind implementation for List above:

(λ (xs f) (append-map (λ (x) (coerce List (f x))) xs))

This is necessary because f might simply (return 1), which is one of these neutral quasi-monad instances that needs to be told what kind of thing it should be before it can be used. On the other hand, f might have returned (list 1), in which case List’s coercer method will notice that no conversion needs to be done.

The IO Monad

Racket is an eager, imperative language, and so doesn’t need an IO monad the way Haskell does. But we can define one anyway, to see what it would be like.

It’s also useful as a template for implementing unusual control structures for monadic embedded domain-specific languages.

Values in the IO type denote sequences of delayed actions. We’ll represent these using two structure types, one for return and one for bind:

(struct io-return (thunk)
        #:methods gen:monad [(define (monad->monad-class m) IO)])
(struct io-chain (io k)
        #:methods gen:monad [(define (monad->monad-class m) IO)])

(define IO
  (monad-class (λ () (error 'fail))
               (λ (x) (io-return (λ () x)))
               (λ (ma a->mb) (io-chain ma a->mb))
               not-coercable))

Lifting Racket’s primitive input/output procedures into our IO type is straightforward:

(define (mdisplay x) (io-return (λ () (display x))))
(define mnewline     (io-return newline))
(define mread        (io-return read))

Finally, we need a way to actually execute a built-up chain of actions:

(define (run-io io)
  (match (coerce IO io)
    [(io-return thunk) (thunk)]
    [(io-chain io k) (run-io (k (run-io io)))]))

Revisiting the Challenge Problem

We now have all the pieces we need to run the challenge program given above.

The challenge comes from the end of Oleg’s article on monadic programming in Scheme. His example uses many different types of monad together, using the monomorphic technique he developed:

(IO::run
 (IO::>> (put-line "Enter a number: ")
         (IO::>>= (read-int)
                  (λ (n)
                    (IO::>>= (IO::return (for/list [(i n)] i))
                             (λ (all-n)
                               (IO::>>=
                                (IO::return
                                 (List:>>= all-n
                                           (λ (i)
                                             (if (even? i)
                                                 (List::return i)
                                                 (List::fail "odd")))))
                                (λ (evens) (IO::return evens)))))))))

With our polymorphic monads, we are able to use the generic do-notation macro defined above, and write instead

(run-io (do (mdisplay "Enter a number: ")
            n <- mread
            all-n <- (return (for/list [(i n)] i))
            evens <- (return (do i <- all-n
                                 #:guard (even? i)
                                 (return i)))
            (return evens)))

which compares favourably to the Haskell

main = do putStr "Enter a number: "
          n <- getInt
          allN <- return [0 .. n-1]
          evens <- return $ do i <- allN
                               guard $ i `mod` 2 == 0
                               return i
          return evens

Implementation

A somewhat fleshed-out implementation of these ideas is available here:

  • It uses prop:procedure, making monad-classes directly callable and allowing (coerce List foo) to be written as (List foo).

  • It uses the double-dispatch pattern to let monad instances flexibly adapt themselves to each other in coercions. For example, lists, streams and sets can all be considered interconvertible.

  • It uses Racket’s built-in exception type to represent pending failures instead of the simple structure sketched above. This way, failures may carry stack traces and an informative error message.

  • It includes a simple State monad, with sget and sput operations.

I based this work on some experiments I ran back in 2006 in writing a monadic library for Scheme. Racket offers much better support for object-oriented/generic programming (among many other things!) than vanilla Scheme does, which makes for smoother integration between my monad library and the rest of the language.

Ben Wolfson pointed me to his monad library for clojure in a reddit thread about this post. It seems to be using a very similar approach, with “neutral” carriers for monadic values that stay neutral until a later point in the program. The main difference seems to be that with my approach, no “run” step is needed for some monads, since monadic values carry their bind implementation with them. Ben’s library supplies the bind implementation at “run” time instead.

Update, 20150127: Paul Khuong pointed me to his Scheme monad implementation based around delimited continuations in this twitter conversation. It doesn’t have dynamic-dispatch for automatically determining the appropriate bind implementation, but I suspect it could be added; perhaps in his join operator, which makes an interesting contrast to the technique explored in this post. The really nice thing about the delimited continuation approach to monads is that the return is implicit. You never mention return explicitly in monadic code. That makes a big difference, because return-type polymorphism is only needed to handle explicit returns! By placing the return right next to the appropriate run operation, the delimited continuation technique neatly sidesteps the problem.

Conclusions

Haskell’s return is generic: it injects a value into any monad at all. Which specific monad is used depends on the context. This is impossible to directly implement in general without static analysis.

Lacking such static analyses in dynamically-typed languages, achieving the equivalent of Haskell’s return is challenging, even though implementing particular monomorphic variations is straightforward.

The trick is to make return produce a kind of superposition which doesn’t collapse to a specific monad type until it touches a monad that already knows what type it is.

In this post, I’ve shown how this kind of monad “coercion” works well to give the kind of return-type polymorphism we need to make monads feel in Racket similar to the way they do in Haskell.

  1. No implementations are given for their failer and returner methods, since they are only quasi-monad classes, and it’s not possible for user code to ever be in the Fail or Bind “monads”.

An interesting quote

“Giving a professional illustrator a goal for a poster usually results in what was desired. If one tries this with an artist, one will get what the artist needed to create that day. Sometimes we make, to have, sometimes to know and express.” — Alan Kay

From “The Power of the Context”.

Postfix authentication isn't general enough

I want to set up Postfix SMTP authentication so I have two different passwords to authenticate with: one main one that gives me access to both SMTP and IMAP, for my own use with trusted clients, and a separate one that gives me the ability to send messages via SMTP only, and that will not work for IMAP.

The reason is that gmail will not let me send mail as an auxiliary mail identity through its own SMTP servers anymore, and requires a username and password to authenticate itself to my own SMTP servers. Since I’m unhappy giving gmail my main email password, I decided a second gmail-specific password would be a good idea.

This turns out to be difficult.

Postfix will authenticate either against Dovecot, or using Cyrus-SASL. Cyrus-SASL can talk to a number of things, and one of them is a SQL database, but it won’t let you use crypt for the passwords stored in the database. That’s a showstopper there. The other alternative is to back Cyrus-SASL with PAM, but that involves figuring out PAM. Painful, and another link in the (already long) chain: Postfix → Cyrus-SASL → PAM → Database.

I’ve asked on the Freenode #cyrus channel about the possibility of getting a patch for crypted SQL-database passwords accepted, but no replies yet.

There are a couple of patches out there that get close to what I want; the best (but still not quite right) is this one. It needs work to support other crypt schemes and to extract salt properly.

Perhaps I will give in and learn how to configure PAM…

Update: At the suggestion of the kind people on #cyrus, I’ve decided instead to set up a separate account on my email server for gmail to log into, and simply forward any mail that’s delivered there (by accident, presumably) to my main account.

Thoughts on Common Lisp

I happened across a few notes I made back in October 2011 regarding my experience of using Common Lisp for the first time.

2011-10-14 14:23:55 tonyg

Yesterday I hacked on ~/src/cmsg/lisp.

Network programming wasn’t so hard to get a basic socket listener up and running.

Streams are horribly broken though. No clean separation of encoding from binary I/O. Swank vs raw Slime streams differ, which led to some interesting issues when experimenting with my I/O code interactively. flexi-streams help take away some of the pain, but it’s still pretty awful.

The loop macro is neat. Feels like comprehensions. Comprehensions are pleasant in Erlang and Haskell, and I remember enjoying SRFI-42, but Racket misses them1.

The lack of proper tail recursion is HORRIBLE. Forces some very nasty code. E.g. the tagbody thing.

That everything-is-an-object is GREAT. I wish Racket had this.

Racket’s match is awesome. Totally awesome. Common-lisp pattern matching is in PARLOUS state. There’s nothing remotely close to as nice as Racket’s match. There is no common standard.

Exceptions on end-of-stream simplify the programming of network layers.

2011-10-19 08:06:10 tonyg

The multiple-values implementation is kind of nice, especially in those places where it enables subtle design touches like using floor to get both quotient and remainder at once. That particular example appeals to me also for its mnemonicity: using truncate instead of floor gives you the other variant2.

  1. It seems clear to me today that I had been overlooking Racket’s for, for/list, for/hash etc. I’d still like to see something more general that abstracts away from the variations in these, though.

  2. (the difference is the same as the difference between R6RS’s div/mod and div0/mod0, respectively).

Minimart and Network Calculus at (fourth Racketcon)

Here’s me giving a talk about my research on Minimart and Network Calculus at (fourth Racketcon) on the 20th of September, 2014:

Downloadable Prebuilt Binary RabbitMQ Plugins

From time to time, I make binary snapshot downloads of the RabbitMQ plugins that I have developed. You will be able to find the most recent downloads from this page. I sign the downloadable files with my public key.

Older builds for previous versions of RabbitMQ may be available but not linked above; check the directory itself for a full list.

How to Run the RabbitMQ Tests

RabbitMQ includes a suite of functional tests, as well as unit tests for each of its components. It’s not immediately obvious how to run the functional test suite, but fortunately, it’s straightforward.

You will need:

  • a Java JDK
  • Ant
  • Erlang
  • git
  • and I guess a Unix machine of some description; I don’t imagine running these tests on Windows will work well

Running the complete test suite takes about six minutes on my eight-core, 32GB i7 Linux box. Most of this time is spent sleeping, to ensure particular interleavings required by particular test cases.

Clone the repositories
$ git clone git://github.com/rabbitmq/rabbitmq-public-umbrella
$ cd rabbitmq-public-umbrella
$ git clone git://github.com/rabbitmq/rabbitmq-codegen
$ git clone git://github.com/rabbitmq/rabbitmq-java-client
$ git clone git://github.com/rabbitmq/rabbitmq-server
$ git clone git://github.com/rabbitmq/rabbitmq-test
Build and start the server in one window
$ cd rabbitmq-server
$ make run
Run the tests in another window

The tests are written in Java and come along with the RabbitMQ Java client source code.

Be sure to wait until the server has fully initialised itself before starting ant.

$ cd rabbitmq-java-client
$ ant test-suite

Things should tick along nicely for a few minutes at this point. From time to time, you’ll see the server print a new banner to its console: the tests restart the server occasionally as part of their normal operation.

Check the results

The tests leave their output in rabbitmq-java-client/build/TEST-* files. Each test suite (FunctionalTests, ClientTests, ServerTests, HATests) produces both a plain-text and an XML file summarising the results of the run.

How best to do PGP/GPG key management with Git?

I have parts of my home directory stored in git. In particular, I have my .gnupg/pubring.gpg and .gnupg/trustdb.gpg files in git. This is fine so long as absolutely no branching ever takes place.

My question to you, O lazyweb, is: does a tool exist for merging pubring.gpg etc. in a sensible fashion?

If the answer is “no”, then does a tool exist for exploding the contents of pubring.gpg etc into a directory full of tiny text files which are amenable to standard 3-way-merge?

New benchmarks of NaCl and scrypt in the browser

Back in March, I (crudely) measured the performance of js-nacl and js-scrypt running in the Browser.

Since then, emscripten has improved its code generation significantly, so I’ve rerun my tests. I’ve also thrown in node.js for comparison.1

The takeaway is: asm.js makes a big difference to NaCl and scrypt performance - on Firefox, that is. Firefox is between 2× and 8× faster.2 Other browsers have benefited from the improvements, but not as much.

The setup:

  • Firefox/23.0
  • Chrome/28.0.1500.95
  • Safari/534.59.8
  • node/v0.10.15
  • Macbook Air late 2010 (3,1), 1.6 GHz Core 2 Duo, 4 GB RAM, OS X 10.6.8

I’m running emscripten at git revision b1eaf55e (sources), of 8 August 2013.

(The benchmarks I ran in March were run with rev 4e09482e (sources) of 16 Jan 2013.)

What has changed since the last measurements?

Emscripten’s support for asm.js code generation is much better, and I am also now able to turn on -O2 optimization without things breaking.

On the minus side, the previous builds included a severe memory leak (!) because by default Emscripten includes a stub malloc() and free() implementation that never releases any allocated memory. The current builds include dlmalloc(), and so no longer leak memory, but run ever so slightly slower by comparison to using the stub allocator.

Safari seems to have severe problems with the current builds. I’m unsure where the bug lies (probably emscripten?), but many calls to crypto_box/crypto_box_open and scrypt() yield incorrect results. There are missing entries in the charts below because of this. (No sense in measuring something that isn’t correct.)

Since the previous round, Firefox has gained support for window.crypto.getRandomValues. Hooray!

Hashing (short) strings/bytes with SHA-512

Firefox handily dominates here, making the others look roughly equivalent.

Hash operations (per sec)

(Approximate speedups since January: Chrome = 1.2×; Firefox = 8.3×; Safari = 1.4×; node = 0.95×. Chart.)

Computing a shared key from public/private keys

These are operations whose runtime is dominated by the computation of a Curve25519 operation. In three of the four cases, the operation is used to compute a Diffie-Hellman shared key from a public key and a secret key; in the remaining case (crypto_box_keypair_from_seed) it is used to compute a public key from a secret key. Firefox again dominates, but Chrome is not far behind.

This is one of the areas where Safari yields incorrect results, leading to a missing data point. I’m not yet sure what the cause is.

Shared-key computations (per sec)

(Approximate speedups since January: Chrome = 2.5×; Firefox = 4.8×; Safari = 2×; node = 1×. Chart.)

Computing random nonces

This is a thin wrapper over window.crypto.getRandomValues, or the node.js equivalent, and so has not benefited from the emscripten improvements. I’m including it just to give a feel for how fast randomness-generation is.

Safari wins hands-down here. I wonder how good the generated randomness is?

Random nonce generation (per sec)

(Approximate speedups since January: Chrome = 0.96×; Firefox = 1×; Safari = 1×; node = 1×. Chart.)

Authenticated encryption using a shared key

These are Salsa20/Poly1305 authenticated encryptions using a precomputed shared key. Broadly speaking, boxing was quicker than unboxing. Firefox again dominates.

This another of the areas where Safari yields incorrect results. I’m not yet sure why.

Secret-key operations (per sec)

(Approximate speedups since January: Chrome = 1.5×; Firefox = 2.3×; Safari = 1.3×; node = 1.2×. Chart.)

Producing and validating signatures

These operations compute an elliptic-curve operation, but use the result to produce a digital signature instead of an authenticated/encrypted box. Signature generation is much faster than signature validation here. As for the other elliptic-curve-heavy operations, Firefox is fastest, but Chrome is not far behind.

Signature operations (per sec)

(Approximate speedups since January: Chrome = 1.8×; Firefox = 4.7×; Safari = 3.8×; node = 0.82×. Chart.)

scrypt Key Derivation Function

Here, Safari not only underperforms significantly but computes incorrect results. As above, I’m not sure why.

Firefox is about twice as fast as previously at producing scrypt()-derived keys. Both Firefox and Chrome are usably fast.

Signature operations (per sec)

(Approximate speedups since January: Firefox improved by around 2×; the others were roughly unchanged. Chart.)

Conclusions

scrypt is still slow. Safari has problems with this code, or vice versa. Precompute Diffie-Hellman shared keys where you can. Emscripten + asm.js = very cool!

  1. Of course, it’s a bit silly to include node.js here, since it can simply link against libsodium and get native-speed, optimized routines. Perhaps a Firefox extension could include a native XPCOM component offering a similar speed boost.

  2. The benefit is not quite as much as I claimed it was based on eyeballing the numbers.