From time to time, I make binary snapshot downloads of the RabbitMQ
plugins that I have developed. You will be able to find the most
recent downloads from this page. I sign the downloadable files with
my public key.
Google will split their portion of the global chat network off from
everyone else on the 16th of February.
This balkanising move means that you won’t be able to use a
Google-based chat account to contact people using non-Google chat
account addresses anymore.
If you’re still using a Google-based account for chat, I recommend
getting a chat account with a provider committed to open
interoperability.
You can use almost any chat client to connect; some web-based ones do
exist but it’s probably simpler and less bother to download a chat
client to run on your computer or cellphone directly.
It’s also easy to connect using your own DNS domain
For those of you who have DNS domains of your own, and would like a
yourname@yourdomain.com chat address, I heartily recommend
http://hosted.im/. All you do is add a couple of DNS records,
register at the site, and you have your own personal chat domain. You
can create accounts for yourself and your friends there, and have full
interoperability and federation with the global chat network.
Resist Balkanisation; support independent providers
The XMPP global chat network is the only open, federated chat
network still standing. As such, it deserves our support against the
hostile moves of large companies like Google, Apple, and Microsoft.
Supporting smaller chat account providers and/or running our own
servers is a great way to help keep the internet commons healthy.
In this post, I’ll build a monad
library for Racket, using its
generic interfaces
feature in place of Haskell’s type classes.
There have been lots of demonstrations of monads in dynamically-typed
languages before; notably,
Oleg’s article on monads in Scheme. However, I haven’t
yet (Update:see below) seen one that smoothly
allows for return-type polymorphism.
This post introduces (1) coercions between monad representations and
(2) “undecided” quasi-monadic placeholder values. Taken together,
these give a Haskell-like feel for monads in a dynamically-typed
language.
The techniques are illustrated using Racket, but aren’t specific to
Racket.
The Challenge
How can we write monadic programs like this, where monad types are
implicit, just as they are in comparable Haskell programs?
The overall computation uses do-notation at IO type, while the
evens list uses do-notation at List type.
The problem
Simple monomorphic monads are easy. Here’s the List monad in Racket:
Let’s add some simple do-notation sugar:
Now we can write list comprehensions. For example, let’s select the
odd numbers from a list, and multiply each by two:
This is a fine definition, as far as it goes, but we’ve just run into
the first problem:
Monads behave differently at different types
Simply defining fail, return and bind as functions won’t work.
We need some kind of dynamic method dispatch. Haskell does this by
dispatching using
dictionary passing.
But there’s a deeper problem, too:
Monads don’t learn their types until it’s too late
How should we choose which dictionary to use? It’s easy with bind,
because the specific monad being bound is always given as bind’s
first argument. The bind implementation can be a method on the
monad’s class.
This doesn’t work for return or fail. When they’re called, they
don’t know what kind of monad they should produce. That only becomes
clear later, when their results flow into a bind.
Haskell uses the type system. We have to do it at runtime, on a
case-by-case basis.
A solution
To address these problems, we explicitly represent the dictionary we
need:
The failer, returner and binder methods correspond to the monad
operations we met above, but coercer is new. It lets us dynamically
reinterpret a monad at a different monadic type. We’ll see its use
below.
Now we can define our List monad class:
The not-coercable function is for use when it’s not possible to
reinterpret the type of a monad, which is the overwhelmingly common
case. It checks to make sure we’re already at the expected type, and
raises an exception if we’re not.
We need some way of learning which dictionary to use for a given monad
instance. It’s here that Racket’s generics come in:
This declares a new generic interface, gen:monad, with a single
method, monad->monad-class. It supplies implementations of
gen:monad for the builtin types null? and pair?, in each case
returning the List monad-class when asked.
Now we can implement bind and coerce, not just for a single monad, but for all
monads:
They use monad->monad-class to find the relevant dictionary, extract
the right method from the dictionary, and delegate to it.
Note that bind dispatches on its first argument, while coerce
dispatches on its second argument: monad instances get to decide how
they are bound and how (or if!) they are to be interpreted at other
monad types.
Next, we introduce neutral quasi-monad instances for when return
and fail don’t know which monad type to use. Because we might bind
such a quasi-monad before it collapses into a particular type, we need
to define a quasi-monad to represent binds in superposition, too.
The Return method dictionary looks very similar to the
implementation of the
identity monad,
with the addition of a return-struct box around the carried value:
There are two interesting things to note here:
The bind implementation is a direct statement of one of the monad
laws,
return x >>= f ≡ f x
The coerce implementation rebuilds the
monad using the return implementation from the target monad.
The Return coerce implementation gives us the equivalent of
return-type polymorphism, letting us avoid sprinkling monad type
annotations throughout our code.
The coerce method for an undetermined failure calls the new
monad-class’s failer method to produce the correct representation
for the final monad.
Likewise, coerce for a suspended-bindre-binds its parameters, after first coercing
them into the now-known monad.
You might have noticed a call to coerce in the bind implementation
for List above:
This is necessary because f might simply (return 1), which is one
of these neutral quasi-monad instances that needs to be told what kind
of thing it should be before it can be used. On the other hand, f
might have returned (list 1), in which case List’s coercer method
will notice that no conversion needs to be done.
The IO Monad
Racket is an eager, imperative language, and so doesn’t need an IO
monad the way Haskell does. But we can define one anyway, to see what
it would be like.
It’s also useful as a template for implementing unusual control
structures for monadic embedded domain-specific languages.
Values in the IO type denote sequences of delayed actions. We’ll
represent these using two structure types, one for return and one
for bind:
Lifting Racket’s primitive input/output procedures into our IO type is
straightforward:
Finally, we need a way to actually execute a built-up chain of
actions:
Revisiting the Challenge Problem
We now have all the pieces we need to run the challenge program given
above.
The challenge comes from the end of Oleg’s article on
monadic programming in Scheme. His example uses many different types
of monad together, using the monomorphic technique he developed:
With our polymorphic monads, we are able to use the generic
do-notation macro defined above, and write instead
which compares favourably to the Haskell
Implementation
A somewhat fleshed-out implementation of these ideas is available
here:
It uses
prop:procedure,
making monad-classes directly callable and allowing (coerce List
foo) to be written as (List foo).
It uses the
double-dispatch pattern
to let monad instances flexibly adapt themselves to each other in
coercions. For example, lists, streams and sets can all be
considered interconvertible.
It uses Racket’s built-in exception type to represent pending
failures instead of the simple structure sketched above. This way,
failures may carry stack traces and an informative error message.
It includes a simple State monad, with sget and sput
operations.
Related Work
I based this work on some
experiments I ran back in 2006
in writing a monadic library for Scheme. Racket offers much better
support for object-oriented/generic programming (among many other
things!) than vanilla Scheme does, which makes for smoother
integration between my monad library and the rest of the language.
Ben Wolfson pointed me to his
monad library for clojure in a
reddit thread
about this post. It seems to be using a very similar approach, with
“neutral” carriers for monadic values that stay neutral until a later
point in the program. The main difference seems to be that with my
approach, no “run” step is needed for some monads, since monadic
values carry their bind implementation with them. Ben’s library
supplies the bind implementation at “run” time instead.
Update, 20150127: Paul Khuong pointed me to
his Scheme monad implementation
based around
delimited continuations
in
this twitter conversation.
It doesn’t have dynamic-dispatch for automatically determining the
appropriate bind implementation, but I suspect it could be added;
perhaps in his join operator, which makes an interesting contrast to
the technique explored in this post. The really nice thing about the
delimited continuation approach to monads is that the return is
implicit. You never mention return explicitly in monadic code.
That makes a big difference, because return-type polymorphism is only
needed to handle explicit returns! By placing the return right
next to the appropriate run operation, the delimited continuation
technique neatly sidesteps the problem.
Conclusions
Haskell’s return is generic: it injects a value into any monad at
all. Which specific monad is used depends on the context. This is
impossible to directly implement in general without static analysis.
Lacking such static analyses in dynamically-typed languages, achieving
the equivalent of Haskell’s return is challenging, even though
implementing particular monomorphic variations is straightforward.
The trick is to make return produce a kind of superposition which
doesn’t collapse to a specific monad type until it touches a monad
that already knows what type it is.
In this post, I’ve shown how this kind of monad “coercion” works well
to give the kind of return-type polymorphism we need to make monads
feel in Racket similar to the way they do in Haskell.
No implementations are given for their failer and
returner methods, since they are only quasi-monad classes, and
it’s not possible for user code to ever be in the Fail or Bind
“monads”. ↩
“Giving a professional illustrator a goal for a poster usually
results in what was desired. If one tries this with an artist, one
will get what the artist needed to create that day. Sometimes we
make, to have, sometimes to know and express.” — Alan Kay
I want to set up Postfix SMTP authentication so I have two different
passwords to authenticate with: one main one that gives me access to
both SMTP and IMAP, for my own use with trusted clients, and a
separate one that gives me the ability to send messages via SMTP
only, and that will not work for IMAP.
The reason is that gmail will not let me send mail as an auxiliary
mail identity through its own SMTP servers anymore, and requires a
username and password to authenticate itself to my own SMTP servers.
Since I’m unhappy giving gmail my main email password, I decided a
second gmail-specific password would be a good idea.
This turns out to be difficult.
Postfix will authenticate either against Dovecot, or using Cyrus-SASL.
Cyrus-SASL can talk to a number of things, and one of them is a SQL
database, but it won’t let you use crypt for the passwords stored
in the database. That’s a showstopper there. The other alternative is
to back Cyrus-SASL with PAM, but that involves figuring out PAM.
Painful, and another link in the (already long) chain:
Postfix → Cyrus-SASL → PAM → Database.
I’ve asked on the Freenode #cyrus channel about the possibility of
getting a patch for crypted SQL-database passwords accepted, but no
replies yet.
There are a couple of patches out there that get close to what I want;
the best (but still not quite right) is this
one. It needs work to support other crypt
schemes and to extract salt properly.
Perhaps I will give in and learn how to configure PAM…
Update: At the suggestion of the kind people on #cyrus, I’ve
decided instead to set up a separate account on my email server for
gmail to log into, and simply forward any mail that’s delivered there
(by accident, presumably) to my main account.
Network programming wasn’t so hard to get a basic socket listener up
and running.
Streams are horribly broken though. No clean separation of encoding
from binary I/O. Swank vs raw Slime streams differ, which led to some
interesting issues when experimenting with my I/O code
interactively. flexi-streams help take away some of the pain, but it’s
still pretty awful.
The loop macro is neat. Feels like comprehensions. Comprehensions are
pleasant in Erlang and Haskell, and I remember enjoying SRFI-42, but
Racket misses them1.
The lack of proper tail recursion is HORRIBLE. Forces some very nasty
code. E.g. the tagbody thing.
That everything-is-an-object is GREAT. I wish Racket had this.
Racket’s match is awesome. Totally awesome. Common-lisp pattern
matching is in PARLOUS state. There’s nothing remotely close to as
nice as Racket’s match. There is no common standard.
Exceptions on end-of-stream simplify the programming of network
layers.
2011-10-19 08:06:10 tonyg
The multiple-values implementation is kind of nice, especially in
those places where it enables subtle design touches like using floor
to get both quotient and remainder at once. That particular example
appeals to me also for its mnemonicity: using truncate instead of
floor gives you the other variant2.
It seems clear to me today that I had
been overlooking Racket’s for, for/list, for/hash
etc. I’d still
like to see something more general that abstracts away from the
variations in these, though. ↩
RabbitMQ includes a suite of functional tests, as well as unit tests
for each of its components. It’s not immediately obvious how to run
the functional test suite, but fortunately, it’s straightforward.
You will need:
a Java JDK
Ant
Erlang
git
and I guess a Unix machine of some description; I don’t imagine running these tests on Windows will work well
Running the complete test suite takes about six minutes on my
eight-core, 32GB i7 Linux box. Most of this time is spent sleeping, to
ensure particular interleavings required by particular test cases.
Clone the repositories
Build and start the server in one window
Run the tests in another window
The tests are written in Java and come along with the RabbitMQ Java
client source code.
Be sure to wait until the server has fully initialised itself before
starting ant.
Things should tick along nicely for a few minutes at this point. From
time to time, you’ll see the server print a new banner to its console:
the tests restart the server occasionally as part of their normal
operation.
Check the results
The tests leave their output in rabbitmq-java-client/build/TEST-*
files. Each test suite (FunctionalTests, ClientTests,
ServerTests, HATests) produces both a plain-text and an XML file
summarising the results of the run.
I have parts of my home directory stored in git. In particular, I have
my .gnupg/pubring.gpg and .gnupg/trustdb.gpg files in git. This is
fine so long as absolutely no branching ever takes place.
My question to you, O lazyweb, is: does a tool exist for merging
pubring.gpg etc. in a sensible fashion?
If the answer is “no”, then does a tool exist for exploding the
contents of pubring.gpg etc into a directory full of tiny text files
which are amenable to standard 3-way-merge?
Back in March, I (crudely)
measured
the performance of js-nacl
and js-scrypt running in
the Browser.
Since then, emscripten has improved its code generation
significantly, so I’ve rerun my tests. I’ve also thrown in node.js for
comparison.1
The takeaway is: asm.js makes a big
difference to NaCl and scrypt performance - on Firefox, that
is. Firefox is between 2× and 8× faster.2 Other browsers
have benefited from the improvements, but not as much.
The setup:
Firefox/23.0
Chrome/28.0.1500.95
Safari/534.59.8
node/v0.10.15
Macbook Air late 2010 (3,1), 1.6 GHz Core 2 Duo, 4 GB RAM, OS X 10.6.8
I’m running emscripten at git revision b1eaf55e
(sources),
of 8 August 2013.
(The benchmarks I ran in March were run with rev 4e09482e
(sources)
of 16 Jan 2013.)
What has changed since the last measurements?
Emscripten’s support for asm.js code generation is much better, and I
am also now able to turn on -O2 optimization without things
breaking.
On the minus side, the previous builds included a severe memory leak
(!) because by default Emscripten includes a stub malloc() and
free() implementation that never releases any allocated memory. The
current builds include dlmalloc(), and so no longer leak memory, but
run ever so slightly slower by comparison to using the stub allocator.
Safari seems to have severe problems with the current builds. I’m
unsure where the bug lies (probably emscripten?), but many calls to
crypto_box/crypto_box_open and scrypt() yield incorrect
results. There are missing entries in the charts below because of
this. (No sense in measuring something that isn’t correct.)
Since the previous round, Firefox has gained support for
window.crypto.getRandomValues. Hooray!
Hashing (short) strings/bytes with SHA-512
Firefox handily dominates here, making the others look roughly equivalent.
These are operations whose runtime is dominated by the computation of
a Curve25519 operation. In three of the
four cases, the operation is used to compute a Diffie-Hellman shared
key from a public key and a secret key; in the remaining case
(crypto_box_keypair_from_seed) it is used to compute a public key
from a secret key. Firefox again dominates, but Chrome is not far
behind.
This is one of the areas where Safari yields incorrect results,
leading to a missing data point. I’m not yet sure what the cause is.
This is a thin wrapper over window.crypto.getRandomValues, or the
node.js equivalent, and so has not benefited from the emscripten
improvements. I’m including it just to give a feel for how fast
randomness-generation is.
Safari wins hands-down here. I wonder how good the generated
randomness is?
These are Salsa20/Poly1305 authenticated encryptions using a
precomputed shared key. Broadly speaking, boxing was quicker than
unboxing. Firefox again dominates.
This another of the areas where Safari yields incorrect results. I’m
not yet sure why.
These operations compute an elliptic-curve operation, but use the
result to produce a digital signature instead of an
authenticated/encrypted box. Signature generation is much faster than
signature validation here. As for the other elliptic-curve-heavy
operations, Firefox is fastest, but Chrome is not far behind.
Here, Safari not only underperforms significantly but computes
incorrect results. As above, I’m not sure why.
Firefox is about twice as fast as previously at producing
scrypt()-derived keys. Both Firefox and Chrome are usably fast.
(Approximate speedups since January: Firefox improved by around 2×; the others were roughly unchanged. Chart.)
Conclusions
scrypt is still slow. Safari has problems with this code, or vice
versa. Precompute Diffie-Hellman shared keys where you can.
Emscripten + asm.js = very cool!
Of course, it’s a bit silly to include node.js here,
since it can simply link against
libsodium and get
native-speed, optimized routines. Perhaps a Firefox extension
could include a native XPCOM component offering a similar speed
boost. ↩
The benefit is not quite as much as I
claimed
it was based on eyeballing the numbers. ↩