The Origins of Flow Control and Acknowledgements

Erik Meijer recently explained the duality of Enumeration and Observation.1 Subscribing to a feed of items is dual to retrieving an enumerator and going through the items.

Reading his article, I was surprised to realise that there’s a corresponding symmetry between credit-based flow control and message acknowledgement!

Generalising the idea of polling (enumeration) yields flow control, and generalising subscription (observation) yields acknowledgement.

Flow Control and Acknowledgement

When designing network protocols such as TCP or AMQP, two problems arise that are both crucial to the robustness and performance of the system being designed:

  • How do we avoid flooding a receiver with messages?
  • How do we know when a receiver successfully received a message?

The solutions to these problems are flow control and acknowledgements, respectively. The rest of this article will show how these solutions arise from plain old enumeration and observation. I’ll briefly comment on their manifestations in TCP and AMQP toward the end.

Enumeration and Observation

These message sequence diagrams illustrate the use of .NET’s IEnumerator<T> and IObserver<T> as described in Erik’s article. In each diagram, the application is working with some collection containing two values, 11 and 22.

It’s hard to see the symmetry by looking at the diagrams above. The design of .NET’s IEnumerator<T> API obscures it by separating MoveNext from Current.

To really see the symmetry properly, we’ll move to a language where we can use algebra to simplify the problem: SML.

SML types for enumeration and observation

The SML type corresponding to IEnumerator<T> might be written as the following (ignoring exceptions, since they aren’t really relevant to the problem at hand): 2

type 'a enumerator = (unit -> bool) * (unit -> 'a)

This is a pair of methods, (MoveNext, Current), corresponding directly to the methods in .NET’s IEnumerator<T> interface. Since it doesn’t make sense to be able to call Current if MoveNext has returned false, by applying a little domain-knowledge and a little algebra we see that this definition is equivalent to the simpler:

type 'a enumerator = unit -> 'a option

We have combined the two methods together, eliminating the possibility of making a mistake. A NONE result from the single combined method indicates that there are no more values available, and a SOME v result carries the next available value v if there is one.

Translating Erik’s almost-mechanically-derived IObserver<T> interface to SML, again ignoring exceptions, and following the 'a option rewrite we used on enumerator, we arrive at:

type 'a observer = 'a option -> unit

The duality is crystal clear at this point. An enumerator has type unit -> 'a option, while an observer has type 'a option -> unit. Enumerators act as senders of information, and observers act as receivers.
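The symmetry is easy to reproduce in any language with first-class functions. Here’s a minimal Python sketch (Python standing in for SML; all the names are mine, and None plays the role of NONE): an enumerator is a nullary function the receiver calls to pull values, and an observer is a unary function the sender calls to push them.

```python
# An enumerator is a nullary function: call it to pull the next value,
# with None playing the role of SML's NONE (end of stream).
def make_enumerator(items):
    it = iter(items)
    return lambda: next(it, None)

# An observer is the mirror image: a unary function that values (and the
# final None) are pushed into.
def make_observer(log):
    return lambda v: log.append(v)

# Enumeration: the receiver drives, making an information-free call and
# getting a value (or end-of-stream) back.
enum = make_enumerator([11, 22])
pulled = []
while (v := enum()) is not None:
    pulled.append(v)

# Observation: the sender drives, pushing values (then None) at the
# receiver; each call's information-free return plays the role of ().
seen = []
obs = make_observer(seen)
for v in [11, 22, None]:
    obs(v)
```

The same information crosses the boundary in both cases; all that differs is which side makes the calls.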

Here’s a message sequence diagram, equivalent to the ones above, but using the SML types instead of the .NET interfaces:

Now the symmetry really stands out.

On the left, we can see that when enumerating, an information-free () message indicates that the receiver is ready for the next piece of information from the sender. This is (credit-based) flow control! 3

On the right, we can see that when observing, an information-free () message indicates that the receiver has finished processing the most-recently-sent piece of information from the sender. This is acknowledgement! 4

Enumerators work by the receiver issuing a unit of flow-control credit, and the sender then sending a single value. Observers work by the sender sending a single value, and the receiver then issuing a single acknowledgement message.

Pipelining credit and acknowledgement messages

Let’s take this a step closer to a real network protocol by introducing pipelining of messages. It’s important to be able to reduce latency by decoupling the issuing of a request from the processing of that request. Here we drop the notational distinction between requests and responses, drawing just one kind of arrow for both kinds of message.

All we’ve done here is moved all the “requests” up to the top of each sequence, and all the “responses” to the bottom.

Remember that on the left, () indicates an issue of credit, so that the sender is permitted to send an additional piece of information to the receiver, while on the right, () indicates acceptance of a preceding piece of information by the receiver.

Batching pipelined credit and acknowledgement messages

We can save some bandwidth by applying some of the algebraic tricks we used earlier, looking for a way of compressing those sequences of () messages into informationally equivalent batch credit-issues and acknowledgement messages.

Recall that our enumerator function takes a single () and returns an 'a option, which means either the next available datum or the end of stream:

type 'a enumerator = unit -> 'a option

Switching from SML to abstract type algebra for a moment, let’s model pipelined requests and responses as lists. The algebraic types involved (don’t forget this introduction to type algebra) are

RequestPipeline = 1 + (1     * RequestPipeline)
ReplyPipeline   = 1 + (Value * ReplyPipeline)

These are equivalent to

RequestPipeline = 1 + 1*1   + 1*1*1       + 1*1*1*1           + ...
ReplyPipeline   = 1 + Value + Value*Value + Value*Value*Value + ...

Now, there’s nothing sensible we can do to compress the reply pipeline in the absence of special knowledge about the sequence of values being transferred, but the request pipeline looks very repetitive. It is, in fact, a unary representation of natural numbers. We can use a binary representation instead to save enormous amounts of bandwidth.

Analogous reasoning lets us compress the stream of acknowledgements leaving an observer, so let’s use natural numbers to model our batched credit-issue messages and acknowledgement messages:

Notice how in each case we’ve replaced three separate information-free messages with a single message carrying a natural number, 3. Pipelining and batching credit and acknowledgement messages improves both latency of delivery and usage of available bandwidth.
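To make the batched scheme concrete, here’s a small Python sketch of my own (not modelled on any real protocol’s API): the receiver grants credit as a natural number, the sender transmits at most that many values, and the receiver acknowledges a whole batch with a single count.

```python
from collections import deque

class Sender:
    def __init__(self, values):
        self.pending = deque(values)
        self.credit = 0      # values we may still send (batched credit)
        self.unacked = 0     # values sent but not yet acknowledged

    def grant(self, n):      # receiver -> sender: one message worth n ()s
        self.credit += n

    def pump(self, receiver):
        while self.credit > 0 and self.pending:
            self.credit -= 1
            self.unacked += 1
            receiver.deliver(self.pending.popleft())

    def ack(self, n):        # receiver -> sender: one message worth n ()s
        self.unacked -= n

class Receiver:
    def __init__(self):
        self.got = []
    def deliver(self, v):
        self.got.append(v)

s = Sender([11, 22, 33, 44])
r = Receiver()
s.grant(3)          # a single "3" standing in for three () credit messages
s.pump(r)           # sends 11, 22, 33; 44 must wait for further credit
s.ack(len(r.got))   # a single "3" standing in for three () acknowledgements
```

Three information-free ()s become the single natural number 3 in each direction, and the fourth value stays queued until more credit arrives.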

Analyzing TCP and AMQP

TCP has an elegant flow- and retransmission-control mechanism, the sliding window. Each byte in a TCP stream is separately numbered so that it can be referred to unambiguously. Each TCP segment contains an ACK field that tells a sender which bytes a receiver has seen so far: in other words, the ACK field acknowledges received data. Each segment also contains a Window Size field that tells the sender how much more data it is currently permitted to send: in other words, the amount of credit the receiver is willing to issue.
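Seen this way, the sender’s remaining credit falls straight out of the two fields: credit = (ACK + window) − next-byte-to-send. A hedged sketch of that arithmetic (sequence numbers only, ignoring 32-bit wraparound; the function name is mine):

```python
def remaining_credit(snd_nxt, last_ack, window):
    """Bytes the sender may still transmit.

    last_ack: highest byte acknowledged by the receiver (batched ack);
    window:   receiver's advertised window (batched credit);
    snd_nxt:  next sequence number the sender would use.
    Ignores 32-bit sequence-number wraparound for simplicity."""
    return (last_ack + window) - snd_nxt

# Receiver has acknowledged up to byte 1000 and advertised an 8192-byte
# window; the sender has already transmitted up to byte 3000.
credit = remaining_credit(snd_nxt=3000, last_ack=1000, window=8192)
```

Every segment carries both numbers, so every segment simultaneously acknowledges data and replenishes credit.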

Looking at things through the lens of the enumerator/observer duality, it’s clear that TCP makes use of both interfaces simultaneously, since it is both issuing flow credit and acknowledging data receipt. There is some generalisation of both interfaces waiting to be exposed! I hope to write more on this in a future article.

Another interesting aspect of TCP is the way that it numbers its FIN segments as part of the transmitted data stream, ensuring reliable 5 notification of stream termination. This corresponds to the acknowledgement of the final NONE in the sequence diagrams above. You might think that the acknowledgement of the end of a stream was optional, but if you omit it, the reliable-delivery characteristics of the protocol change quite substantially: you’re forced to either introduce an out-of-band reliable shutdown subprotocol, or to fairly drastically change the way timeouts are used to decide how long to retain the necessary state.

AMQP also includes something similar to credit-based flow control and message acknowledgements. Flow credit can be arranged by using a Basic.Qos command before transmission begins, and acknowledgement of received messages is done using Basic.Ack commands. However, the protocol combines replenishment of credit with acknowledgement of data: Basic.Ack has a dual role (pardon the pun). Ideally replenishment of flow credit would be done separately from acknowledgement of data, with a command for each task, mirroring the way TCP uses a separate header field for each function.

Besides AMQP’s core subscription system, centred around Basic.Consume, there is a less-commonly-used polling-based method of transferring messages, Basic.Get. Basic.Get is very close to being an instance of the simple SML 'a enumerator function type introduced above, and doesn’t include any way of pipelining or batching requests or responses.

Conclusions and future work

The duality between IEnumerator<T> and IObserver<T> extends into a network protocol setting. By looking at it in a network protocol context, we can see the natural emergence of credit-based flow control on the one hand, and acknowledgement on the other. With this model of a data transfer protocol in mind, we can examine existing protocols such as TCP and AMQP to see how they map to the operations we have identified.

There are one or two remaining loose ends I’m interested in following up. Firstly, what is the generalisation of both interfaces, so evident in both TCP and AMQP, like? Whatever it is, it could turn out to be a good candidate for a generalised model of a flow-controlled reliable information stream. Secondly, both IEnumerator<T> and IObserver<T> are inherently stateful approaches to data streaming. What do we see if we switch to pure-functional models of enumeration and observation? What are the analogous network protocol models like? I suspect that the answers here will lead in the direction of more RESTful models of message transmission such as RestMS and ReverseHTTP.

  1. Erik’s article is available as HTML and also as a PDF, which is the form I originally saw it in. Thanks to @hylomorphism for pointing me toward the article and thus kicking off this interesting train of thought. 

  2. SML code illustrating the use of these enumerator and observer types can be found here. 

  3. There are other kinds of flow control besides credit-based: XON/XOFF and RTS/CTS signalling will be familiar to those who have used serial links and modems. 

  4. Specifically, transfer of responsibility. The receiver of a message, by sending () to the sender of the message, indicates that it has taken responsibility for the message, and that the sender no longer needs to concern itself with it or attempt further redelivery. There are other kinds of acknowledgement scheme: NAKs, for example, are used in multicast applications to indicate a gap in a received data stream. 

  5. I should really put the word “reliable” in scare quotes… 

Announcers are Networks

Recently I’ve been using Announcements, a framework for a particular style of event pub/sub within a Smalltalk image. Squeak Smalltalk has a handful of different implementations of the same pub/sub pattern in the default image, and Announcements is a nicely designed generalisation of these.

But why would one want to have a particular implementation of such a common design pattern as pub/sub baked in to the image?

One way of thinking about it, that I’m becoming more and more convinced by, is this:

  • If Objects are abstract representations of computers, then
  • Announcers are abstract representations of networks.

The communication channel between two objects is normally a regular message send. Regular message sends provide Plain Old RPC (with Exceptions). Sometimes, however, you want one or more of the following things to happen:

  • a message to be delivered to more than one recipient
  • a message to be delivered to an arbitrary or unknown number of recipients
  • a message to be delivered asynchronously, or without reply
  • a message to be delivered across a real network to some remote object

Combinations of the above are useful and interesting too.

It’s in situations like these that having a first-class representation of the medium of communication between objects becomes important. And that’s just what Announcers are in the Announcement framework.

One of the too-many things I’m thinking about at the moment is how to generalise Announcers to take into consideration more than just the first two of the four points listed above. Designing and implementing the basic facility seems straightforward, but one runs fairly quickly into issues of concurrency and fault-tolerance.

The Weaknesses of Smalltalk are the Strengths of Erlang

Today I stumbled across “Guidelines for Choosing A Computer Language: Support For The Visionary Organization” by Patricia K. Lawlis, written back in 1997.

Appendix Q deals with Smalltalk. Here are the categories in which Smalltalk scores well:

  • Clarity of source code - Rating: 9
  • Complexity management (architecture support) - Rating: 6
  • Maintainability - Rating: 7
  • Object-oriented programming support - Rating: 10
  • Reusability - Rating: 8
  • Support for modern engineering methods 1 - Rating: 7

Here are the categories in which Smalltalk scores poorly:

  • Concurrency support - Rating: 2
  • Distributed system support - Rating: 0
  • Mixed language support - Rating: 3
  • Portability - Rating: 3
  • Real-time support - Rating: 0
  • Reliability - Rating: 3
  • Safety - Rating: 0
  • Standardization - Rating: 3

Those low scores are ripe for correction!

People often speak of Smalltalk as if it were just another programming language. It is that, of course: Smalltalk the language is pure-object-oriented, class-based, memory safe, reflective, and dynamically typed. However, the same name, Smalltalk, also denotes a complete operating system in a virtual machine on top of a traditional OS. Smalltalk’s operating-system nature is to blame for its poor performance in the categories shown above.

The (implied) criticisms of the low scores above are absolutely spot on:

  • Smalltalk’s support for concurrency (isolation of processes; management of processes; isolating failures; transactional behaviour; etc.) is very low-level and old-fashioned. Exactly what one would expect from an operating system designed in the early 80s.

  • Its support for distributed programming is also not great. (I have a hypothesis about why that is, that I’m researching at the moment.)

  • No isolation between processes makes mixed language support difficult.

  • Reliability and safety are nonexistent in an environment with no isolation boundaries.

  • Smalltalk is a heavily imperative language. Data structures are mutated internally all over the place, giving a straightforward shared-memory concurrency model.

So what can be done? Let’s look at another Application Operating System2, this one designed with fault-tolerance in mind: Erlang/OTP.

  • Erlang’s support for concurrency is very strong. Processes are, by default, isolated from one another, in that they share no mutable state, and a failure in one process cannot affect another process in an unstructured way.

  • Its support for distributed programming is also very strong. Processes communicate by shared-nothing message passing. Scaling from a non-distributed system to a distributed system starts with the simple substitution of a TCP socket for an in-memory message queue.

  • Isolation between processes makes the decision of which language to use to handle received messages within a process an essentially arbitrary one, though in practice Erlang doesn’t support anything other than its built-in functional fragment.

  • Reliability and safety follow from the strong isolation boundaries between system components and from the structuring principles embodied in the OTP libraries.

  • Erlang is a heavily functional language. Data structures are immutable, giving a straightforward shared-nothing message passing concurrency model.

Erlang has its problems, of course: it doesn’t support object-oriented programming other than either awkwardly (processes look a bit like objects) or in a trivial sense (function closures and objects are the same thing). As a result, generic programming (e.g. ignoring whether you have a list of numbers or an array of numbers) is poorly supported, syntactically unpleasant, and/or slow. It also enjoys none of the rich reflective support provided by the Smalltalk virtual machine, which makes the monitoring tools feel ad-hoc and not very general.

My instinct is that a system that supported a combination of Smalltalk’s smooth generic and reflective programming styles and Erlang’s ability to construct robust, concurrent and distributed systems, would make a great foundation for personal computing.

What can be done to Smalltalk’s operating-system nature to improve isolation of components within a running system? Mutability is a key concern, as is failure isolation. Can Smalltalk be refactored into a hybrid Smalltalk-Erlang system?

  1. Bear in mind that the document was written in 1997, so what “modern engineering methods” meant then is unlikely to line up well with what the term means now. 

  2. Joe Armstrong defines Erlang/OTP as an “Application Operating System” on page 9 of his PhD thesis

The 0MQ Transport Layer Specification

The 0MQ team has just released a draft specification of the 0MQ transport layer. It’s a sweet little spec—go take a look! The nice thing about it is that it concentrates on building a flexible transport layer without putting any constraints on the content, routing or meaning of the datagrams it carries.1

Another interesting aspect of it is that messages are structured as a list of segments (frames)—and that structure is exposed to the next layer up! Such a seemingly small decision has large, and favourable, consequences: the next layer up can use the segmented structure of messages to represent many interesting patterns:

  • a split between message envelope and message contents;
  • a split between message headers and message body;
  • a list of addresses to use in routing a message, like the old-school bang paths;
  • a stream of content being delivered in chunks as it becomes available; and so on.

In my opinion, the AMQP spec should have been split up into similarly small, flexible pieces as part of the work leading up to the current 1.0 drafts. In defence of the working group’s current approach, a monolithic spec can make it easier to build a coherent whole; but on the minus side, it can make it so difficult to gain experience with the system being designed that coherence remains out of reach. For instance, it is not now (and probably never will be) possible to write a one-line message sender using bash for AMQP.

  1. Well, that’s not quite true: it describes parts of the lowest levels of a few messaging patterns that the transport has been used for in addition to the transport itself; my opinion is that those pieces should be split out from spec:13 and placed in specifications of their own. 

Declarative Laziness in Smalltalk

Travis Griggs relates a story of being burned by the use of lazy initialization in Smalltalk. It’s an interesting problem, and I sympathise. The way I see it is that the underlying cause is too much reliance on mutable state, and/or not enough declarative specification of intent. This is a general problem in languages like Smalltalk and Java, nothing specific to particular pieces of code anyone writes, of course!

Some other languages and systems supporting lazy initialization (memoization) avoid the problem by detecting circular dependencies. As each initial-value-computing thunk is entered, a bit is set if it isn’t already, or an exception is raised if the bit is already set. Any kind of non-cyclic dependency graph ends up computing its result as expected, and cycles are detected and signalled.
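Here’s a sketch of that scheme in Python (my own illustration, not drawn from any particular system): each lazy cell sets a flag on entry to its thunk, and re-entering a cell whose flag is already set signals a cycle instead of recursing forever.

```python
class CycleError(Exception):
    pass

class Lazy:
    _UNSET = object()

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = Lazy._UNSET
        self._in_progress = False

    def force(self):
        if self._value is not Lazy._UNSET:
            return self._value          # already memoized
        if self._in_progress:
            # The bit is already set: our own initializer, somewhere up
            # the call chain, depends on the value we're computing.
            raise CycleError("circular lazy initialization")
        self._in_progress = True        # set the bit on entry
        self._value = self._thunk()
        self._in_progress = False
        return self._value

# A cyclic pair: each cell defined in terms of the other.
a = Lazy(lambda: b.force() + 1)
b = Lazy(lambda: a.force() - 2)
detected = False
try:
    a.force()
except CycleError:
    detected = True
```

Any acyclic dependency graph forces normally and memoizes its result; the cyclic pair above is caught on the second entry into a rather than looping.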

The trick is to declaratively let the system know what your intent is.

What would be a good Smalltalk-like way of letting the system know you mean to build a lazy initializer? A few options spring to mind:

  • one could use a pragma on methods acting the part of lazily-initializing getters;

  • one could mark the initializing closure specially somehow (foo ifNil: [...] lazyInitializer), where the effect of #lazyInitializer is to set up the circularity checks (initial experiments along these lines stalled when I realised I didn’t have enough understanding of the way BlockClosures work in Squeak);

  • one could set foo initially to some object representing a Promised value, which internally has support for the necessary circularity-checks;

and no doubt many other variations.

(Incidentally, I just tried Haskell, to see what it did with

let a = b + 1; b = a - 2 in a

and unfortunately it suffers from the same infinite recursion problem Travis’s code was suffering from: neither a value nor an error is produced.)

Prex 0.9.0 suffers from a serious bug in its exception handling code

In our Systems course here at Northeastern, we’re working with Prex as our experimental platform this semester. A recent assignment called for the implementation of some virtual-memory-like features, using Prex’s exceptions to deliver information about faulting memory accesses to the faulting process, which is thereby given an opportunity to arrange for some memory to back the accessed address.

The problem is in the x86 code implementing system call handling for the exception_return() system call. The logic for detecting that an exception_return() has occurred neglects to take into account the fact that returning from an exception replaces the active context. The fix is to check %eax before calling syscall_handler.

Without the patch below, %eax will be smashed in the context of the code that was interrupted by the exception.

I emailed this patch to the Prex list, but something about my message triggered Sourceforge’s spam detection software, or something equally tedious and annoying, and it hasn’t made its way through yet; so I’m posting this here in the hopes that those in need of it might find it.

Here’s the required change:

--- a/bsp/hal/x86/arch/locore.S
+++ b/bsp/hal/x86/arch/locore.S
@@ -313,11 +313,23 @@ ENTRY(syscall_entry)
	pushl	$(SYSCALL_INT)		/* Trap number */
	SAVE_ALL
	SETUP_SEG
-	call	syscall_handler
+	/* We check the saved value of eax here, before calling syscall_handler,
+	   because if it's zero (i.e. exception_return), we will completely
+	   obliterate the saved registers in context_restore() within
+	   exception_return() within syscall_handler(), which will make the value
+	   of 0x10(%esp) contain whatever random eax value happened to be around
+	   at the time the exception was delivered. For example, if the exn
+	   was a page fault, eax could be anything at all. If we check 0x10(%esp)
+	   after syscall_handler has returned, it will in the general case contain
+	   junk. */
	cmpl	$0, 0x10(%esp)		/* Skip setting eax if exception_return */
	je	1f
+	call	syscall_handler
	movl	%eax, 0x10(%esp)	/* Set return value to eax */
+	jmp	2f
 1:
+	call	syscall_handler
+2:
	call	exception_deliver	/* Check exception */
 syscall_ret:
	RESTORE_ALL

Make it work, make it right, make it fast

I recently, upon reading something somewhere that irked me, tweeted the old adage

1. Make it work 2. Make it right 3. Make it fast. So many projects start at 1 and skip to 3, or worse, start at 3… :-(

In reply, Conal Elliott tweeted

What does it mean to work but not be right?

which is a great question. If a program can be said to be “working”, what can that possibly mean except that it is in some sense a “right” program?1

That’s a pretty decent point. Software is a tool for getting stuff done, after all.

However, I’m not quite ready to abandon my idealism about these things, so let me have a stab at expressing myself more clearly: a “right” program follows the design implied by its own implementation. Implementing programs can lead to improved designs through deeper understanding of the underlying domains.

Now, programs can “work” just fine and yet not follow (or not expose) their own internal logic. It’s when programs are composed that their little warts and kinks start to add up quickly into serious distortions. Let’s revisit the adage:

  • Step 1: make it work;
  • Step 2: make it right;
  • Step 3: make it fast.

I have in my mind the image of two mirrors roughly facing each other (step 1): improving their alignment even slightly (step 2) can cause the tail of reflected images to grow dramatically longer. Polishing the mirrors (step 3) helps sometimes, but only if they’re in good-enough alignment to begin with.

I don’t care about step 2 for every piece of software. For example, most embedded systems software (microwave controllers, digital watches) is so throwaway and terminal that polishing it beyond a certain point isn’t worth it.2

Foundational software, though, is where a “make it right” step becomes crucial. Software upon which other software is constructed3 has to be made right because of the costs of correcting for design flaws downstream. The total effort involved in working around kinks and warts in a foundational artifact often exceeds the effort required to fix the foundations.4

There are countless examples where an almost-right product has gained wide influence without being “made right”: Unix, whose distortions have led to X-windows and monolithic walled-garden applications; Excel, whose distortions particularly with respect to naming and scope have led to horrible kludges and workarounds in any application beyond the trivially simple; AMQP, whose original model was simple and promising, but which is now moving in the direction of the irredeemably complicated; and on, and on.

In each case, lots (and lots) of effort has been spent on optimizing the current artifact, and because very little effort has been spent on learning more about the domain and revisiting the artifact’s design decisions, even more effort has been spent downstream on workarounds for each system’s design flaws.

Lots of these cases have been successful because the software involved has happened to line up well enough with the underlying domain that new kinds of useful work can be done: getting the foundations closer to being “right” has opened up previously-unimagined spaces for new applications. Getting the foundations “right” really does increase the power and the reach of a system.

  1. Er, on second read, perhaps Conal was being ironic, and saying that a non-right program in the sense of this post cannot be said to work at all! After all, without semantics, programs are meaningless… 

  2. Although even here there are lessons to be learned. An embedded system’s distortion with respect to its own internal logic, if examined carefully, can lead to less-distorted future iterations within a product line, perhaps, and the lessons may even spill over in the course of an engineer’s career into other embedded systems. 

  3. For instance, operating systems, language virtual machines, language designs themselves, important libraries, and networking protocols, etc. 

  4. Which makes it especially important to get network protocol designs right, since flaws and felicities are magnified so strongly by the network effects involved. 

The State of the Art in User Interfaces

This is a dialog that popped up just now from iTunes:

iTunes CD Lookup Results dialog

It has several noteworthy interactive features:

  1. It is not resizable.
  2. The two choices are visually identical.
  3. Copy-and-paste is disabled for the text of interest.

It also possesses a few less visible but no less interesting attributes:

  1. There is no undo once I click OK.
  2. iTunes remembers my choice1 for future insertions of the same CD.

I shall choose at random.

Moral: Current environments for interacting with computers are, putting it mildly, irretrievably broken. Our only hope lies in disruptive innovation.

  1. Fortunately, there is a method for causing iTunes to forget my choice and ask me the same question again: it is the selection Get Track Names in the Advanced menu. 

Rube Goldberg contraptions with RabbitMQ

I’ve just finished building and deploying this website. It uses jekyll to render the content, and the content author uses git to push the changes up to the hosting machine. From there, a nice little chain of programs arranges for the site to be rebuilt on the server and made live:

  • A git post-receive hook uses curl to HTTP POST an empty message into a RabbitMQ exchange via the RabbitHub plugin.

  • D. J. Bernstein’s daemontools supervises an instance of amqp-consume, which connects to a queue bound to the exchange the post-receive hook delivers into, and whenever a message is received, invokes a shell-script. The command-line for invoking amqp-consume is roughly

    amqp-consume \
        -s localhost \
        --username=... --password=... \
        -e exchangename \
        -A \
        /path/to/rebuild-website-script
  • The shell-script invoked for every message from RabbitMQ checks out a fresh copy of the website, compiles it, and deploys the resulting static HTML into the correct location on the file system for Apache to pick up.

  • I’m also monitoring the RabbitMQ exchange using the rabbitmq-xmpp plugin talking to my desktop XMPP client, Adium, so whenever anyone does a git push, I get a message appearing in my IM client from exchangename@my.rabbitmq.hostname letting me know a new version of the site has just gone live.

Trigonometry animation

A few days ago, I chanced across http://i.imgur.com/WKeVH.gif on twitter1. It occurred to me that you could use the same style of diagram to show all three of sin, cos and tan simultaneously. If you project downward at the same time as you project sideways, the two projections will be 90 degrees out of phase, just as sine and cosine are.

An hour or so’s hacking in Squeak resulted in the following. (Click here to hide the animation.)

Animation of three trig functions

The source code is available for the curious. (Automatic syntax-highlighting courtesy of github. If the previous link doesn’t work for some reason, a less-pretty but almost certainly available link is this one.)

  1. I haven’t been able to find any attribution for this diagram, I’m afraid. If it’s yours, please let me know!