Mendeley encrypts your data so you can't access it

Recently, the Mendeley bibliography-management and paper-archive tool started encrypting users’ databases of bibliographic data.

Each such database includes annotations and is the product of a user’s careful, hard manual work.

Previously, users were able to access their valuable data using standard tools. Now, that access has been removed.

I have used Mendeley for almost a decade, now, and I value my database and my data very highly.

Therefore, I looked into what exactly Mendeley has done to prevent me accessing my data, and figured out an awkward manual technique for rescuing enough of it to let me switch to a new bibliography management system.

Background

The reasons for the change are unclear, but it’s possible that this user-hostile encryption was put into the product in retaliation for a competing product, Zotero, adding support for importing records from a Mendeley database. Zotero discusses the problem here, stating that

“Mendeley 1.19 and later have begun encrypting the local database, making it unreadable by Zotero and other standard database tools.”

and the project has also tweeted that

“The latest version of Mendeley prevents you from getting all of your data out of the app.”

Certainly, the old technique for accessing the database no longer works:

$ sqlite3 tonygarnockjones@gmail.com@www.mendeley.com.sqlite
SQLite version 3.23.1 2018-04-10 17:39:29
Enter ".help" for usage hints.
sqlite> .tables
Error: file is not a database
sqlite>

The new encryption definitely prevents me from getting all of my data from the database I have so carefully constructed.

(Update: Mendeley appears to be claiming that they are required by the GDPR to encrypt local user files! This is a bizarre claim, both to me and to many, many other people.)

The encryption Mendeley is using

Mendeley is using the SQLite Encryption Extension (“SEE”) with a hidden key.

The SEE library is closed-source and very proprietary. Its API is documented, but its on-disk structures are not (publicly) documented, and the source code is not publicly available. Applications using SEE are required to make it impossible to access SEE functionality from outside the application.

For a while, I had hoped that they might have used the open-source sqlcipher library. Unfortunately, they did not. Even more unfortunately, while tooling exists for sqlcipher, and while SEE and sqlcipher are API-compatible, the resulting files are neither binary- nor tooling-compatible.

It is not clear where the key material is coming from—whether the key is per-database, or whether it is a hard-coded key that is part of the Mendeley application binary, or whether it is a key that is retrieved from the Mendeley online API.

The question is ultimately moot, given the closedness of the SEE format and library. Even if I had the key, there is no tooling for making use of it in any other context than the Mendeley application itself.

How to rescue your data

Here’s the technique I used to rescue my data for long enough to allow a Zotero import to work.

Unfortunately, it’s not easy. It works only on Linux, and relies on being able to run gdb to call an internal SQLite SEE routine, sqlite3_rekey_v2.

In the instructions below, modify file paths etc. to line up with your own Mendeley and Unix usernames etc.

  1. Quit Mendeley. You don’t want it running while you’re fiddling with its database.

  2. BACK UP YOUR DATABASE. You will want to put things back the way they were after you’re done so you can use Mendeley again. THE REST OF THIS PROCEDURE MODIFIES THE DATABASE FILE ON DISK in ways that I do not know whether the Mendeley application can handle.

    cd ~/.local/share/data/Mendeley\ Ltd./Mendeley\ Desktop/
    cp tonygarnockjones@gmail.com@www.mendeley.com.sqlite ~/backup-encrypted.sqlite
    
  3. Start Mendeley under the control of gdb.

    mendeleydesktop --debug
    
  4. Add a breakpoint that captures the moment a SQLite database is opened.

    (gdb) b sqlite3_open_v2
    
  5. Start the program.

    (gdb) run
    
  6. The program will stop at the breakpoint several times. Keep continuing the program until the string pointed to by $rdi names the file you backed up in the step above.

    Thread 1 "mendeleydesktop" hit Breakpoint 1, 0x000000000101b1b0 in sqlite3_open_v2 ()
    (gdb) x/s $rdi
    0x1dca928:	"/home/tonyg/.local/share/data/Mendeley Ltd./Mendeley Desktop/Settings.sqlite"
    (gdb) c
    Continuing.
    
    Thread 1 "mendeleydesktop" hit Breakpoint 1, 0x000000000101b1b0 in sqlite3_open_v2 ()
    (gdb) x/s $rdi
    0x1dcb318:	"/home/tonyg/.local/share/data/Mendeley Ltd./Mendeley Desktop/Settings.sqlite"
    (gdb) c
    

    (… repeats a few times …)

    Thread 1 "mendeleydesktop" hit Breakpoint 1, 0x000000000101b1b0 in sqlite3_open_v2 ()
    (gdb) x/s $rdi
    0x25f1818:	"/home/tonyg/.local/share/data/Mendeley Ltd./Mendeley Desktop/tonygarnockjones@gmail.com@www.mendeley.com.sqlite"
    
  7. Now, set a breakpoint for the moment the key is supplied to SEE. We don’t care about the key itself (for reasons discussed above), but we do care to find the moment just after sqlite3_key has returned.

    (gdb) b sqlite3_key
    Breakpoint 2 at 0x101b2c0
    (gdb) c
    Continuing.
    
    Thread 1 "mendeleydesktop" hit Breakpoint 2, 0x000000000101b2c0 in sqlite3_key ()
    (gdb) info registers 
    rax            0x7fffffffc6b0	140737488340656
    rbx            0x25f0590	39781776
    rcx            0x7fffea9a0c40	140737129352256
    rdx            0x20	32
    rsi            0x260fd68	39910760
    rdi            0x25ef4e8	39777512
    rbp            0x7fffffffc730	0x7fffffffc730
    rsp            0x7fffffffc688	0x7fffffffc688
    r8             0xc1	193
    r9             0x7fffea9a0cc0	140737129352384
    r10            0x0	0
    r11            0x1	1
    r12            0x7fffffffc6b0	140737488340656
    r13            0x7fffffffc6a0	140737488340640
    r14            0x7fffffffc790	140737488340880
    r15            0x7fffffffc790	140737488340880
    rip            0x101b2c0	0x101b2c0 <sqlite3_key>
    eflags         0x202	[ IF ]
    cs             0x33	51
    ss             0x2b	43
    ds             0x0	0
    es             0x0	0
    fs             0x0	0
    gs             0x0	0
    
  8. Copy down the value of $rdi from the info registers output. It is the pointer to the open SQLite database handle. Then, finish execution of sqlite3_key.

    (gdb) fin
    Run till exit from #0  0x000000000101b2c0 in sqlite3_key ()
    0x0000000000f94e54 in SqliteDatabase::openInternal(QString const&, SqlDatabaseKey*) ()
    
  9. Use gdb’s ability to call C functions to rekey the database to the null key, thereby decrypting it in-place and allowing Zotero import to do its work.

    Use the value for $rdi you noted down in the previous step as the first argument to sqlite3_rekey_v2, and zero for the other three arguments.

    (gdb) p sqlite3_rekey_v2(0x25ef4e8, 0, 0, 0)
    $1 = 0
    
  10. If you see $1 = 0 from the rekey command, all is well, and you may now use Zotero to import your Mendeley database.

    While this is happening, leave gdb running and don’t touch it! DO NOT QUIT GDB OR RUN MENDELEY while the import is proceeding. Who knows what might happen to your carefully decrypted database if you do!

    In fact, before you start Zotero, you might like to copy your decrypted database to somewhere safe, so you don’t have to do this again:

    cd ~/.local/share/data/Mendeley\ Ltd./Mendeley\ Desktop/
    cp tonygarnockjones@gmail.com@www.mendeley.com.sqlite ~/backup-decrypted.sqlite
    
  11. Once the import is complete, quit gdb and terminate the associated partially-initialised Mendeley process.

    (gdb) quit
    A debugging session is active.
    
        Inferior 1 [process 32750] will be killed.
    
    Quit anyway? (y or n) y
    [322:322:0100/000000.491674:ERROR:broker_posix.cc(43)] Invalid node channel message
    
  12. Finally, restore your backed-up copy of the encrypted database, so that Mendeley will continue to run OK.

    cd ~/.local/share/data/Mendeley\ Ltd./Mendeley\ Desktop/
    cp ~/backup-encrypted.sqlite tonygarnockjones@gmail.com@www.mendeley.com.sqlite
    

You can now, if you like, in addition to using your new Zotero database as normal, access the raw contents of your decrypted Mendeley database using the standard SQLite tools, like you could in previous Mendeley versions:

$ sqlite3 ~/backup-decrypted.sqlite 
SQLite version 3.23.1 2018-04-10 17:39:29
Enter ".help" for usage hints.
sqlite> .tables
CanonicalDocuments       DocumentZotero           NotDuplicates          
DataCleaner              Documents                Profiles               
DocumentCanonicalIds     EventAttributes          RemoteDocumentNotes    
DocumentContributors     EventLog                 RemoteDocuments        
DocumentDetailsBase      FileHighlightRects       RemoteFileHighlights   
DocumentFields           FileHighlights           RemoteFileNotes        
DocumentFiles            FileNotes                RemoteFolders          
DocumentFolders          FileReferenceCountsView  Resources              
DocumentFoldersBase      FileViewStates           RunsSinceLastCleanup   
DocumentKeywords         Files                    SchemaVersion          
DocumentNotes            Folders                  Settings               
DocumentReferences       Groups                   Stats                  
DocumentTags             HtmlLocalStorage         SyncTokens             
DocumentUrls             ImportHistory            ZoteroLastSync         
DocumentVersion          LastReadStates         
sqlite> 

Conclusion

No user should have to do this to access their data. I’m lucky I have the skills to do it at all.

I’m sad that Mendeley violated my trust this way, but glad I have an exit strategy now.

Squeak Smalltalk TiledMaps package

Today I released a thing I’ve been hacking on called TiledMaps.

http://squeaksource.com/TiledMaps.html

It’s a package for Squeak Smalltalk. It can load and cache static, prerendered map tiles from a variety of sources including OpenStreetMaps, Bing Maps, and so on.

It includes a geocoder query service which maps free-text queries to regions of the planet’s surface. The service can be backed by the Nominatim service (associated with OpenStreetMaps), the Bing geocoder, the Google geocoder, and so on.

Selection of tilesets is independent of selection of geocoder, so you can mix and match.

The package includes a “slippy map” morph called TiledMapMorph, that allows interaction with a map using the mouse. It includes a few hooks for EToys, too, so EToys scripting of the map is possible.

To install it in your Squeak 6.0alpha image, first update to the latest trunk version:

MCMcmUpdater updateFromServer

Then load the TiledMaps package from SqueakSource:

(Installer repository: 'http://squeaksource.com/TiledMaps')
    install: 'ConfigurationOfTiledMaps'

Once it has loaded, open a new TiledMapMorph with

TiledMapMorph new openInWorld

I’ve recorded a short (< 10 min) demo video showing the package in action:

PS. All the Bing services require an API key. You can get one of your own from https://www.bingmapsportal.com/. Included in the package are a few other tile sources and geocoders that need API keys as well - you’ll need to check the websites and terms & conditions for each service you want to use.

PPS. In particular, Google’s terms & conditions explicitly forbid downloading map tiles without going through their Javascript API. The package honours this restriction.

Lying to TCP makes it a best-effort streaming protocol

In this All Things Linguistic post Gretchen and Lauren introduce a model dialogue:

G: “Would you like some coffee?”
L: “Coffee would keep me awake.”

This might mean Lauren wants coffee—or that she doesn’t want coffee! Her reply on its own is ambiguous. It must be interpreted in a wider context. The post digs into the dialogue more deeply.

In linguistics, the Cooperative Principle can be used to set up a frame within which utterances like these can be interpreted.

Network protocols are just the same. Utterances—packets sent along the wire—must be interpreted using a similar assumption of cooperation. And, just as in human conversation, an utterance can have a deeper, more ambiguous meaning than it seems on its face.

Acknowledgements in TCP

In TCP, the “ack” field contains a sequence number that is supposed to be used by a receiving peer to signal the sending peer that bytes up-to-but-not-including that sequence number have been safely received.

If it’s used that way, it makes TCP a reliable, in-order protocol.

Lying to the sending TCP

But imagine a receiver that didn’t very much care about reliability beyond that offered by the underlying transport. For example, a VOIP receiver, where delayed or lost voice packets are useless, and in fact harmful if carried over TCP because they delay packets travelling behind them, causing the whole conversation to get more and more delayed.

That receiver could lie. It could use the “ack” sequence number field to indicate which bytes it no longer cared about.

In effect, it would ignore lost packets, and use the “ack” field to tell the sender not to bother retransmitting them.1

That decision would make TCP a best-effort, in-order protocol.

Is it really a lie?

Seen one way, this abuse of the “ack” field isn’t a lie. In both cases, the receiver doesn’t care about the bytes anymore: on the one hand, because they’ve been received successfully; on the other, because they’re irrelevant now.

It’s similar to the ambiguity about the coffee in the example above.

The surface meaning is clear. The deeper meaning depends on knowledge about the wider context that isn’t carried in the utterance itself.

We can assume cooperation without assuming motivation

The statement “Don’t send me bytes numbered lower than X” can be interpreted either as “I have received all bytes numbered lower than X” or “I no longer care about bytes numbered lower than X”. Only context can tell them apart.

.

.


PS. My dissertation proposes Syndicate, a design for new programming language features for concurrency. Syndicate incorporates ideas about conversation and cooperativity like those discussed here. (See, in particular, chapter 2 of the dissertation.) It’s still early days, but I’ve put up a resource page at http://syndicate-lang.org/tonyg-dissertation/ that has the dissertation itself, a video of the defense talk I gave, and a few other resources.

  1. Of course, I’m ignoring fiddly details like segment boundaries, interactions between this strategy and flow- and congestion-control, and so on. It’d take a lot of work to really do this! 

PhD Dissertation: Conversational Concurrency

I’ve put up a webpage, http://syndicate-lang.org/tonyg-dissertation/, which gathers together a number of resources related to my recently-completed PhD project:

  • The dissertation itself, in both PDF and online HTML;
  • A video recording of my dissertation defense talk;
  • The slides I used for my talk;
  • A source code snapshot of the Syndicate implementations and examples; and
  • The proof scripts accompanying the dissertation.

Syndicate is a programming language design taking a new approach to communication and coordination in a concurrent setting.

It can be seen as a hybrid between Actor-style message-passing and tuplespace-style shared-blackboard communication, with some characteristics of each plus a few unique ones of its own.

Click through for the abstract from the dissertation.

Improving Twitter with a user stylesheet

In this post, I’ll explain how to install the Stylish browser extension and use it to customise the appearance of Twitter with a user stylesheet that:

  • hides “promoted tweets”
  • hides the “trends for you” panel
  • hides “liked” tweets (when you don’t follow the tweeter concerned)

Recently, I tweeted about the Stylish user stylesheet I use to make Twitter bearable again. A few people have asked me for more detail on how to set this up.

Follow the instructions below according to the browser you use; I know how to do this for Firefox and Chrome.

You might need to reload the Twitter page after you’re done to see the effect. You can tell it’s working if the “Trends for you” panel on the left of Twitter is no longer present.

Instructions for Firefox users

  • Install the Stylish addon for Firefox: Stylish for Firefox

  • Once installation is complete, visit your addons page by opening a new tab or window and entering about:addons in the URL bar.

  • Click on the “User Styles” tab in the column on the left.

  • Click the “Write New Style” button.

  • Give it a name. Mine’s called “Hide twitter trends + promoted tweets + likes-as-retweets”.

  • Fill in the CSS rules you want. Here are mine, which you can cut-and-paste:

@-moz-document url-prefix("https://twitter.com/") {
  .Trends, .promoted-tweet, .pixel-promoted-tweet {display:none;}
  #doc.route-home #timeline div[data-you-follow="false"] {display:none;}
  #doc.route-home #timeline div[data-retweeter],
  #doc.route-home #timeline div.my-tweet,
  #doc.route-home #timeline div.conversation-tweet {display:inherit;}
}
  • Click “Save”.

  • Enjoy!

Instructions for Chrome users

  • Install the Stylish extension for Chrome: Stylish for Chrome

  • Once installation is complete, click on the Stylish icon in the toolbar, next to the URL bar on the right.

  • Click on the three dots near the top right of the popup that appears.

  • Click on “Create New Style” in the menu that appears.

  • Give it a name. Mine’s called “Hide twitter trends + promoted tweets + likes-as-retweets”.

  • Set it so that it “Applies to URLs on the domain” twitter.com.

  • Fill in the CSS rules you want. Here are mine, which you can cut-and-paste; note that these are the same as for Firefox, only without the @-moz-document wrapper, which is taken care of by the previous step in Chrome:

.Trends, .promoted-tweet, .pixel-promoted-tweet {display:none;}
#doc.route-home #timeline div[data-you-follow="false"] {display:none;}
#doc.route-home #timeline div[data-retweeter],
#doc.route-home #timeline div.my-tweet,
#doc.route-home #timeline div.conversation-tweet {display:inherit;}
  • Click “Save”.

  • Enjoy!

Mobile Batch Photo Conversion on Android

We take a lot of photos on a Sony Xperia Z3 running Cyanogenmod. Unfortunately, the camera support for that particular model introduces pincushion distortion on every photo it takes:

Wonky Fixed

Animation of the correction of the previous images Notice the curved edges of the picture frames in the first image. The second image has been corrected. The animation shows both images superimposed, so you can see the effect of the distortion more clearly.

We haven’t been able to find an app for automatically correcting the input from the camera as it comes in, so we’ve cobbled together a solution based on Dropbox. It’s crude but effective!

Overview of our solution

Schematic of workflow

The solution is to use a cloud server linked to Dropbox to process images placed in a special Dropbox folder.

On the phone, I choose the pictures I want to send through the repair pipeline, and “share” them to the Dropbox app in the correct folder.

Step-by-step, here’s how it works:

  1. Take a photo on the phone.
  2. Copy the image to a special folder in Dropbox.
  3. Dropbox copies it over to the server, which scans the folder periodically.
  4. A little python program fixes the files and puts them into a different Dropbox folder.

After this process, the repaired images are available via the Dropbox app on the phone.

Details

I use djb’s daemontools to run the little program I wrote that scans Dropbox for JPGs. That program calls out to ImageMagick’s convert program to repair the barrel distortion, and then writes the results back to Dropbox and deletes the input file.

We found the right parameters to use, convert input.jpg -distort barrel '0.0132 -0.07765 0.14683' output.jpg, from this XDA-Developers forum page.

I’ve put the source code for the python program on Github:

https://github.com/tonyg/fix-xperia-z3-barrel-distortion

History of Actors

Yesterday, I presented the history of actors to Christos Dimoulas’ History of Programming Languages class.

Here are the written-out talk notes I prepared beforehand. There is an annotated bibliography at the end.


Introduction

Today I’m going to talk about the actor model. I’ll first put the model in context, and then show three different styles of actor language, including two that aim to be realistic programming systems.

I’m going to draw on a few papers:

  • 2016 survey by Joeri De Koster, Tom Van Cutsem, and Wolfgang De Meuter, for taxonomy and common terminology (SPLASH AGERE 2016)

  • 1990 overview by Gul Agha, who has contributed hugely to the study of actors (Comm. ACM 1990)

  • 1997 operational semantics for actors by Gul Agha, Ian Mason, Scott Smith, and Carolyn Talcott.

  • brief mention of some of the content of the original 1973 paper by Carl Hewitt, Peter Bishop, and Richard Steiger. (IJCAI 1973)

  • Joe Armstrong’s 2003 dissertation on Erlang.

  • 2005 paper on the language E by Mark Miller, Dean Tribble, and Jonathan Shapiro. (TGC 2005)

I’ve put an annotated bibliography at the end of this file.

Actors in context: Approaches to concurrent programming

One way of classifying approaches is along a spectrum of private vs. shared state.

  • Shared memory, threads, and locking: very limited private state, almost all shared
  • Tuple spaces and my own research, Syndicate: some shared, some private and isolated
  • Actors: almost all private and isolated, just enough shared to do routing

Pure functional “data parallelism” doesn’t fit on this chart - it lacks shared mutable state entirely.

The actor model

The actor model, then is

  • asynchronous message passing between entities (actors)
    • with guaranteed delivery
    • addressing of messages by actor identity
  • a thing called the “ISOLATED TURN PRINCIPLE”
    • no shared mutable state; strong encapsulation; no global mutable state
    • no interleaving; process one message at a time; serializes state access
    • liveness; no blocking

In the model, each actor performs “turns” one after the other: a turn is taking a waiting message, interpreting it, and deciding on both a state update for the actor and a collection of actions to take in response, perhaps sending messages or creating new actors.

A turn is a sequential, atomic block of computation that happens in response to an incoming message.

What the actor model buys you:

  • modular reasoning about state in the overall system, and modular reasoning about local state change within an actor, because state is private, and access is serialized; only have to consider interleavings of messages

  • no “deadlocks” or data races (though you can get “datalock” and other global non-progress in some circumstances, from logical inconsistency and otherwise)

  • flexible, powerful capability-based security

  • failure isolation and fault tolerance

It’s worth remarking that actor-like concepts have sprung up several times independently. Hewitt and many others invented and developed actors in the 1970s, but there are two occasions where actors seem to have been independently reinvented, as far as I know.

One is work on a capability-based operating system, KeyKOS, in the 1980s which involved a design very much like Hewitt’s actors, feeding into research which led ultimately to the language E.

The other is work on highly fault-tolerant designs for telephone switches, also in the 1980s, which culminated in the language Erlang.

Both languages are clearly actor languages, and in both cases, apparently the people involved were unaware of Hewitt’s actor model at the time.

Terminology

There are two kinds of things in the actor model: messages, which are data sent across some medium of communication, and actors, which are stateful entities that can only affect each other by sending messages back and forth.

Messages are completely immutable data, passed by copy, which may contain references to other actors.

Each actor has

  • private state; analogous to instance variables
  • an interface; which messages it can respond to

Together, the private state and the interface make up the actor’s behaviour, a key term in the actor literature.

In addition, each actor has

  • a mailbox; inbox, message queue
  • an address; denotes the mailbox

Within this framework, there has been quite a bit of variety in how the model appears as a concrete programming language.

De Koster et al. classify actor languages into

  • The Classic Actor Model (create, send, become)
  • Active Objects (OO with a thread per object; copying of passive data between objects)
  • Processes (raw Erlang; receive, spawn, send)
  • Communicating Event-Loops (no passive data; E; near/far refs; eventual refs; batching)

We see instances of the actor model all around us. The internet’s IP network is one example: hosts are the actors, and datagrams the messages. The web can be seen as another: a URL denotes an actor, and an HTTP request a message. Seen in a certain light, even Unix processes are actor-like (when they’re single threaded, if they only use fds and not shm). It can be used as a structuring principle for a system architecture even in languages like C or C++ that have no hope of enforcing any of the invariants of the model.

Rest of the talk

For the rest of the talk, I’m going to cover the classic actor model using Agha’s presentations as a guide; then I’ll compare it to E, a communicating event-loop actor language, and to Erlang, a process actor language.

Classic Actor Model

The original 1973 actor paper by Hewitt, Bishop and Steiger in the International Joint Conference on Artificial Intelligence, is incredibly far out!

It’s a position paper that lays out a broad and colourful research vision. It’s packed with amazing ideas.

The heart of it is that Actors are proposed as a universal programming language formalism ideally suited to building artificial intelligence.

The goal really was A.I., and actors and programming languages were a means to that end.

It makes these claims that actors bring great benefits in a huge range of areas:

  • foundations of semantics
  • logic
  • knowledge-based programming
  • intentions (software contracts)
  • study of expressiveness of programming languages
  • teaching of computation
  • extensible, modular programming
  • privacy and protection
  • synchronization constructs
  • resource management
  • structured programming
  • computer architecture

(It’s amazingly “full-stack”: computer architecture!?)

In each of these areas, you can see what they were going for. In some, actors have definitely been useful; in others, the results have been much more modest.

In the mid-to-late 70s, Hewitt and his students Irene Greif, Henry Baker, and Will Clinger developed a lot of the basic theory of the actor model, inspired originally by SIMULA and Smalltalk-71. Irene Greif developed the first operational semantics for it as her dissertation work and Will Clinger developed a denotational semantics for actors.

In the late 70s through the 80s and beyond, Gul Agha made huge contributions to the actor theory. His dissertation was published as a book on actors in 1986 and has been very influential. He separated the actor model from its A.I. roots and started treating it as a more general programming model. In particular, in his 1990 Comm. ACM paper, he describes it as a foundation for CONCURRENT OBJECT-ORIENTED PROGRAMMING.

Agha’s formulation is based around the three core operations of the classic actor model:

  • create: constructs a new actor from a template and some parameters
  • send: delivers a message, asynchronously, to a named actor
  • become: within an actor, replaces its behaviour (its state & interface)

The classic actor model is a UNIFORM ACTOR MODEL, that is, everything is an actor. Compare to uniform object models, where everything is an object. By the mid-90s, that very strict uniformity had fallen out of favour and people often worked with two-layer languages, where you might have a functional core language, or an object-oriented core language, or an imperative core language with the actor model part being added in to the base language.

I’m going to give a simplified, somewhat informal semantics based on his 1997 work with Mason, Smith and Talcott. I’m going to drop a lot of details that aren’t relevant here so this really will be simplified.

e  :=  λx.e  |  e e  |  x  | (e,e) | l | ... atoms, if, bool, primitive operations ...
    |  create e
    |  send e e
    |  become e

l  labels, PIDs; we'll use them like symbols here

and we imagine convenience syntax

e1;e2              to stand for   (λdummy.e2) e1
let x = e1 in e2   to stand for   (λx.e2) e1

match e with p -> e, p -> e, ...
    to stand for matching implemented with if, predicates, etc.

We forbid programs from containing literal process IDs.

v  :=  λx.e  |  (v,v)  |  l

R  :=  []  |  R e  |  v R  |  (R,e)  |  (v,R)
    |  create R
    |  send R e  |  send v R
    |  become R

Configurations are a pair of a set of actors and a multiset of messages:

m  :=  l <-- v

A  :=  (v)l  |  [e]l

C  :=  < A ... | m ... >

The normal lambda-calculus-like reductions apply, like beta:

    < ... [R[(λx.e) v]]l ... | ... >
--> < ... [R[e{v/x}  ]]l ... | ... >

Plus some new interesting ones that are actor specific:

    < ... (v)l ...    | ... l <-- v' ... >
--> < ... [v v']l ... | ...          ... >

    < ... [R[create v]]l       ... | ... >
--> < ... [R[l'      ]]l (v)l' ... | ... >   where l' fresh

    < ... [R[send l' v]]l ... | ... >
--> < ... [R[l'       ]]l ... | ... l' <-- v >

    < ... [R[become v]]l       ... | ... >
--> < ... [R[nil     ]]l' (v)l ... | ... >   where l' fresh

Whole programs e are started with

< [e]a | >

where a is an arbitrary label.

Here’s an example - a mutable cell.

Cell = λcontents . λmessage . match message with
                                (get, k) --> become (Cell contents);
                                             send k contents
                                (put, v) --> become (Cell v)

Notice that when it gets a get message, it first performs a become in order to quickly return to ready state to handle more messages. The remainder of the code then runs alongside the ready actor. Actions after a become can’t directly affect the state of the actor anymore, so even though we have what looks like multiple concurrent executions of the actor, there’s no sharing, and so access to the state is still serialized, as needed for the isolated turn principle.

< [let c = create (Cell 0) in
   send c (get, create (λv . send c (put, v + 1)))]a | >

< [let c = l1 in
   send c (get, create (λv . send c (put, v + 1)))]a
  (Cell 0)l1 | >

< [send l1 (get, create (λv . send l1 (put, v + 1)))]a
  (Cell 0)l1 | >

< [send l1 (get, l2)]a
  (Cell 0)l1
  (λv . send l1 (put, v + 1))l2 | >

< [l1]a
  (Cell 0)l1
  (λv . send l1 (put, v + 1))l2 | l1 <-- (get, l2) >

< [l1]a
  [Cell 0 (get, l2)]l1
  (λv . send l1 (put, v + 1))l2 | >

< [l1]a
  [become (Cell 0); send l2 0]l1
  (λv . send l1 (put, v + 1))l2 | >

< [l1]a
  [send l2 0]l3
  (Cell 0)l1
  (λv . send l1 (put, v + 1))l2 | >

< [l1]a
  [l2]l3
  (Cell 0)l1
  (λv . send l1 (put, v + 1))l2 | l2 <-- 0 >

< [l1]a
  [l2]l3
  (Cell 0)l1
  [send l1 (put, 0 + 1)]l2 | >

< [l1]a
  [l2]l3
  (Cell 0)l1
  [l1]l2 | l1 <-- (put, 1) >

< [l1]a
  [l2]l3
  [Cell 0 (put, 1)]l1
  [l1]l2 | >

< [l1]a
  [l2]l3
  [become (Cell 1)]l1
  [l1]l2 | >

< [l1]a
  [l2]l3
  [nil]l4
  (Cell 1)l1
  [l1]l2 | >

(You could consider adding a garbage collection rule like

    < ... [v]l ... | ... >
--> < ...      ... | ... >

to discard the final value at the end of an activation.)

Because at this level all the continuations are explicit, you can encode patterns other than sequential control flow, such as fork-join.

For example, to start two long-running computations in parallel, and collect the answers in either order, multiplying them and sending the result to some actor k’, you could write

let k = create (λv1 . become (λv2 . send k' (v1 * v2))) in
send task1 (req1, k);
send task2 (req2, k)

Practically speaking, both Hewitt’s original actor language, PLASMA, and the language Agha uses for his examples in the 1990 paper, Rosette, have special syntax for ordinary RPC so the programmer needn’t manipulate continuations themselves.

So that covers the classic actor model. Create, send and become. Explicit use of actor addresses, and lots and lots of temporary actors for inter-actor RPC continuations.

Before I move on to Erlang: remember right at the beginning I told you the actor model was

  • asynchronous message passing, and
  • the isolated turn principle

?

The isolated turn principle requires liveness - you’re not allowed to block indefinitely while responding to a message!

But here, we can:

let d = create (λc . send c c) in
send d d

Compare this with

letrec beh = (λc . become beh; send c c) in
let d = create beh in
send d d

These are both degenerate cases, but in different ways: the first becomes inert very quickly and the actor d is never returned to an idle/ready state, while the second spins uselessly forever.

Other errors we could make would be to fail to send an expected reply to a continuation.

One thing the semantics here rules out is interaction with other actors before doing a become; there’s no way to have a waiting continuation perform the become, because by that time you’re in a different actor. In this way, it sticks to the very letter of the Isolated Turn Principle, by forbidding “blocking”, but there are other kinds of things that can go wrong to destroy progress.

Even if we require our behaviour functions to be total, we can still get global nontermination.

So saying that we “don’t have deadlock” with the actor model is very much oversimplified, even at the level of simple formal models, let alone when it comes to realistic programming systems.

In practice, programmers often need “blocking” calls out to other actors before making a state update; with the classic actor model, this can be done, but it is done with a complicated encoding.

Erlang: Actors for fault-tolerant systems

Erlang is an example of what De Koster et al. call a Process-based actor language.

It has its origins in telephony, where it has been used to build telephone switches with fabled “nine nines” of uptime. The research process that led to Erlang concentrated on high-availability, fault-tolerant software. The reasoning that led to such an actor-like system was, in a nutshell:

  • Programs have bugs. If part of the program crashes, it shouldn’t corrupt other parts. Hence strong isolation and shared-nothing message-passing.

  • If part of the program crashes, another part has to take up the slack, one way or another. So we need crash signalling so we can detect failures and take some action.

  • We can’t have all our eggs in one basket! If one machine fails at the hardware level, we need to have a nearby neighbour that can smoothly continue running. For that redundancy, we need distribution, which makes the shared-nothing message passing extra attractive as a unifying mechanism.

Erlang is a two-level system, with a functional language core equipped with imperative actions for asynchronous message send and spawning new processes, like Agha’s system.

The difference is that it lacks become, and instead has a construct called receive.

Erlang actors, called processes, are ultra lightweight threads that run sequentially from beginning to end as little functional programs. As it runs, no explicit temporary continuation actors are created: any time it uses receive, it simply blocks until a matching message appears.

After some initialization steps, these programs typically enter a message loop. For example, here’s a mutable cell:

mainloop(Contents) ->
  receive
    {get, K} -> K ! Contents,
                mainloop(Contents);
    {put, V} -> mainloop(V)
  end.

A client program might be

Cell = spawn(fun mainloop(0) end),
Cell ! {get, self()},
receive
  V -> ...
end.

Instead of using become, the program performs a tail call which returns to the receive statement as the last thing it does.

Because receive is a statement like any other, Erlang processes can use it to enter substates:

mainloop(Service) ->
  receive
    {req, K} -> Service ! {subreq, self()},
                receive
                  {subreply, Answer} -> K ! {reply, Answer},
                                        mainloop(Service)
                end
  end.

While the process is blocked on the inner receive, it only processes messages matching the patterns in that inner receive, and it isn’t until it does the tail call back to mainloop that it starts waiting for req messages again. In the meantime, non-matched messages queue up waiting to be received later.

This is called “selective receive” and it is difficult to reason about. It doesn’t quite violate the letter of the Isolated Turn Principle, but it comes close. (become can be used in a similar way.)

The goal underlying “selective receive”, namely changing the set of messages one responds to and temporarily ignoring others, is important for the way people think about actor systems, and a lot of research has been done on different ways of selectively enabling message handlers. See Agha’s 1990 paper for pointers toward this research.

One unique feature that Erlang brings to the table is crash signalling. The jargon is “links” and “monitors”. Processes can ask the system to make sure to send them a message if a monitored process exits. That way, they can perform RPC by

  • monitoring the server process
  • sending the request
  • if a reply message arrives, unmonitor the server process and continue
  • if an exit message arrives, the service has crashed; take some corrective action.

This general idea of being able to monitor the status of some other process was one of the seeds of my own research and my language Syndicate.

So while the classic actor model had create/send/become as primitives, Erlang has spawn/send/receive, and actors are processes rather than event-handler functions. The programmer still manipulates references to actors/processes directly, but there are far fewer explicit temporary actors created compared to the “classic model”; the ordinary continuations of Erlang’s functional fragment take on those duties.

E: actors for secure cooperation

The last language I want to show you, E, is an example of what De Koster et al. call a Communicating Event-Loop language.

E looks and feels much more like a traditional object-oriented language to the programmer than either of the variations we’ve seen so far.

def makeCell (var contents) {
    def getter {
        to get() { return contents }
    }
    def setter {
        to set(newContents) {
            contents := newContents
        }
    }
    return [getter, setter]
}

The mutable cell in E is interestingly different: It yields two values. One specifically for setting the cell, and one for getting it. E focuses on security and securability, and encourages the programmer to hand out objects that have the minimum possible authority needed to get the job done. Here, we can safely pass around references to getter without risking unauthorized update of the cell.

E uses the term “vat” to describe the concept closest to the traditional actor. A vat has not only a mailbox, like an actor, but also a call stack, and a heap of local objects. As we’ll see, E programmers don’t manipulate references to vats directly. Instead, they pass around references to objects in a vat’s heap.

This is interesting because in the actor model messages are addressed to a particular actor, but here we’re seemingly handing out references to something finer grained: to individual objects within an actor or vat.

This is the first sign that E, while it uses the basic create/send/become-style model at its core, doesn’t expose that model directly to the programmer. It layers a special E-specific protocol on top, and only lets the programmer use the features of that upper layer protocol.

There are two kinds of references available: near refs, which definitely point to objects in the local vat, and far refs, which may point to objects in a different vat, perhaps on another machine.

To go with these two kinds of refs, there are two kinds of method calls: immediate calls, and eventual calls.

receiver.method(arg, …)

receiver <- method(arg, …)

It is an error to use an immediate call on a ref to an object in a different vat, because it blocks during the current turn while the answer is computed. It’s OK to use eventual calls on any ref at all, though: it causes a message to be queued in the target vat (which might be our own), and a promise is immediately returned to the caller.

The promise starts off unresolved. Later, when the target vat has computed and sent a reply, the promise will become resolved. A nifty trick is that even an unresolved promise is useful: you can pipeline them. For example,

def r1 := x <- a()
def r2 := y <- b()
def r3 := r1 <- c(r2)

would block and perform multiple network round trips in a traditional simple RPC system; in E, there is a protocol layered on top of raw message sending that discusses promise creation, resolution and use. This protocol allows the system to send messages like

"Send a() to x, and name the resulting promise r1"
"Send b() to y, and name the resulting promise r2"
"When r1 is known, send c(r2) to it"

Crucial here is that the protocol, the language of discourse between actors, allows the expression of concepts including the notion of a send to happen at a future time, to a currently-unknown recipient.

The protocol and the E vat runtimes work together to make sure that messages get to where they need to go efficiently, even in the face of multiple layers of forwarding.

Each turn of an E vat involves taking one message off the message queue, and dispatching it to the local object it denotes. Immediate calls push stack frames on the stack as usual for object-oriented programming languages; eventual calls push messages onto the message queue. Execution continues until the stack is empty again, at which point the turn concludes and the next turn starts.

One interesting problem with using promises to represent cross-vat interactions is how to do control flow. Say we had

def r3 := r1 <- c(r2)  // from earlier
if (r3) {
  myvar := a();
} else {
  myvar := b();
}

By the time we need to make a decision which way to go, the promise r3 may not yet be resolved. E handles this by making it an error to depend immediately on the value of a promise for control flow; instead, the programmer uses a when expression to install a handler for the event of the resolution of the promise:

when (r3) -> {
  if (r3) {
    myvar := a();
  } else {
    myvar := b();
  }
}

The test of r3 and the call to a() or b() and the update to myvar are delayed until r3 has resolved to some value.

This looks like it violates the Isolated Turn Principle! It seems like we now have some kind of interleaving. But what’s going on under the covers is that promise resolution is done with an incoming message, queued as usual, and when that message’s turn comes round, the when clause will run.

Just like with classic actors and with Erlang, managing these multiple-stage, stateful interactions in response to some incoming message is generally difficult to do. It’s a question of finding a balance between the Isolated Turn Principle, and its commitment to availability, and encoding the necessary state transitions without risking inconsistency or deadlock.

Turning to failure signalling, E’s vats are not just the units of concurrency but also the units of partial failure. An uncaught exception within a vat destroys the whole vat and invalidates all remote references to objects within it.

While in Erlang, processes are directly notified of failures of other processes as a whole, E can be more fine-grained. In E, the programmer has a convenient value that represents the outcome of a transaction: the promise. When a vat fails, or a network problem arises, any promises depending on the vat or the network link are put into a special state: they become broken promises. Interacting with a broken promise causes the contained exception to be signalled; in this way, broken promises propagate failure along causal chains.

If we look under the covers, E seems to have a “classic style” model using create/send/become to manage and communicate between whole vats, but these operations aren’t exposed to the programmer. The programmer instead manipulates two-part “far references” which denote a vat along with an object local to that vat. Local objects are created frequently, like in regular object-oriented languages, but vats are created much less frequently; and each vat’s stack takes on duties performed in “classic” actor models by temporary actors.

Conclusion

I’ve presented three different types of actor language: the classic actor model, roughly as formulated by Agha et al.; the process actor model, represented by Erlang; and the communicating event-loop model, represented by E.

The three models take different approaches to reconciling the need to have structured local data within each actor in addition to the more coarse-grained structure relating actors to each other.

The classic model makes everything an actor, with local data largely deemphasised; Erlang offers a traditional functional programming model for handling local data; and E offers a smooth integration between an imperative local OO model and an asynchronous, promise-based remote OO model.


Annotated Bibliography

Early work on actors

  • 1973. Carl Hewitt, Peter Bishop, and Richard Steiger, “A universal modular ACTOR formalism for artificial intelligence,” in Proc. International Joint Conference on Artificial Intelligence

    This paper is a position paper from which we can understand the motivation and intentions of the research into actors; it lays out a very broad and colourful research vision that touches on a huge range of areas from computer architecture up through programming language design to teaching of computer science and approaches to artificial intelligence.

    The paper presents a uniform actor model; compare and contrast with uniform object models offered by (some) OO languages. The original application of the model is given as PLANNER-like AI languages.

    The paper claims benefits of using the actor model in a huge range of areas:

    • foundations of semantics
    • logic
    • knowledge-based programming
    • intentions (contracts)
    • study of expressiveness
    • teaching of computation
    • extensible, modular programming
    • privacy and protection
    • synchronization constructs
    • resource management
    • structured programming
    • computer architecture

    The paper sketches the idea of a contract (called an “intention”) for ensuring that invariants of actors (such as protocol conformance) are maintained; there seems to be a connection to modern work on “monitors” and Session Types. The authors write:

    The intention is the CONTRACT that the actor has with the outside world.

    Everything is super meta! Actors can have intentions! Intentions are actors! Intentions can have intentions! The paper presents the beginnings of a reflective metamodel for actors. Every actor has a scheduler and an “intention”, and may have monitors, a first-class environment, and a “banker”.

    The paper draws an explicit connection to capabilities (in the security sense); Mark S. Miller at http://erights.org/history/actors.html says of the Actor work that it included “prescient” statements about Actor semantics being “a basis for confinement, years before Norm Hardy and the Key Logic guys”, and remarks that “the Key Logic guys were unaware of Actors and the locality laws at the time, [but] were working from the same intuitions.”

    There are some great slogans scattered throughout, such as “Control flow and data flow are inseparable”, and “Global state considered harmful”.

    The paper does eventually turn to a more nuts-and-bolts description of a predecessor language to PLASMA, which is more fully described in Hewitt 1976.

    When it comes to formal reasoning about actor systems, the authors here define a partial order - PRECEDES - that captures some notion of causal connection. Later, the paper makes an excursion into epistemic modal reasoning.

    Aside: the paper discusses continuations; Reynolds 1993 has the concept of continuations as firmly part of the discourse by 1973, having been rediscovered in a few different contexts in 1970-71 after van Wijngaarden’s 1964 initial description of the idea. See J. C. Reynolds, “The discoveries of continuations,” LISP Symb. Comput., vol. 6, no. 3–4, pp. 233–247, 1993.

    In the “acknowledgements” section, we see:

    Alan Kay whose FLEX and SMALL TALK machines have influenced our work. Alan emphasized the crucial importance of using intentional definitions of data structures and of passing messages to them. This paper explores the consequences of generalizing the message mechanism of SMALL TALK and SIMULA-67; the port mechanism of Krutar, Balzer, and Mitchell; and the previous CALL statement of PLANNER-71 to a universal communications mechanism.

  • 1975. Irene Greif. PhD dissertation, MIT EECS.

    Specification language for actors; per Baker an “operational semantics”. Discusses “continuations”.

  • 1976. C. Hewitt, “Viewing Control Structures as Patterns of Passing Messages,” AIM-410

    AI focus; actors as “agents” in the AI sense; recursive decomposition: “each of the experts can be viewed as a society that can be further decomposed in the same way until the primitive actors of the system are reached.”

    We are investigating the nature of the communication mechanisms […] and the conventions of discourse

    More concretely, examines “how actor message passing can be used to understand control structures as patterns of passing messages”.

    […] there is no way to decompose an actor into its parts. An actor is defined by its behavior; not by its physical representation!

    Discusses PLASMA (“PLAnner-like System Modeled on Actors”), and gives a fairly detailed description of the language in the appendix. Develops “event diagrams”.

    Presents very Schemely factorial implementations in recursive and iterative (tail-recursive accumulator-passing) styles. During discussion of the iterative factorial implementation, Hewitt remarks that n is not closed over by the loop actor; it is “not an acquaintance” of loop. Is this the beginning of the line of thinking that led to Clinger’s “safe-for-space” work?

    Everything is an actor, but some of the actors are treated in an awfully structural way: the trees, for example, in section V.4 on Generators:

    (non-terminal:
      (non-terminal: (terminal: A) (terminal: B))
      (terminal: C))
    

    Things written with this keyword: notation look like structures. Their reflections are actors, but as structures, they are subject to pattern-matching; I am unsure how the duality here was thought of by the principals at the time, but see the remarks regarding “packages” in the appendix.

  • 1977. C. Hewitt and H. Baker, “Actors and Continuous Functionals,” MIT A.I. Memo 436A

    Some “laws” for communicating processes; “plausible restrictions on the histories of computations that are physically realizable.” Inspired by physical intuition, discusses the history of a computation in terms of a partial order of events, rather than a sequence.

    The actor model is a formalization of these ideas [of Simula/Smalltalk/CLU-like active data processing messages] that is independent of any particular programming language.

    Instances of Simula and Smalltalk classes and CLU clusters are actors”, but they are non-concurrent. The actor model is broader, including concurrent message-passing behaviour.

    Laws about (essentially) lexical scope. Laws about histories (finitude of activation chains; total ordering of messages inbound at an actor; etc.), including four different ordering relations. “Laws of locality” are what Miller was referring to on that erights.org page I mentioned above; very close to the capability laws governing confinement.

    Steps toward denotational semantics of actors.

  • 1977. Russell Atkinson and Carl Hewitt, “Synchronization in actor systems,” Proc. 4th ACM SIGACT-SIGPLAN Symp. Princ. Program. Lang., pp. 267–280.

    Introduces the concept of “serializers”, a “generalization and improvement of the monitor mechanism of Brinch-Hansen and Hoare”. Builds on Greif’s work.

  • 1981. Will Clinger’s PhD dissertation. MIT

Actors as Concurrent Object-Oriented Programming

  • 1986. Gul Agha’s book/dissertation.

  • 1990. G. Agha, “Concurrent Object-Oriented Programming,” Commun. ACM, vol. 33, no. 9, pp. 125–141

    Agha’s work recast the early “classic actor model” work in terms of concurrent object-oriented programming. Here, he defines actors as “self-contained, interactive, independent components of a computing system that communicate by asynchronous message passing”, and gives the basic actor primitives as create, send to, and become. Examples are given in the actor language Rosette.

    This paper gives an overview and summary of many of the important facets of research on actors that had been done at the time, including brief discussion of: nondeterminism and fairness; patterns of coordination beyond simple request/reply such as transactions; visualization, monitoring and debugging; resource management in cases of extremely high levels of potential concurrency; and reflection.

    The customer-passing style supported by actors is the concurrent generalization of continuation-passing style supported in sequential languages such as Scheme. In case of sequential systems, the object must have completed processing a communication before it can process another communication. By contrast, in concurrent systems it is possible to process the next communication as soon as the replacement behavior for an object is known.

    Note that the sequential-seeming portions of the language are defined in terms of asynchronous message passing and construction of explicit continuation actors.

  • 1997. G. A. Agha, I. A. Mason, S. F. Smith, and C. L. Talcott, “A Foundation for Actor Computation,” J. Funct. Program., vol. 7, no. 1, pp. 1–72

    Long paper that carefully and fully develops an operational semantics for a concrete actor language based on lambda-calculus. Discusses various equivalences and laws. An excellent starting point if you’re looking to build on a modern approach to operational semantics for actors.

Erlang: Actors from requirements for fault-tolerance / high-availability

  • 2003. J. Armstrong, “Making reliable distributed systems in the presence of software errors,” Royal Institute of Technology, Stockholm

    A good overview of Erlang: the language, its design intent, and the underlying philosophy. Includes an evaluation of the language design.

E: Actors from requirements for secure interaction

  • 2005. M. S. Miller, E. D. Tribble, and J. Shapiro, “Concurrency Among Strangers,” in Proc. Int. Symp. on Trustworthy Global Computing, pp. 195–229.

    As I summarised this paper for a seminar class on distributed systems: “The authors present E, a language designed to help programmers manage coordination of concurrent activities in a setting of distributed, mutually-suspicious objects. The design features of E allow programmers to take control over concerns relevant to distributed systems, without immediately losing the benefits of ordinary OO programming.”

    E is a canonical example of the “communicating event loops” approach to Actor languages, per the taxonomy of the survey paper listed below. It combines message-passing and isolation in an interesting way with ordinary object-oriented programming, giving a two-level language structure that has an OO flavour.

    The paper does a great job of explaining the difficulties that arise when writing concurrent programs in traditional models, thereby motivating the actor model in general and the features of E in particular as a way of making the programmer’s job easier.

Taxonomy of actors

  • 2016. J. De Koster, T. Van Cutsem, and W. De Meuter, “43 Years of Actors: A Taxonomy of Actor Models and Their Key Properties,” Software Languages Lab, Vrije Universiteit Brussel, VUB-SOFT-TR-16-11

    A very recent survey paper that offers a taxonomy for classifying actor-style languages. At its broadest, actor languages are placed in one of four groups:

    • The Classic Actor Model (create, send, become)
    • Active Objects (OO with a thread per object; copying of passive data between objects)
    • Processes (raw Erlang; receive, spawn, send)
    • Communicating Event-Loops (E; near and far references; eventual references; batching)

    Different kinds of “futures” or “promises” also appear in many of these variations in order to integrate asynchronous message reception with otherwise-sequential programming.

A Universal Modular ACTOR Formalism for Artificial Intelligence

I have scanned the original IJCAI 1973 proceedings print version of

Carl Hewitt, Peter Bishop, and Richard Steiger, “A universal modular ACTOR formalism for artificial intelligence,” in Proc. International Joint Conference on Artificial Intelligence, 1973, pp. 235–245.

The scanned PDF is available to download for anyone interested in seeing the paper as it looked in print.


As part of my research, I’ve been digging through the literature. I couldn’t find a good PDF version of the above, which is one of the original papers on Actors. All I could find were links to this OCRed version, which doesn’t show me the actual paper, just an ugly rendition of the OCRed text. Even the IJCAI official online archive only has the OCRed version.

So I went looking for the original print IJCAI 1973 proceedings in the university library. Lo and behold, it was there!

So I have scanned it and uploaded it here.

Patching gnome-flashback 3.20 to work with GNOME 3.21

I’m running Debian testing, with gnome-flashback. At present, it has installed gnome-flashback 3.20.2-1 alongside gnome-settings-daemon 3.21.90-2.

Symptoms

By default, there are some problems:

  • The “Displays” section of the config tool says only “Could not get screen information”. This prevents GUI access to a lot of functionality, including arrangement of multiple monitors.

  • The brightness adjustment keys do not work.

The root of the problem is that the DBus interface org.gnome.Mutter.DisplayConfig changed between GNOME 3.20 and GNOME 3.21; the GetResources method produces an additional result in 3.21 that is not present in 3.20’s version of the same interface.1

If you see messages similar to the following from stderr of gnome-settings-daemon at startup, you are suffering from this problem too:

(gnome-settings-daemon:7149): power-plugin-WARNING **: Could not create GnomeRRScreen: Method 'GetResources' returned type '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuud)ii)', but expected '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuudu)ii)'
(gnome-settings-daemon:7149): wacom-plugin-WARNING **: Failed to create GnomeRRScreen: Method 'GetResources' returned type '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuud)ii)', but expected '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuudu)ii)'
(gnome-settings-daemon:7149): color-plugin-WARNING **: failed to get screens: Method 'GetResources' returned type '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuud)ii)', but expected '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuudu)ii)'
(gnome-settings-daemon:7149): common-plugin-WARNING **: Failed to construct RR screen: Method 'GetResources' returned type '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuud)ii)', but expected '(ua(uxiiiiiuaua{sv})a(uxiausauaua{sv})a(uxuudu)ii)'

Fixing the problem

To fix the problem, I patched and rebuilt gnome-flashback with these commands:

sudo apt-get build-dep gnome-flashback
apt-get source gnome-flashback
wget "https://eighty-twenty.org/files/gnome-flashback-hack-20160918.patch"
patch -p0 < gnome-flashback-hack-20160918.patch
(cd gnome-flashback-3.20.2; fakeroot ./debian/rules binary)

After these steps, you should have a new gnome-flashback binary, gnome-flashback-3.20.2/gnome-flashback/gnome-flashback. You can now move the existing one out of the way and install it:

sudo mv /usr/bin/gnome-flashback /usr/bin/gnome-flashback-AS-INSTALLED
sudo cp gnome-flashback-3.20.2/gnome-flashback/gnome-flashback /usr/bin
sudo chown root:root /usr/bin/gnome-flashback

Now restart both gnome-flashback and gnome-settings-daemon. (You should arrange for gnome-flashback to be started after gnome-settings-daemon.)

You should no longer have the stderr output complaining about the RR screen DBus signatures, and both the “Displays” section of the config tool and the brightness keys should work.

  1. It’s a shame that there’s no apparent protocol versioning in the GNOME system, or at least none that applies here. If DBus had a serialization language a little more like protobufs, it’d be possible to smoothly add fields in a backwards-compatible way. 

Extensible Double Dispatch for Racket

Both Racket’s object system and its (separate!) generic interface system offer single-dispatch object-oriented programming: the choice of method body to execute depends on the type of just one of the arguments given to the method, usually the first one.

In some cases, the first thing that a method will do is to decide what to do next based on the type of a second argument. This is called double dispatch, and it has a long history in object-oriented programming languages—at least as far back as the original Smalltalk.

As an example, consider implementing addition for classes representing numbers. A different method body would be needed for each pair of representations of numbers.

I stumbled across the need for something like this when implementing Operational Transformation (OT) for Racket. The macro operation-transformer in that code base is almost the double-dispatch macro from this post; the difference is that for operational transformation, the method concerned yields two results, and if the arguments are switched on the way in, they must be switched on the way out.

Basic Double Dispatch

Here’s a basic double-dispatch macro:

(define-syntax-rule (double-dispatch op (arg1 arg2) [pred? body ...] ...)
  (cond
    [(pred? arg2) body ...] ...
    [else (error 'op "Unimplemented for ~v and ~v" arg1 arg2)]))

It assumes that it will be used in a method where dispatch has already been done on arg1, and that the next step is to inspect arg2. It applies the pred?s in sequence until one of them answers true, and then evaluates the corresponding body. If none of the pred?s hold, it signals an error.

It’s often convenient to use it inside a class definition or generic interface implementation with the following macros, which simply define op to delegate immediately to double-dispatch. The first is to be used with Racket’s object system, where the first argument is bound implicitly to this and where predicates should use Racket’s is-a? function. The second is to be used with Racket’s generic interface system, where both arguments are explicitly specified and predicates are more general.

(define-syntax-rule (define/public/double-dispatch (op arg2) [class body ...] ...)
  (define/public (op arg2)
    (double-dispatch (lambda (a b) (send a op b)) (this arg2)
      [(lambda (v) (is-a? v class)) body ...] ...)))

(define-syntax-rule (define/double-dispatch (op arg1 arg2) [pred? body ...] ...)
  (define (op arg1 arg2)
    (double-dispatch op (arg1 arg2) [pred? body ...] ...)))

Commutative Double Dispatch

For commutative operations like addition, it’s common to see the same code appear for adding an A to a B as for adding a B to an A.

The next macro automatically flips its arguments and tries again to see if B’s method has support for A, if it can’t find support for B within A’s method. That way, code for combining B with A need only be supplied in one place. It uses a parameter to keep track of whether it’s currently trying out a flipped pair of arguments.

(define trying-flipped? (make-parameter #f))

(define-syntax-rule (commutative-double-dispatch op (arg1 arg2) [pred? body ...] ...)
  (cond
    [(pred? arg2) (parameterize ((trying-flipped? #f)) body ...)] ...
    [(trying-flipped?) (error 'op "Unimplemented for ~v and ~v" arg2 arg1)]
    [else (parameterize ((trying-flipped? #t)) (op arg2 arg1))]))

Writing a simple wrapper works well for using commutative-double-dispatch in a class definition:

(define-syntax-rule (define/public/commutative-double-dispatch (op arg2) [class body ...] ...)
  (define/public (op arg2)
    (commutative-double-dispatch (lambda (a b) (send a op b)) (this arg2)
      [(lambda (v) (is-a? v class)) body ...] ...)))

but a wrapper for use with the generic interface system needs to take care not to accidentally shadow the outer dispatch mechanism. This macro uses define/generic to make op* an alias of op that always does a full dispatch on its arguments:

(define-syntax-rule (define/commutative-double-dispatch (op arg1 arg2) [pred? body ...] ...)
  (begin (define/generic op* op)
         (define (op arg1 arg2)
           (commutative-double-dispatch op* (arg1 arg2) [pred? body ...] ...))))

Examples

Let’s see the system in operation! First, using Racket’s object system, and then using Racket’s generic interfaces.

Example Scenario

We will first define two types of value foo and bar, each responding to a single doubly-dispatched method, operator which produces results according to the following table:

     | foo | bar |
-----|-----|-----|
 foo | foo | bar |
 bar | bar | foo |
-----|-----|-----|

Then, we’ll extend the system to include a third type, zot, which yields a zot when combined with any of the three types.

Double Dispatch with Classes

(define foo%
  (class object%
    (super-new)
    (define/public/commutative-double-dispatch (operator other)
      [foo% (new foo%)]
      [bar% (new bar%)])))

(define bar%
  (class object%
    (super-new)
    (define/public/commutative-double-dispatch (operator other)
      [bar% (new foo%)])))

Some tests show that this is doing what we expect. Notice that we get the right result when the first operand is a bar% and the second a foo%, even though bar% only explicitly specified the case for when the second operand is also a bar%. This shows the automatic argument-flipping in operation.

(module+ test
  (require rackunit)
  (check-true (is-a? (send (new foo%) operator (new foo%)) foo%))
  (check-true (is-a? (send (new foo%) operator (new bar%)) bar%))
  (check-true (is-a? (send (new bar%) operator (new foo%)) bar%))
  (check-true (is-a? (send (new bar%) operator (new bar%)) foo%)))

Double Dispatch with Generic Interfaces

(define-generics operand
  (operator operand other))

(struct foo ()
  #:methods gen:operand
  [(define/commutative-double-dispatch (operator this other)
     [foo? (foo)]
     [bar? (bar)])])

(struct bar ()
  #:methods gen:operand
  [(define/commutative-double-dispatch (operator this other)
     [bar? (foo)])])

The tests show the same argument-flipping behavior as for the object system above.

(module+ test
  (require rackunit)
  (check-true (foo? (operator (foo) (foo))))
  (check-true (bar? (operator (foo) (bar))))
  (check-true (bar? (operator (bar) (foo))))
  (check-true (foo? (operator (bar) (bar)))))

Extending The Example

First, we implement and test class zot%

(define zot%
  (class object%
    (super-new)
    (define/public/commutative-double-dispatch (operator other)
      [foo% (new zot%)]
      [bar% (new zot%)]
      [zot% (new zot%)])))

(module+ test
  (require rackunit)
  (check-true (is-a? (send (new foo%) operator (new zot%)) zot%))
  (check-true (is-a? (send (new bar%) operator (new zot%)) zot%))
  (check-true (is-a? (send (new zot%) operator (new foo%)) zot%))
  (check-true (is-a? (send (new zot%) operator (new bar%)) zot%))
  (check-true (is-a? (send (new zot%) operator (new zot%)) zot%)))

… and then implement and test struct zot.

(struct zot ()
  #:methods gen:operand
  [(define/commutative-double-dispatch (operator this other)
     [foo? (zot)]
     [bar? (zot)]
     [zot? (zot)])])

(module+ test
  (require rackunit)
  (check-true (zot? (operator (foo) (zot))))
  (check-true (zot? (operator (bar) (zot))))
  (check-true (zot? (operator (zot) (foo))))
  (check-true (zot? (operator (zot) (bar))))
  (check-true (zot? (operator (zot) (zot)))))

Conclusion

Double dispatch is a useful addition to the object-oriented programmer’s toolkit, and can be straightforwardly added to both of Racket’s object systems using its macro facility.


This post was written as executable, literate Racket. You can download the program from here.