My thinking about a next-generation Smalltalk-like system and language
has been shifting a bit over recent weeks.
To start with, I decided that the objects in the language would be
immutable: in order to replace a field value, an entirely new object
would be constructed, just like records in ML. Objects would no longer
have any identity, and object equivalence would be decided by
recursive comparison (a la Henry Baker’s egal predicate
[Baker93]). This immutability would extend even
to the object’s map (similar to its class) - adding a method to an
object results in a fresh object, with a fresh map.
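As a rough illustration only, here is a minimal Python sketch of immutable objects with functional update and structural equality. It assumes frozen dataclasses as a stand-in for the objects described above; `Point`, `Line` and `with_field` are hypothetical names chosen for the example, and the object's map is not modelled at all.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Point:
    x: int
    y: int


@dataclass(frozen=True)
class Line:
    start: Point
    end: Point


def with_field(obj, **changes):
    """Build a fresh object with some fields replaced; obj is left untouched."""
    return replace(obj, **changes)


p = Point(1, 2)
q = with_field(p, x=10)              # p still has x == 1
a = Line(Point(0, 0), q)
b = Line(Point(0, 0), Point(10, 2))  # built separately, but equal to a
```

Here `a == b` holds because the generated comparison recurses through the fields, which is roughly what an egal-style predicate does for immutable values. Python still lets you ask `a is b`, so the "no identity" part is a convention in this sketch rather than something the language enforces.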
Mutable state isn’t entirely absent, though - it’s just kept strictly
separate from other parts of the system, just as it is in ML. It would
be possible to construct mutable reference cells. Synchronisation,
communication and state access would be merged: when you retrieve a
value from a cell, the value is removed from the cell and handed to
the retrieving process. If no value is present, the receiving process
blocks until one is sent by another process. If a sending process
tries to place a value in a cell already occupied by a value, the
sending process dies with an exception. Cells are the locus of object
identity in the system, again as per [Baker93].
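The cell semantics described above can be sketched in Python with a condition variable. This is only an approximation under stated assumptions: threads stand in for processes, and the names `Cell`, `CellOccupied`, `put` and `take` are invented for the example.

```python
import threading


class CellOccupied(Exception):
    """Raised when a value is sent to a cell that already holds one."""


class Cell:
    def __init__(self):
        self._cond = threading.Condition()
        self._full = False
        self._value = None

    def put(self, value):
        """Place a value in the cell; fail if the cell is already occupied."""
        with self._cond:
            if self._full:
                raise CellOccupied("cell already holds a value")
            self._value = value
            self._full = True
            self._cond.notify()

    def take(self):
        """Remove and return the value, blocking until one is present."""
        with self._cond:
            while not self._full:
                self._cond.wait()
            value, self._value = self._value, None
            self._full = False
            return value


cell = Cell()
received = []
reader = threading.Thread(target=lambda: received.append(cell.take()))
reader.start()       # blocks inside take() until a value arrives
cell.put("hello")    # hands the value to the waiting reader
reader.join()
```

After the reader has taken the value the cell is empty again, so a second `put` succeeds, while two `put` calls in a row without an intervening `take` raise `CellOccupied`.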
Metaprogramming and reflection would be enabled via locations, which
group together a set of related processes into a location
tree. Locations are responsible for the semantics of message dispatch
and exception handling. They’re the basic unit of reflection, too - a
location can be reified, which pauses and reifies all contained
processes and sublocations. A reified location can be used for
debugging, for mobile code, for become:-like operations, and
many other things. A user can install user code at a new sublocation,
which allows refinement or replacement of the default message dispatch
behaviour in the style of [Malenfant92].
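One way to picture a sublocation refining dispatch is the following Python sketch. It is a loose illustration, not a design: the `Location` class, its `send` method and the `dispatcher` hook are all hypothetical, the root location simply borrows Python's own attribute lookup as its default dispatch, and reification of processes is left out entirely.

```python
class Location:
    def __init__(self, parent=None, dispatcher=None):
        self.parent = parent
        self.dispatcher = dispatcher   # optional user code refining dispatch
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def send(self, selector, *args):
        """Dispatch a message using this location's semantics."""
        if self.dispatcher is not None:
            # User code receives the default behaviour and may call or ignore it.
            return self.dispatcher(self._default, selector, *args)
        return self._default(selector, *args)

    def _default(self, selector, *args):
        if self.parent is not None:
            return self.parent.send(selector, *args)
        # Assumed root behaviour: Python method lookup on the first argument.
        return getattr(args[0], selector)(*args[1:])


trace = []


def tracing_dispatch(default, selector, *args):
    trace.append(selector)             # refinement: record, then delegate
    return default(selector, *args)


root = Location()
traced = Location(parent=root, dispatcher=tracing_dispatch)
plain_result = root.send("upper", "abc")
traced_result = traced.send("upper", "abc")
```

Only the send made at `traced` is recorded in `trace`; the send made at `root` uses the default behaviour unchanged. A dispatcher that never calls `default` would replace the behaviour rather than refine it.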
Code itself in the system is a distinct entity - the instruction
stream contained in a method is a different kind of thing from all of
the categories discussed so far. It’s the role of the location to
interpret the code stream.
The dynamic state of a computation is held at the metalevel as a
process. Processes correspond to the state of a particular
interpreter: the registers, the stack, the current continuation
etc. They’re only accessible by reflecting on a location.
To sum up, then:
- objects are immutable, and have no identity;
- cells are mutable, have identity, and are the means of
communication and synchronisation in the system;
- locations are metalevel constructs that serve as
interpreters for code, that specify message dispatch and
exception-handling algorithms, and that are the loci of reflection
in the system;
- code is a stream of instructions intended for
interpretation by locations;
- processes are computations in the system running some
code at a location, manipulating cells and constructing and
transmitting objects.
Shifting from Object-Oriented to Message-Oriented
The way locations and code are laid out strongly suggests the infinite
tower of reflective interpreters discussed in [Jefferson92]. This tower of interpretation,
taken with the immutability of objects and the similarity of cells to
π-calculus ports, starts to make the system look more like a
message-oriented system than an object-oriented system.
The object-orientation is still present since we still late-bind code
to message sends, but the emphasis has changed: not only is there no
longer any behaviour necessarily attached to the objects - all the
behaviour is external, in the code resolved by the message-dispatch
algorithm in use - but there is no longer necessarily
any state associated with the objects either!
Objects in the system start to look more like messages than
objects. A collection of messages is bundled up with a selector and
sent to the metaobject for message dispatch, and a piece of code
specialised for handling that combination of arguments is selected and
invoked.
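To make the "selector plus bundle of arguments" picture concrete, here is a small Python sketch of multiple dispatch in which all behaviour lives outside the objects. The names `MessageDispatcher`, `define`, `send`, `Ship` and `Asteroid` are made up for the example, and the specificity rule (smallest total distance along each argument's class linearisation, first definition wins ties) is an assumption of the sketch, not something the design above prescribes.

```python
class Ship:
    pass


class Asteroid:
    pass


class MessageDispatcher:
    def __init__(self):
        self._methods = {}   # selector -> list of (argument types, code)

    def define(self, selector, types, code):
        self._methods.setdefault(selector, []).append((tuple(types), code))

    def send(self, selector, *args):
        """Select the most specific code for this combination of arguments."""
        mros = [type(arg).__mro__ for arg in args]
        best = None
        for types, code in self._methods.get(selector, []):
            if len(types) != len(args):
                continue
            if not all(t in mro for t, mro in zip(types, mros)):
                continue     # not applicable to these arguments
            score = sum(mro.index(t) for t, mro in zip(types, mros))
            if best is None or score < best[0]:
                best = (score, code)
        if best is None:
            raise LookupError(f"no method for {selector!r}")
        return best[1](*args)


d = MessageDispatcher()
d.define("collide", (object, object), lambda a, b: "generic collision")
d.define("collide", (Ship, Asteroid), lambda a, b: "ship hits asteroid")
d.define("collide", (Asteroid, Ship), lambda a, b: "asteroid hits ship")

specific = d.send("collide", Ship(), Asteroid())
fallback = d.send("collide", Ship(), Ship())
```

Neither `Ship` nor `Asteroid` carries any methods or state here; the code is chosen entirely by the dispatcher from the selector and the argument combination.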
| | Smalltalk | Self | Slate | ML (SML, OCaml) | π-calculus | ThiNG |
|---|---|---|---|---|---|---|
| Language entities | Objects, Classes, Code, Method Contexts, Block Contexts | Objects, Code, Method Contexts, Block Contexts | Objects, Code, Method Contexts, Block Contexts | Tuples, Reference Cells, Functions, Evaluation Contexts | Messages, Channels, Processes | Messages, Channels, Processes/Code, Locations |
| Transfer of control | lookup/apply | lookup/apply | lookup/apply | apply | message-send | lookup/message-send |
| Kind of lookup | single dispatch | single dispatch | multiple dispatch | n/a | n/a | multiple dispatch |
| Reflective ability | full structural reflection; partial behavioural reflection | full structural reflection; partial behavioural reflection (?) | full structural reflection; partial behavioural reflection | no reflection | no reflection | full structural, behavioural and lexical reflection |
(Aside: I find it interesting that a lot of OO thinking seems to
implicitly assume that in an OO system everything is an
object when that’s clearly not the case. Not only are there other
entities in the language - code, method contexts, and block contexts,
for instance - but they are metalevel entities, and tend not to be
first-class citizens. Their reified representations may be
objects, but the entities themselves are not. If you write down
expressions in an object calculus, you end up with things in the
expressions that aren’t objects.)
I can’t decide whether to stick with the
evaluate-every-argument-in-parallel model or not; it seems that there
are three obvious things that could work:
- Evaluate every argument in parallel (just like the
current prototype). This is very inefficient on current CPUs. It
means that the system automatically exposes a lot of fine-grained
concurrency, though, which is nice.
- Evaluate only annotated arguments in parallel (just
like Slate). This gives the programmer control over how much
concurrency they want in their program. Not so much fine-grained
concurrency is exposed, but on the other hand the code generated
could be quite efficient.
- Tell the programmer to expect that every argument will be
evaluated in parallel, but secretly evaluate most of them in
serial. Some finessing will be required to avoid deadlocks caused by
overeager serialization of intercommunicating branches; one rough
guideline could be to serialize provably noncommunicating parallel
branches only up to the end of inlining. Once a call proceeds with a
real out-of-line call frame, act as for the
every-argument-in-parallel case. (I don’t know how to prove
noncommunication, yet, either. I haven’t really thought about it
yet.)
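The deadlock risk in the third option is easy to reproduce. The following Python sketch is an assumption-laden illustration: threads stand in for processes, arguments are passed as zero-argument thunks, and `call_parallel`, `call_serial`, `receive` and `send` are hypothetical names. Two argument expressions communicate with each other, which works when they run in parallel but would hang if they were serialized left to right.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def call_parallel(fn, *thunks):
    """Evaluate every argument thunk concurrently, then apply fn."""
    if not thunks:
        return fn()
    # One worker per argument, so no argument waits for another to finish.
    with ThreadPoolExecutor(max_workers=len(thunks)) as pool:
        futures = [pool.submit(thunk) for thunk in thunks]
        args = [future.result() for future in futures]
    return fn(*args)


def call_serial(fn, *thunks):
    """Evaluate argument thunks one after another, left to right."""
    return fn(*[thunk() for thunk in thunks])


channel = queue.Queue()


def receive():
    # Waits for the sibling argument; the timeout only guards the demo.
    return channel.get(timeout=5)


def send():
    channel.put(42)
    return "sent"


result = call_parallel(lambda a, b: (a, b), receive, send)
```

Running `call_serial` with the same two thunks would evaluate `receive` first and wait for a value that `send` never gets the chance to provide, which is exactly the overeager-serialization problem; `call_serial` is only safe for arguments that do not communicate.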
This concurrency business is looking more and more like the Next Big
Thing: here’s
an interesting article spelling out the trends and the coming end of
the clock-speed-increase “free lunch”. (That link via LTU).
Message-Oriented Programming
It turns out that the term “Message-Oriented Programming” isn’t new -
there’s an existing body of work using the term, sometimes in the way
I’d like to use it, sometimes in a related but different sense.
References
- [Baker93] “Equal Rights for Functional Objects or, The More Things Change, The More
They Are the Same”, Henry G. Baker, ACM OOPS Messenger 4, 4 (Oct
1993), 2-27
- [Jefferson92] “A Simple Reflective Interpreter”, Stanley Jefferson and Daniel
P. Friedman, IMSA’92 International Workshop on Reflection and
Meta-Level Architecture, Tokyo, 1992
- [Malenfant92] “Behavioral Reflection in a Prototype-Based Language”, J. Malenfant,
C. Dony, P. Cointe, in Proceedings of the International Workshop on
Reflection and Meta-Level Architectures, Tokyo, 1992