I am so far from sleep. It is not funny.
I have been thinking about this Smalltalk-TNG idea for a few hours now. I've been considering what it would take to make an efficientish interpreter/JIT/whatever for a language that is more-or-less a concurrent variant of Smalltalk '72, the one where each receiver does its own dispatch on the message it is sent.
- Start from a π-calculus variant.
- Messages are really messages, transmitted to the receiving process via a real channel, received by a real channel-input operation. The receiver is a real process, reading a message at a time and creating replicated input by explicit recursion. Sounds slow, doesn't it? Read on.
- Use the Squeak (and for all I know traditional Smalltalk) technique of hashconsing message selectors, taking the pointer to the selector, XORing it with the pointer to the class &mdash or whatever provides the method dictionary — and using that value as a hash into a global method cache/memoization table.
- Clever representation under the hood pops out. The JITter/compiler/interpreter ought to be able to recognise common patterns of send/replywait, as in the normal pattern generated by a CPS transform. The system can optimise these special patterns. When it sees a CPS-transformed RPC pattern it can simply make a call, without bothering to reify the reply port. That can happen on the receiving side, should it be considered necessary.
- The scheduler can do the traditional timeslicing thing if there is real parallelism. Once a sequential "thread" is over, the scheduler regains control automatically of course and can simply pop the next item off the work queue and keep on trucking sequentially.
- SMP systems ought to communicate via message passing rather than shared memory. Each CPU becomes a real separate network-linked compute resource. Display and peripherals I guess have to be split across the SMP machine somehow — process migration? Object cloning, a la Obliq?
- The representation of channels will be interesting. Since we don't have SMP to worry about, we don't need to lock — ever — so placing a (pseudo-) message on a channel is as simple as arranging the message and calling the function-pointer held within the channel itself!
- For objects, or object-like structures, the functionpointer will process the message efficiently in order to determine the intended method, and will then pass the details on to the method closure. For anything else, the functionpointer will simply place the message on a queue if there's no receiver waiting. If there is a receiver waiting, it might be an RPC-style receiver in which case another simple stack based call can be made; alternatively the continuation can be hooked up with the message and scheduled normally for the scheduler to get to in its own good time. At every level, the common-case pattern can be optimised with robust fallbacks for the less usual cases.
- For the common case of a method-call-style-message-send-and-replywait, neither the message nor the reply port need be eagerly constructed. The calling convention for channel functionpointers has it so that on entry to the functionpointer, if a certain register is zero, say, then the message and waiting continuation are implicit in the parameter registers and the waiting stack frame. Should they need reifying, it's obvious where to look and how to go about it; otherwise, let them stay implied, and when the reply appears for the continuation, simply do a stack return as usual for a sequential language. If the special register was nonzero, then a real message would be waiting and the system would fall back to a slower but more general message delivery and rendezvous implementation. This lets sequential code run sequentially.
- Almost everything can be done late. As Alan Kay puts it in "The Early History of Smalltalk", "The art of the wrap is the art of the trap". Treat the common case as the fast path and trap on any deviation from the norm. Clever use of indirection and function pointers, combined with the metaphor of the receiver doing its own dispatching (which applies at multiple levels here — not only at the language metaphor level but also in terms of implementation details, with the channel semantics being represented by a functionpointer) can help automatically trap a lot of the time without special hardware support.
- A note on these functionpointers: replicated input RPC-style processes, ie. functions (closures) can be efficiently represented as channels with a functionpointer that does the work of the closure! No message passing overhead at all! Note also that this applies equally to primitives as to user blocks (in the sense of a Smalltalk block, aka a lambda term).
- When the scheduler runs out of work, that means the system has quiesced. Is it just waiting for an interrupt (user/network/disk activity), or has it deadlocked somewhere?
- What about image saving? EROS style checkpointing? Can we get that reliable, efficiently? Have a small "emergencies-only" image in case you corrupt your mainline image... signed primitives can come in over the net and be installed by the user based on a web of trust... if you upgrade your CopyBits, and it starts crashing, boot into the emergency-only image and repair the damage.
- How can conditionals be efficiently compiled? How does Smalltalk detect when #ifTrue:ifFalse: hasn't been reimplemented in the receiver? There's special bytecode for conditional jumps in Squeak.
- Object representation — cons up whole vectors of fresh channels at once. These are used for instance variables.
Christ, it was thirty-two years ago Smalltalk '72 was being written and used. That's a sobering thought. Well, now that I've got some of that stuff out of my head, perhaps I'll be able to sleep.