Transactional Memory

I just read “Composable memory transactions” (Harris, Marlow, Peyton Jones, Herlihy) on Friday. It’s very similar to what Henry Baker was talking about, and to what I had in mind for the transactionality of mutable state in ThiNG, but it’s different enough to make me consider changing ThiNG to be more like the language they discuss in the paper. The advantage is that you can build communication primitives out of transactional shared memory - see their discussion of MVars, which are essentially identical to the cells I’ve been considering for ThiNG - whereas it’s much more difficult to do things the other way around.

Unrolling Recursion, the Golden Ratio, and Partial Evaluation

The little language shootout I ran a few weeks ago showed that Java had a significant edge over the other contestants. I speculated at the time that it might have been because Java was unrolling the fib recursion once, inlining one of the recursive invocations, which would have removed one layer from the call tree - essentially halving the runtime. (More accurately, dividing the runtime by φ ≈ 1.618: the number of calls in fib’s call tree grows as φ^n, so peeling off one layer divides the call count by φ.)

I just now ran the same program unrolled once by hand, and with the unrolled version the C code has almost identical (well within the margin of error!) runtime to the Java code. Here’s the faster C code:

static unsigned long fib(int n) {
  if (n < 2) {
    return n;
  } else {
    if (n < 3) {
      return 1;
    } else {
      return fib(n - 1) + fib(n - 2);
    }
  }
}

Essentially, this is recursive inlining: a self-recursive procedure inlines one layer of its own self-invocation. It’s a neat trick - it’s discussed in section 7.1.2.2 of Craig Chambers’s thesis on the earlier optimizing Self compiler. Self only inlines once (i.e. nonrecursively); clearly, Java occasionally inlines twice.

Further unrolling the recursion gives commensurate speedups, as you would expect: each unrolling divides the runtime by roughly φ, so you can easily speed up the C code by about a factor of ten compared to the completely non-unrolled version by unrolling until an explicit test on n answers the constant 13 (that is, fib(7)), as sketched below.
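Concretely, the fully-unrolled version looks something like this (a sketch of the transformation rather than the exact code I timed; each constant branch is labelled with the Fibonacci number it returns):

static unsigned long fib(int n) {
  if (n < 2) return n;   /* fib(0) = 0, fib(1) = 1 */
  if (n < 3) return 1;   /* fib(2) */
  if (n < 4) return 2;   /* fib(3) */
  if (n < 5) return 3;   /* fib(4) */
  if (n < 6) return 5;   /* fib(5) */
  if (n < 7) return 8;   /* fib(6) */
  if (n < 8) return 13;  /* fib(7) */
  return fib(n - 1) + fib(n - 2);
}

(In theory removing six layers divides the call count by φ^6 ≈ 18; the measured factor of about ten is plausible once the extra comparisons per call are accounted for.)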

All this inlining/unrolling/constant-propagation seems really closely related to partial evaluation, too… Hidehiko Masuhara’s PhD thesis (U. Tokyo, 1999) is an interesting read, applying partial evaluation to reflective interpreters to give vastly improved performance for behaviourally-reflective languages. Integrating a partial evaluation system with a Self-like dynamic optimising compiler seems like a worthwhile project - something that is likely to be necessary for the implementation of ThiNG, perhaps, given how reflective I imagine it will end up being.

Clipping in Cairo

While doing some hacking on Zowie, I noticed I was getting a drastic slowdown when drawing large, clipped objects — typically when something is almost in the zooming-center and skids off to the side while very ‘close’ to the lens.

It turns out that clipping can be quite expensive in Cairo under certain circumstances. When the clip path is complex or not aligned on pixel boundaries, Cairo resorts to using an intermediate buffer and compositing back onto the context’s surface.

I have a couple of solutions on my plate:

  1. Make sure clipping paths are rectangular and pixel-aligned
  2. Don’t bother even trying to draw things that would be fully clipped

I’ve implemented the latter, just by testing whether bounding boxes overlap the clipping region. This alone makes a huge difference! Cairo seems to keep up with painting plenty of shapes (rectangles, admittedly), so long as the large ones end up mostly off-screen. There could still be large, partially clipped shapes that stay in view (it doesn’t tend to happen in my example code at the minute); for that reason, I might need to experiment with the former solution as well. That’s a bit tricky because clipping regions can be at the bottom of a stack of transforms, so to get pixel co-ordinates I need to transform all the way up to world co-ordinates, apply the clip rectangle, then go back to where I was.
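For concreteness, the cull is nothing more than a rectangle-overlap test in front of each draw. A minimal sketch, assuming bounding boxes have already been brought into the clip rectangle’s co-ordinate space (shape_bounds and draw_shape are hypothetical stand-ins for Zowie’s own routines):

typedef struct { double x, y, w, h; } rect_t;

struct shape;                                 /* Zowie's shape type (stand-in) */
extern rect_t shape_bounds(struct shape *s);  /* hypothetical accessor */
extern void draw_shape(struct shape *s);      /* the real cairo drawing code */

/* Two rectangles overlap iff each begins before the other ends. */
static int rects_overlap(rect_t a, rect_t b) {
  return a.x < b.x + b.w && b.x < a.x + a.w &&
         a.y < b.y + b.h && b.y < a.y + a.h;
}

/* Skip cairo entirely when the shape is fully clipped. */
static void maybe_draw(struct shape *s, rect_t clip) {
  if (rects_overlap(shape_bounds(s), clip))
    draw_shape(s);
}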

(BTW I do plan to make some of this hacking available eventually — my first target is to have zooming with the mouse and an editable text widget, and both are within reach.)

Reflection and Message-Oriented Programming

My thinking about a next-generation Smalltalk-like system and language has been shifting a bit over recent weeks.

To start with, I decided that the objects in the language would be immutable: in order to replace a field value, an entirely new object would be constructed, just like records in ML. Objects would no longer have any identity, and object equivalence would be decided by recursive comparison (a la Henry Baker’s egal predicate [Baker93]). This immutability would extend even to the object’s map (similar to its class) - adding a method to an object results in a fresh object, with a fresh map.

Mutable state isn’t entirely absent, though - it’s just kept strictly separate from other parts of the system, just as it is in ML. It would be possible to construct mutable reference cells. Synchronisation, communication and state access would be merged: when you retrieve a value from a cell, the value is removed from the cell and handed to the retrieving process. If no value is present, the receiving process blocks until one is sent by another process. If a sending process tries to place a value in a cell already occupied by a value, the sending process dies with an exception. Cells are the locus of object identity in the system, again as per [Baker93].
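To make the intended cell semantics concrete, here’s a minimal sketch in C with pthreads; the names are mine, and abort() stands in for the sending process dying with an exception:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
  pthread_mutex_t lock;
  pthread_cond_t nonempty;
  void *value;
  int full;
} cell_t;

void cell_init(cell_t *c) {
  pthread_mutex_init(&c->lock, NULL);
  pthread_cond_init(&c->nonempty, NULL);
  c->full = 0;
}

/* Retrieval removes the value, blocking while the cell is empty. */
void *cell_take(cell_t *c) {
  pthread_mutex_lock(&c->lock);
  while (!c->full)
    pthread_cond_wait(&c->nonempty, &c->lock);
  void *v = c->value;
  c->full = 0;
  pthread_mutex_unlock(&c->lock);
  return v;
}

/* Sending into an occupied cell is fatal to the sender. */
void cell_put(cell_t *c, void *v) {
  pthread_mutex_lock(&c->lock);
  if (c->full) {
    fprintf(stderr, "cell already occupied\n");
    abort();
  }
  c->value = v;
  c->full = 1;
  pthread_cond_signal(&c->nonempty);
  pthread_mutex_unlock(&c->lock);
}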

Metaprogramming and reflection would be enabled via locations, which group together a set of related processes into a location tree. Locations are responsible for the semantics of message dispatch and exception handling. They’re the basic unit of reflection, too - a location can be reified, which pauses and reifies all contained processes and sublocations. A reified location can be used for debugging, for mobile code, for become:-like operations, and many other things. A user can install user code at a new sublocation, which allows refinement or replacement of the default message dispatch behaviour in the style of [Malenfant92].

Code itself in the system is a distinct entity - the instruction stream contained in a method is a different kind of thing from all of the categories discussed so far. It’s the role of the location to interpret the code stream.

The dynamic state of a computation is held at the metalevel as a process. Processes correspond to the state of a particular interpreter: the registers, the stack, the current continuation etc. They’re only accessible by reflecting on a location.

To sum up, then:

  • objects are immutable, and have no identity;
  • cells are mutable, have identity, and are the means of communication and synchronisation in the system;
  • locations are metalevel constructs that serve as interpreters for code, that specify message dispatch and exception-handling algorithms, and that are the loci of reflection in the system;
  • code is a stream of instructions intended for interpretation by locations;
  • processes are computations in the system, running some code at a location, manipulating cells and constructing and transmitting objects.

Shifting from Object-Oriented to Message-Oriented

The way locations and code are laid out strongly suggests the infinite tower of reflective interpreters discussed in [Jefferson92]. This tower of interpretation, taken together with the immutability of objects and the similarity of cells to π-calculus ports, starts to make the system look more like a message-oriented system than an object-oriented one.

The object-orientation is still present, since code is still late-bound to message sends, but the emphasis has changed: not only is there no longer any behaviour necessarily attached to the objects - all the behaviour is external, in the code resolved by the message-dispatch algorithm in use - but there is no longer necessarily any state associated with the objects either!

Objects in the system start to look more like messages than objects. A collection of messages is bundled up with a selector and sent to the metaobject for message dispatch, and a piece of code specialised for handling that combination of arguments is selected and invoked.

|  | Smalltalk | Self | Slate | ML (SML, OCaml) | π-calculus | ThiNG |
| --- | --- | --- | --- | --- | --- | --- |
| Language entities | Objects, Classes, Code, Method Contexts, Block Contexts | Objects, Code, Method Contexts, Block Contexts | Objects, Code, Method Contexts, Block Contexts | Tuples, Reference Cells, Functions, Evaluation Contexts | Messages, Channels, Processes | Messages, Channels, Processes/Code, Locations |
| Transfer of control | lookup/apply | lookup/apply | lookup/apply | apply | message-send | lookup/message-send |
| Kind of lookup | single dispatch | single dispatch | multiple dispatch |  |  | multiple dispatch |
| Reflective ability | full structural reflection; partial behavioural reflection | full structural reflection; partial behavioural reflection (?) | full structural reflection; partial behavioural reflection | no reflection | no reflection | full structural, behavioural and lexical reflection |

(Aside: I find it interesting that a lot of OO thinking seems to implicitly assume that in an OO system everything is an object when that’s clearly not the case. Not only are there other entities in the language - code, method contexts, and block contexts, for instance - but they are metalevel entities, and tend not to be first-class citizens. Their reified representations may be objects, but the entities themselves are not. If you write down expressions in an object calculus, you end up with things in the expressions that aren’t objects.)

I can’t decide whether to stick with the evaluate-every-argument-in-parallel model or not; it seems that there are three obvious things that could work:

  1. Evaluate every argument in parallel (just like the current prototype). This is very inefficient on current CPUs. It means that the system automatically exposes a lot of fine-grained concurrency, though, which is nice.

  2. Evaluate only annotated arguments in parallel (just like Slate). This gives the programmer control over how much concurrency they want in their program. Not so much fine-grained concurrency is exposed, but on the other hand the code generated could be quite efficient.

  3. Tell the programmer to expect that every argument will be evaluated in parallel, but secretly evaluate most of them serially. Some finessing will be required to avoid deadlocks caused by overeager serialisation of intercommunicating branches; one rough guideline could be to serialise provably noncommunicating parallel branches only up to the end of inlining. Once a call proceeds with a real out-of-line call frame, act as for the every-argument-in-parallel case. (I don’t yet know how to prove noncommunication - I haven’t really thought about it.)

This concurrency business is looking more and more like the Next Big Thing: here’s an interesting article spelling out the trends and the coming end of the clock-speed-increase “free lunch”. (That link via LTU).

Message-Oriented Programming

It turns out that the term “Message-Oriented Programming” isn’t new - there’s an existing body of work using the term, sometimes in the way I’d like to use it, sometimes in a related but different sense.

Chicken SDL v0.3p2

A new release of chicken-sdl is available from our arch archive. You can download the egg file here.

Interesting changes include:

  • Case changes: constant definitions are now in uppercase, matching the C API.
  • The SDL egg now supports SDL_net networking.
  • The sdl-csi program is now installed in the bin directory as part of the normal installation of the egg.
  • Added the missing sdl-quit-sub-system parameter; added sdl-set-color-key! (thanks to Felix Winkelmann).
  • Added an sdl-pixel-format-bytes-per-pixel accessor (thanks to Sunnan).
  • Added a getter/setter for SDL errors.
  • The sdl-rect-nil global has been replaced by #f.
  • The problem on MacOS X (and probably also Windows) regarding sdl-init is now better documented in the README.

Chicken SDL v0.3p1

A new release of chicken-sdl is available from our arch archive. You can download the egg file here.

Below is the relevant part of the changelog:

2004-12-01 13:18:11 GMT Tony Garnock-Jones <tonyg@kcbbs.gen.nz>   patch-10

    Summary:
      Incorporate changes suggested by Alex Shinn
    Revision:
      chicken-sdl--main--0--patch-10

    - (sdl-fill-rect) is now generic in the kind of colour it expects -
      either an unsigned integer, or an SDL_Color structure.

    - Added procedures (ttf-size-text!) and (ttf-size-utf8!).


2004-11-19 16:38:01 GMT Tony Garnock-Jones <tonyg@lshift.net>     patch-9

    Summary:
      Returned TTF_Fonts now have a NULL check
    Revision:
      chicken-sdl--main--0--patch-9

    Previously, calling (ttf-open-font) when the path specified was
    invalid in some way would result in the construction of a ttf-font
    structure with a pointer value of #f, and there was no way of checking
    the success or failure of (ttf-open-font) without explicitly calling
    (ttf-font-pointer), which is bad style.

    I've changed the foreign-type wrap procedure to check for a #f pointer
    before constructing the ttf-font structure. If the pointer is #f, the
    whole result is #f - otherwise a ttf-font structure is built as usual.


2004-11-15 12:09:04 GMT Tony Garnock-Jones <tonyg@kcbbs.gen.nz>   patch-8

    Summary:
      Replace timer functionality
    Revision:
      chicken-sdl--main--0--patch-8

    It is problematic supporting SDL_AddTimer and SDL_RemoveTimer from
    chicken, since they a) are implemented using setitimer(2) and b)
    involve callbacks from C to Scheme. Each would be fine on its own, but
    taken together they interfere with the main Scheme thread.

    As it happens, the SDL_WaitEvent function is implemented in terms of
    polling (!) for events, with a short delay if none present themselves
    - the usual pragmatic tradeoff for event-based systems on Unix-like
    machines - and so we will be doing no worse if we do a similar thing
    ourselves. Hence, I've written a Scheme-based timer library which
    integrates with SDL's event loop, calling SDL_Delay(10) when there's
    no work just like SDL_WaitEvent.

    This does of course mean that the user must now either use
    (sdl-wait-event!), which manages timeouts itself, or must instead
    somehow arrange for periodic calls to (sdl-process-timer-queue!).

    The timer library is implemented in timer.scm, and depends on a
    leftist-heap implementation from heap.scm. These two modules ought
    probably to be split out into eggs of their own, and distributed as
    part of some kind of general utility library for chicken. For now,
    they're (include)d directly into sdl.scm, to avoid needing to install
    more than one file for this extension.

Chicken SDL and Chicken Cairo

For your consideration, alpha releases of SDL and Cairo bindings for Chicken Scheme.

SDL is a cross-platform library for accessing video, audio and HID events. Cairo is a vector-graphics library that happens to be easily integrated with SDL.

Both are available in our GNU/Arch archive at http://www.eighty-twenty.org/archives:

tla register-archive http://www.eighty-twenty.org/archives
tla abrowse arch@eighty-twenty.org-public--2004

A Little Language Shootout

Today, on a whim, I decided to “benchmark” a very old Scheme-like interpreter I had lying around from almost a decade ago. The “benchmark” I chose was that old stalwart, the Fibonacci function. Since I have a PowerBook G4 1GHz (512MB RAM, MacOS X 10.3), I chose to investigate the runtime of (fib 29).

The results I got when comparing my lisp2’s runtimes against other Scheme implementations and other programming languages were somewhat surprising.

  1. Java came top of the list!
  2. Self, an insanely dynamic language, was a staggering 2/3 as fast as optimised C (1/2 as fast as Java). This was the real surprise. This result means (loosely) that for every two instructions the C compiler is emitting for microbenchmarks like this, the Self compiler is emitting three. Considering the amazing flexibility of Self, that seems a trivially small price to pay.

| Language | Implementation | Time per (fib 29) call (ms) | Ops/s | Ratio (opt. C) | Ratio (unopt. C) |
| --- | --- | --- | --- | --- | --- |
| Java | JDK 1.4.2_05 | 38 | 576027000 | 0.70 | 0.335 |
| C | gcc 3.3, -O3 | 56 | 401161000 | 1 | 0.478 |
| SML | SML/NJ 110.44 | 67 | 335299000 | 1.20 | 0.574 |
| Self | Self 4.2 | 84 | 267441000 | 1.5 | 0.718 |
| C | gcc 3.3, without any optimisation flags | 117 | 192009000 | 2.09 | 1 |
| Smalltalk-80 | Squeak 3.8alpha | 653 | 34403000 | 11.7 | 5.6 |
| Scheme | Chicken 1.73, compiled, -O3 | 890 | 25241000 | 15.9 | 7.6 |
| Scheme | MzScheme v206, compiled to C using "mzc --exe" | 1425 | 15764000 | 25 | 12 |
| Scheme | Chicken 1.73, compiled, without any optimisation flags | 1907 | 11780000 | 34 | 16 |
| Scheme | MzScheme v206 interpreter | 3523 | 6376000 | 63 | 30 |
| Scheme | Chicken 1.72 interpreter | 3900 | 5760000 | 70 | 33 |
| Scheme | lisp2, my ancient interpreter | 6310 | 3560000 | 113 | 54 |
| Scheme | TinyScheme v1.31 | 36681 | 612000 | 655 | 313 |

The “Ops/s” column contains a kind of BogoMips value I cooked up based on the instruction set of my ancient lisp2 interpreter.

For reference, the code for the various versions follows. Most of the Scheme implementations had a (time) macro for measuring execution time; I used gettimeofday() for the C code, System.currentTimeMillis() for the Java, and Time.now () for the SML. For the Squeak Smalltalk version I used the timeToRun method on class Block, and for Self I used the cpuTime method on traits Block.
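The C harness looked roughly like this (a reconstruction rather than the verbatim code):

#include <stdio.h>
#include <sys/time.h>

extern unsigned long fib(int n);

int main(void) {
  struct timeval start, stop;
  gettimeofday(&start, NULL);
  unsigned long result = fib(29);
  gettimeofday(&stop, NULL);
  long ms = (stop.tv_sec - start.tv_sec) * 1000
          + (stop.tv_usec - start.tv_usec) / 1000;
  printf("(fib 29) => %lu in %ld ms\n", result, ms);
  return 0;
}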

Come to think of it, I could have written the timing algorithms to be a little more accurate, especially for the faster implementations, but for such a trivial microbenchmark it’s not really worth the bother. The timing I did is good enough for the ballpark figures I was after.

Here’s the code for the Scheme version:

(define (fib n)
  (if (< n 2)
      n
      (+ (fib (- n 1))
         (fib (- n 2)))))

Here’s the core of the C version:

static unsigned long fib(int n) {
  if (n < 2) {
    return n;
  } else {
    return fib(n - 1) + fib(n - 2);
  }
}

Here’s the core of the Java version:

public static long fib(int n) {
    if (n < 2) {
        return n;
    } else {
        return fib(n - 1) + fib(n - 2);
    }
}

Here’s the core of the SML version:

fun fib 0 = 0
  | fib 1 = 1
  | fib n = (fib (n - 1)) + (fib (n - 2))

The Squeak version - this method, named fibonacci, was placed on class SmallInteger:

self < 2 ifTrue: [^ self].
^ (self - 1) fibonacci + (self - 2) fibonacci

And finally, here’s the Self version. Note that the following code already existed in the image, on traits Integer:

<= 1 ifTrue: [ self ]
      False: [ predecessor fibonacci + (- 2) fibonacci ]

I should try coding up Doug Bagley’s Language Shootout benchmarks in Self, just to see where it places overall. I should also try all the above on Intel x86 CPUs.

What an achievement Self is!

Midnight Inspiration

I am so far from sleep. It is not funny.

I have been thinking about this Smalltalk-TNG idea for a few hours now. I’ve been considering what it would take to make an efficientish interpreter/JIT/whatever for a language that is more-or-less a concurrent variant of Smalltalk ’72, the one where each receiver does its own dispatch on the message it is sent.

  • Start from a π-calculus variant.

  • Messages are really messages, transmitted to the receiving process via a real channel, received by a real channel-input operation. The receiver is a real process, reading a message at a time and creating replicated input by explicit recursion. Sounds slow, doesn’t it? Read on.

  • Use the Squeak (and for all I know traditional Smalltalk) technique of hashconsing message selectors, taking the pointer to the selector, XORing it with the pointer to the class (or whatever provides the method dictionary), and using that value as a hash into a global method cache/memoization table. (There’s a sketch of this cache at the end of this list.)

  • A clever representation pops out under the hood. The JITter/compiler/interpreter ought to be able to recognise common patterns of send/replywait, as in the normal pattern generated by a CPS transform, and optimise these special patterns. When it sees a CPS-transformed RPC pattern it can simply make a call, without bothering to reify the reply port; that can happen on the receiving side, should it be considered necessary.

  • The scheduler can do the traditional timeslicing thing if there is real parallelism. Once a sequential “thread” is over, the scheduler regains control automatically of course and can simply pop the next item off the work queue and keep on trucking sequentially.

  • SMP systems ought to communicate via message passing rather than shared memory. Each CPU becomes a real separate network-linked compute resource. Display and peripherals I guess have to be split across the SMP machine somehow — process migration? Object cloning, a la Obliq?

  • The representation of channels will be interesting. Since we don’t have SMP to worry about, we don’t need to lock — ever — so placing a (pseudo-) message on a channel is as simple as arranging the message and calling the function-pointer held within the channel itself!

  • For objects, or object-like structures, the functionpointer will process the message efficiently in order to determine the intended method, and will then pass the details on to the method closure. For anything else, the functionpointer will simply place the message on a queue if there’s no receiver waiting. If there is a receiver waiting, it might be an RPC-style receiver, in which case another simple stack-based call can be made; alternatively the continuation can be hooked up with the message and scheduled normally for the scheduler to get to in its own good time. At every level, the common-case pattern can be optimised, with robust fallbacks for the less usual cases. (A sketch of this channel representation also follows the list.)

  • For the common case of a method-call-style-message-send-and-replywait, neither the message nor the reply port need be eagerly constructed. The calling convention for channel functionpointers has it so that on entry to the functionpointer, if a certain register is zero, say, then the message and waiting continuation are implicit in the parameter registers and the waiting stack frame. Should they need reifying, it’s obvious where to look and how to go about it; otherwise, let them stay implied, and when the reply appears for the continuation, simply do a stack return as usual for a sequential language. If the special register was nonzero, then a real message would be waiting and the system would fall back to a slower but more general message delivery and rendezvous implementation. This lets sequential code run sequentially.

  • Almost everything can be done late. As Alan Kay puts it in “The Early History of Smalltalk”, “The art of the wrap is the art of the trap”. Treat the common case as the fast path and trap on any deviation from the norm. Clever use of indirection and function pointers, combined with the metaphor of the receiver doing its own dispatching (which applies at multiple levels here — not only at the language metaphor level but also in terms of implementation details, with the channel semantics being represented by a functionpointer) can help automatically trap a lot of the time without special hardware support.

  • A note on these functionpointers: replicated-input RPC-style processes, i.e. functions (closures), can be efficiently represented as channels with a functionpointer that does the work of the closure! No message-passing overhead at all! Note also that this applies to primitives just as much as to user blocks (in the sense of a Smalltalk block, a.k.a. a lambda term).

  • When the scheduler runs out of work, that means the system has quiesced. Is it just waiting for an interrupt (user/network/disk activity), or has it deadlocked somewhere?

  • What about image saving? EROS style checkpointing? Can we get that reliable, efficiently? Have a small “emergencies-only” image in case you corrupt your mainline image… signed primitives can come in over the net and be installed by the user based on a web of trust… if you upgrade your CopyBits, and it starts crashing, boot into the emergency-only image and repair the damage.

  • Object models! The system supports traditional Smalltalk-style class- (and metaclass-) based dispatch, but it equally efficiently supports Self- or Javascript-like prototype object dispatch, Python-like class-based dispatch, and so on. Scheme/Lisp-style objects based on closures can also be efficiently represented. An annotation on code as a hint to the interpreter/JIT that a certain RPC can be memoised weakly (i.e. in the global method cache) is sufficient for all these forms of dispatch. The one thing missing is multimethod dispatch — how might that work? The Slate project at tunes.org might have some literature on this.

  • How can conditionals be efficiently compiled? How does Smalltalk detect when #ifTrue:ifFalse: hasn’t been reimplemented in the receiver? There’s special bytecode for conditional jumps in Squeak.

  • Object representation — cons up whole vectors of fresh channels at once. These are used for instance variables.
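To pin down the method-cache idea from above, here’s a minimal sketch in C. All names are mine, and full_lookup is a stand-in for the real method-dictionary search:

#include <stdint.h>

#define CACHE_SIZE 4096   /* a power of two, so we can mask rather than mod */

typedef struct {
  void *selector;   /* hashconsed: pointer identity is selector identity */
  void *class;      /* or whatever provides the method dictionary */
  void *method;
} cache_entry;

static cache_entry method_cache[CACHE_SIZE];

/* Supplied elsewhere: the real (slow) method-dictionary search. */
extern void *full_lookup(void *selector, void *class);

void *lookup_method(void *selector, void *class) {
  /* XOR the two pointers, shifting off the low bits, which are
     always zero thanks to alignment. */
  uintptr_t h = (((uintptr_t)selector ^ (uintptr_t)class) >> 3)
                & (CACHE_SIZE - 1);
  cache_entry *e = &method_cache[h];
  if (e->selector != selector || e->class != class) {
    e->selector = selector;   /* miss: memoise the result of a full lookup */
    e->class = class;
    e->method = full_lookup(selector, class);
  }
  return e->method;           /* hit: one compare, no dictionary search */
}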
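And here’s a sketch of channels-as-functionpointers: sending is a single indirect call, and each kind of channel installs its own delivery behaviour, with queueing only on the fallback path (again, all names are mine):

typedef struct message {
  void *selector;
  void *args;
  struct message *next;
} message;

typedef struct channel channel;
struct channel {
  void (*deliver)(channel *self, message *msg);  /* the channel's behaviour */
  message *queue;              /* fallback queue when no receiver waits */
  void (*body)(void *args);    /* used by function-like channels */
};

/* Sending never locks: there is no shared-memory SMP to worry about. */
static void channel_send(channel *c, message *m) {
  c->deliver(c, m);
}

/* Ordinary channel, no receiver waiting: just queue the message. */
static void deliver_queueing(channel *self, message *msg) {
  msg->next = self->queue;
  self->queue = msg;
}

/* Replicated-input RPC-style process, i.e. a closure: the
   functionpointer does the closure's work directly, with no
   message-passing overhead at all. */
static void deliver_function(channel *self, message *msg) {
  self->body(msg->args);
}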

Christ, it was thirty-two years ago that Smalltalk ’72 was being written and used. That’s a sobering thought. Well, now that I’ve got some of that stuff out of my head, perhaps I’ll be able to sleep.

StoJ, an implementation of Stochastic Join

Well, since I wrote about some of the issues in stochastic-π-based biological modelling, I’ve had a little spare time, a little inspiration, and a supportive workplace environment (thanks again, LShift!), and I’ve implemented StoJ, a polyadic, asynchronous stochastic π-calculus with input join and no summation. Rates are associated with a join, not with a channel. I implemented Gillespie’s “first reaction” method (it seemed simplest — and despite its inefficiency it’s performing reasonably well, although I suspect I have Moore’s “law” to thank for that more than anything else).
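The heart of the “first reaction” method is tiny; here’s a sketch of one step in C (illustrative names; rand() stands in for a real RNG):

#include <math.h>
#include <stdlib.h>

/* One step of Gillespie's "first reaction" method: every enabled
   reaction i, with propensity a[i], draws a tentative firing time
   t = -ln(U)/a[i] (exponentially distributed), and the earliest
   tentative reaction fires.  Returns the winner's index, or -1 if
   no reaction is enabled; *dt receives the time advance. */
int first_reaction(const double *a, int n, double *dt) {
  int winner = -1;
  double best = HUGE_VAL;
  for (int i = 0; i < n; i++) {
    if (a[i] <= 0.0) continue;                            /* disabled */
    double u = (rand() + 1.0) / ((double)RAND_MAX + 2.0); /* U in (0,1) */
    double t = -log(u) / a[i];
    if (t < best) { best = t; winner = i; }
  }
  if (winner >= 0) *dt = best;
  return winner;
}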

I used OCaml as the implementation language — the parsing tools are quite smooth and good to work with — and gnuplot to graph the output of each simulation run. This is my first OCaml project, and I’m finding it quite pleasant despite the syntax.

Here’s an example model in StoJ. I duplicated the rates and the overall design of the model from the Clock.zip code archive at the BioSPI website. This model is discussed in Aviv Regev’s paper “Representation and simulation of molecular pathways in the stochastic π-calculus”. The implementation of the model below gives the same output graphs as Regev’s model does, which is a nice confirmation both of the model and of the StoJ system itself.

// Circadian clock model
// Based on Regev et al

new
        DNA_a, DNA_r,
        !MRNA_a, !MRNA_r,
        !protein_a, !protein_r,
        DNA_a_promoted, DNA_r_promoted,
        !complex_ar . (

// Start state.
DNA_a<> | DNA_r<>

// Basal rate transcription.
| rec L. DNA_a() ->[4]       (DNA_a<> | MRNA_a<> | L)
| rec L. DNA_r() ->[0.001]   (DNA_r<> | MRNA_r<> | L)

// MRNA degradation.
| rec L. MRNA_a() ->[1.0]    (L)
| rec L. MRNA_r() ->[0.02]   (L)

// Translation.
| rec L. MRNA_a() ->[1.0]    (MRNA_a<> | protein_a<> | L)
| rec L. MRNA_r() ->[0.1]    (MRNA_r<> | protein_r<> | L)

// Protein degradation.
| rec L. protein_a() ->[0.1] (L)
| rec L. protein_r() ->[0.01]        (L)

// A/R Complex formation.
| rec L. protein_a() & protein_r() ->[100.0]     (complex_ar<> | L)

// A/R Complex degradation.
| rec L. complex_ar() ->[0.1]        (protein_r<> | L)
| rec L. complex_ar() ->[0.01]       (protein_a<> | L)

// Activator binding/unbinding to A promoter, and enhanced transcription
| rec L. protein_a() & DNA_a() ->[10]    (DNA_a_promoted<> | L)
| rec L. DNA_a_promoted() ->[10]         (protein_a<> | DNA_a<> | L)
| rec L. DNA_a_promoted() ->[40]         (DNA_a_promoted<> | MRNA_a<> | L)

// Activator binding/unbinding to R promoter, and enhanced transcription
| rec L. protein_a() & DNA_r() ->[10]    (DNA_r_promoted<> | L)
| rec L. DNA_r_promoted() ->[100]        (protein_a<> | DNA_r<> | L)
| rec L. DNA_r_promoted() ->[2]          (DNA_r_promoted<> | MRNA_r<> | L)

)

Here are a couple of graphs produced by the simulator from the program above — first a graph of reagent concentrations varying with time, and then a graph of Repressor (R) molecule concentration against Activator (A) molecule concentration, showing the bistable nature of the whole system. If you take a look at the graphs in Regev’s paper, you’ll see that the spikes in reagent concentration occur in the same places with the same frequency in our model as in Regev’s.

Concentrations against time

R vs A

I also implemented a simple Java applet/application that uses a (human-assisted) mass-spring layout to draw a graph of the (statically apparent) dataflow in the StoJ program it takes as input. It still needs a bit of refining — the graphs/programs tend to be quite densely connected in places, and not only is mass-spring layout poor at densely-connected graphs, especially with cycles, but it doesn’t identify any potential clustering in the graph either. I get the feeling clustering (subgraphs) will be important for visualising dataflow in larger models.

Here’s a simple chemical equilibrium model in StoJ:

// A simple H + Cl <--> HCl model,
// similar to that in Andrew Phillips' slides
// with numbers taken from his slides.
// See http://www.doc.ic.ac.uk/~anp/spim/Slides.pdf
// page 16.

new !h, !cl, !hcl . (
  rec ASSOC.  h() & cl() ->[100] (hcl<> | ASSOC)
| rec DISSOC. hcl() ->[10] (h<> | cl<> | DISSOC)
| h<> * 100 | cl<> * 100
)

And here’s the display generated by the Java visualisation application:

Simple chemical equilibrium visualisation

As you can imagine, for larger models the display gets cluttered quite quickly. A note on implementation, though: I found Java2D surprisingly pleasant to work with. The built-in support for double-buffering of JComponents, along with the various Shape classes, helped me develop the application quickly.