Chicken SDL v0.3p1

A new release of chicken-sdl is available from our arch archive. You can download the egg file here.

Below is the relevant part of the changelog:

2004-12-01 13:18:11 GMT Tony Garnock-Jones <tonyg@kcbbs.gen.nz>   patch-10

    Summary:
      Incorporate changes suggested by Alex Shinn
    Revision:
      chicken-sdl--main--0--patch-10

    - (sdl-fill-rect) is now generic in the kind of colour it expects -
      either an unsigned integer, or an SDL_Color structure.

    - Added procedures (ttf-size-text!) and (ttf-size-utf8!).


2004-11-19 16:38:01 GMT Tony Garnock-Jones <tonyg@lshift.net>     patch-9

    Summary:
      Returned TTF_Fonts now have a NULL check
    Revision:
      chicken-sdl--main--0--patch-9

    Previously, calling (ttf-open-font) when the path specified was
    invalid in some way would result in the construction of a ttf-font
    structure with a pointer value of #f, and there was no way of checking
    the success or failure of (ttf-open-font) without explicitly calling
    (ttf-font-pointer), which is bad style.

    I've changed the foreign-type wrap procedure to check for a #f pointer
    before constructing the ttf-font structure. If the pointer is #f, the
    whole result is #f - otherwise a ttf-font structure is built as usual.


2004-11-15 12:09:04 GMT Tony Garnock-Jones <tonyg@kcbbs.gen.nz>   patch-8

    Summary:
      Replace timer functionality
    Revision:
      chicken-sdl--main--0--patch-8

    It is problematic supporting SDL_AddTimer and SDL_RemoveTimer from
    chicken, since they a) are implemented using setitimer(2) and b)
    involve callbacks from C to Scheme. Each would be fine on its own, but
    taken together they interfere with the main Scheme thread.

    As it happens, the SDL_WaitEvent function is implemented in terms of
    polling (!) for events, with a short delay if none present themselves
    - the usual pragmatic tradeoff for event-based systems on Unix-like
    machines - and so we will be doing no worse if we do a similar thing
    ourselves. Hence, I've written a Scheme-based timer library which
    integrates with SDL's event loop, calling SDL_Delay(10) when there's
    no work just like SDL_WaitEvent.

    This does of course mean that the user must now either use
    (sdl-wait-event!), which manages timeouts itself, or must instead
    somehow arrange for periodic calls to (sdl-process-timer-queue!).

    The timer library is implemented in timer.scm, and depends on a
    leftist-heap implementation from heap.scm. These two modules ought
    probably to be split out into eggs of their own, and distributed as
    part of some kind of general utility library for chicken. For now,
    they're (include)d directly into sdl.scm, to avoid needing to install
    more than one file for this extension.
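
For the curious, here is a rough sketch of how a couple of these changes might look from user code. Only the procedures named in the changelog above are certainly part of the egg; the other names (make-sdl-color, make-sdl-rect, sdl-set-video-mode, make-sdl-event, sdl-event-type, SDL_QUIT and the SRFI-55-style loading) are guesses at the binding’s conventions, so check the egg itself before relying on them.

(require-extension sdl)                        ; assumed loading incantation

;; sdl-fill-rect is now generic in its colour argument:
(define screen (sdl-set-video-mode 320 200 32 0))        ; assumed binding name
(define rect   (make-sdl-rect 10 10 100 50))             ; assumed constructor
(sdl-fill-rect screen rect #xFF0000)                      ; unsigned integer
(sdl-fill-rect screen rect (make-sdl-color 255 0 0))      ; SDL_Color structure

;; ttf-open-font now returns #f on failure, so the result can be tested:
(define font (ttf-open-font "/path/to/some-font.ttf" 24))
(if font
    (display "font opened\n")
    (error "could not open font"))

;; With the replacement timer machinery, the main loop should either call
;; sdl-wait-event!, which services the timer queue itself, or else call
;; sdl-process-timer-queue! periodically from its own polling loop.
(define event (make-sdl-event))                           ; assumed constructor
(let loop ()
  (sdl-wait-event! event)
  (if (not (eqv? (sdl-event-type event) SDL_QUIT))        ; assumed accessor
      (loop)))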

Chicken SDL and Chicken Cairo

For your consideration, alpha releases of SDL and Cairo bindings for Chicken Scheme.

SDL is a cross-platform library for accessing video, audio and HID events. Cairo is a vector-graphics library that happens to be easily integrated with SDL.

Both are available in our GNU/Arch archive at http://www.eighty-twenty.org/archives:

tla register-archive http://www.eighty-twenty.org/archives
tla abrowse arch@eighty-twenty.org-public--2004

A Little Language Shootout

Today, on a whim, I decided to “benchmark” a very old Scheme-like interpreter I had lying around from almost a decade ago. The “benchmark” I chose was that old stalwart, the Fibonacci function. Since I have a PowerBook G4 1GHz (512MB RAM, Mac OS X 10.3), I chose to investigate the runtime of (fib 29).

The results I got when comparing my lisp2’s runtimes against other Scheme implementations and other programming languages were somewhat surprising.

  1. Java came top of the list!
  2. Self, an insanely dynamic language, was a staggering 2/3 as fast as optimised C (1/2 as fast as Java). This was the real surprise. This result means (loosely) that for every two instructions the C compiler is emitting for microbenchmarks like this, the Self compiler is emitting three. Considering the amazing flexibility of Self, that seems a trivially small price to pay.

Times below are per (fib 29) call, in milliseconds.

Language      | Implementation Detail                                   | Time (ms) | Ops/s     | Ratio (opt. C) | Ratio (unopt. C)
Java          | JDK 1.4.2_05                                            | 38        | 576027000 | 0.70           | 0.335
C             | gcc 3.3, -O3                                            | 56        | 401161000 | 1              | 0.478
SML           | SML/NJ 110.44                                           | 67        | 335299000 | 1.20           | 0.574
Self          | Self 4.2                                                | 84        | 267441000 | 1.5            | 0.718
C             | gcc 3.3, without any optimisation flags                 | 117       | 192009000 | 2.09           | 1
Smalltalk-80  | Squeak 3.8alpha                                         | 653       | 34403000  | 11.7           | 5.6
Scheme        | Chicken 1.73, compiled, -O3                             | 890       | 25241000  | 15.9           | 7.6
Scheme        | MzScheme v206 compiled to C using "mzc --exe"           | 1425      | 15764000  | 25             | 12
Scheme        | Chicken 1.73, compiled, without any optimisation flags  | 1907      | 11780000  | 34             | 16
Scheme        | MzScheme v206 interpreter                               | 3523      | 6376000   | 63             | 30
Scheme        | Chicken 1.72 interpreter                                | 3900      | 5760000   | 70             | 33
Scheme        | lisp2, my ancient interpreter                           | 6310      | 3560000   | 113            | 54
Scheme        | TinyScheme v1.31                                        | 36681     | 612000    | 655            | 313

The “Ops/s” column contains a kind of BogoMips value I cooked up based on the instruction set of my ancient lisp2 interpreter.

For reference, the code for the various versions follows. Most of the Scheme implementations had a (time) macro for measuring execution time; I used gettimeofday() for the C code, System.currentTimeMillis() for the Java, and Time.now () for the SML. For the Squeak Smalltalk version I used the timeToRun method on class Block, and for Self I used the cpuTime method on traits Block.

Come to think of it, I could have written the timing algorithms to be a little more accurate, especially for the faster implementations, but for such a trivial microbenchmark it’s not really worth the bother. The timing I did is good enough for the ballpark figures I was after.

Here’s the code for the Scheme version:

(define (fib n)
  (if (< n 2)
      n
      (+ (fib (- n 1))
         (fib (- n 2)))))
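
Each Scheme run then amounted to evaluating something like the following (where the implementation provided a time macro, as mentioned above; the exact output format differs between systems):

(time (fib 29))   ; => 514229, with timing statistics reported as a side effect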

Here’s the core of the C version:

static unsigned long fib(int n) {
  if (n < 2) {
    return n;
  } else {
    return fib(n - 1) + fib(n - 2);
  }
}

Here’s the core of the Java version:

public static long fib(int n) {
    if (n < 2) {
        return n;
    } else {
        return fib(n - 1) + fib(n - 2);
    }
}

Here’s the core of the SML version:

fun fib 0 = 0
  | fib 1 = 1
  | fib n = (fib (n - 1)) + (fib (n - 2))

The Squeak version - this method was placed on class SmallInteger:

self < 2 ifTrue: [^ self].
^ (self - 1) fibonacci + (self - 2) fibonacci

And finally, here’s the Self version. Note that the following code already existed in the image, on traits Integer (remember that a Self message with no explicit receiver is sent to self, so “<= 1” below reads as “self <= 1”):

<= 1 ifTrue: [ self ]
      False: [ predecessor fibonacci + (- 2) fibonacci ]

I should try coding up Doug Bagley’s Language Shootout benchmarks in Self, just to see where it places overall. I should also try all the above on Intel x86 CPUs.

What an achievement Self is!

Midnight Inspiration

I am so far from sleep. It is not funny.

I have been thinking about this Smalltalk-TNG idea for a few hours now. I’ve been considering what it would take to make an efficientish interpreter/JIT/whatever for a language that is more-or-less a concurrent variant of Smalltalk ‘72, the one where each receiver does its own dispatch on the message it is sent.

  • Start from a π-calculus variant.

  • Messages are really messages, transmitted to the receiving process via a real channel, received by a real channel-input operation. The receiver is a real process, reading a message at a time and creating replicated input by explicit recursion. Sounds slow, doesn’t it? Read on.

  • Use the Squeak (and for all I know traditional Smalltalk) technique of hashconsing message selectors, taking the pointer to the selector, XORing it with the pointer to the class (or whatever provides the method dictionary), and using that value as a hash into a global method cache/memoization table. (A rough sketch of this, combined with the channels-as-procedures idea below, appears after this list.)

  • Clever representation under the hood pops out. The JITter/compiler/interpreter ought to be able to recognise common patterns of send/replywait, as in the normal pattern generated by a CPS transform. The system can optimise these special patterns. When it sees a CPS-transformed RPC pattern it can simply make a call, without bothering to reify the reply port. That can happen on the receiving side, should it be considered necessary.

  • The scheduler can do the traditional timeslicing thing if there is real parallelism. Once a sequential “thread” is over, the scheduler regains control automatically of course and can simply pop the next item off the work queue and keep on trucking sequentially.

  • SMP systems ought to communicate via message passing rather than shared memory. Each CPU becomes a real separate network-linked compute resource. Display and peripherals I guess have to be split across the SMP machine somehow — process migration? Object cloning, a la Obliq?

  • The representation of channels will be interesting. Since we don’t have SMP to worry about, we don’t need to lock — ever — so placing a (pseudo-) message on a channel is as simple as arranging the message and calling the function-pointer held within the channel itself!

  • For objects, or object-like structures, the functionpointer will process the message efficiently in order to determine the intended method, and will then pass the details on to the method closure. For anything else, the functionpointer will simply place the message on a queue if there’s no receiver waiting. If there is a receiver waiting, it might be an RPC-style receiver in which case another simple stack based call can be made; alternatively the continuation can be hooked up with the message and scheduled normally for the scheduler to get to in its own good time. At every level, the common-case pattern can be optimised with robust fallbacks for the less usual cases.

  • For the common case of a method-call-style-message-send-and-replywait, neither the message nor the reply port need be eagerly constructed. The calling convention for channel functionpointers has it so that on entry to the functionpointer, if a certain register is zero, say, then the message and waiting continuation are implicit in the parameter registers and the waiting stack frame. Should they need reifying, it’s obvious where to look and how to go about it; otherwise, let them stay implied, and when the reply appears for the continuation, simply do a stack return as usual for a sequential language. If the special register was nonzero, then a real message would be waiting and the system would fall back to a slower but more general message delivery and rendezvous implementation. This lets sequential code run sequentially.

  • Almost everything can be done late. As Alan Kay puts it in “The Early History of Smalltalk”, “The art of the wrap is the art of the trap”. Treat the common case as the fast path and trap on any deviation from the norm. Clever use of indirection and function pointers, combined with the metaphor of the receiver doing its own dispatching (which applies at multiple levels here — not only at the language metaphor level but also in terms of implementation details, with the channel semantics being represented by a functionpointer) can help automatically trap a lot of the time without special hardware support.

  • A note on these functionpointers: replicated input RPC-style processes, ie. functions (closures) can be efficiently represented as channels with a functionpointer that does the work of the closure! No message passing overhead at all! Note also that this applies equally to primitives as to user blocks (in the sense of a Smalltalk block, aka a lambda term).

  • When the scheduler runs out of work, that means the system has quiesced. Is it just waiting for an interrupt (user/network/disk activity), or has it deadlocked somewhere?

  • What about image saving? EROS style checkpointing? Can we get that reliable, efficiently? Have a small “emergencies-only” image in case you corrupt your mainline image… signed primitives can come in over the net and be installed by the user based on a web of trust… if you upgrade your CopyBits, and it starts crashing, boot into the emergency-only image and repair the damage.

  • Object models! The system supports traditional Smalltalk style class- (and metaclass-) based dispatch, but it equally efficiently supports Self- or Javascript-like prototype object dispatch, or Python-like class-based dispatch etc etc. Scheme/Lisp-style objects based on closures can also be efficiently represented. An annotation on code as a hint to the interpreter/JIT that a certain RPC can be memoised weakly (ie. in the global method cache) is sufficient for all these forms of dispatch. The one thing missing is multimethod dispatch — how might that work? The Slate project at tunes.org might have some literature on this.

  • How can conditionals be efficiently compiled? How does Smalltalk detect when #ifTrue:ifFalse: hasn’t been reimplemented in the receiver? There’s special bytecode for conditional jumps in Squeak.

  • Object representation — cons up whole vectors of fresh channels at once. These are used for instance variables.
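
To make a couple of the ideas above more concrete, here is a purely illustrative Scheme sketch (every name in it is my own, and hash-by-identity, bitwise-xor and define-record are assumed to be available): a channel is a record holding a procedure, so sending is just a matter of arranging the message and calling that procedure, and an object-like channel does its own dispatch through a global method cache keyed by hashing the selector together with whatever provides the method dictionary.

;; A channel carries its own delivery behaviour as a procedure.
(define-record channel handler)
(define (send! ch . message)
  ((channel-handler ch) message))               ; delivery is just a call

;; Global, direct-mapped method cache.  In a native implementation the index
;; would come from XORing the selector and method-dictionary pointers; here
;; we XOR identity hashes instead.
(define cache-size 4096)
(define method-cache (make-vector cache-size #f))

(define (cache-index selector provider)
  (modulo (bitwise-xor (hash-by-identity selector)
                       (hash-by-identity provider))
          cache-size))

(define (lookup-method selector provider slow-lookup)
  (let* ((i (cache-index selector provider))
         (entry (vector-ref method-cache i)))
    (if (and entry
             (eq? (vector-ref entry 0) selector)
             (eq? (vector-ref entry 1) provider))
        (vector-ref entry 2)                                  ; cache hit
        (let ((method (slow-lookup selector provider)))       ; miss: refill slot
          (vector-set! method-cache i (vector selector provider method))
          method))))

;; An "object" is then a channel whose handler dispatches on the selector:
(define (make-object method-dictionary)
  (make-channel
   (lambda (message)
     (let ((method (lookup-method (car message) method-dictionary
                                  (lambda (sel dict) (cdr (assq sel dict))))))
       (apply method (cdr message))))))

;; For example:
;;   (define p (make-object (list (cons 'double (lambda (x) (* 2 x))))))
;;   (send! p 'double 21)   ; => 42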

Christ, it was thirty-two years ago Smalltalk ‘72 was being written and used. That’s a sobering thought. Well, now that I’ve got some of that stuff out of my head, perhaps I’ll be able to sleep.

StoJ, an implementation of Stochastic Join

Well, since I wrote about some of the issues in stochastic-π based biological modelling, I’ve had a little spare time, a little inspiration, and a supportive workplace environment (thanks again, LShift!), and I’ve implemented StoJ, a polyadic, asynchronous stochastic π calculus with input join and no summation. Rates are associated with a join, not with a channel. I implemented Gillespie’s “first reaction” method (it seemed simplest — and despite its inefficiency it’s performing reasonably well, although I suspect I have Moore’s “law” to thank for that more than anything else).
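
(For concreteness, the first-reaction method amounts to something like the sketch below, written in Scheme purely for illustration and with names of my own choosing; random-real is assumed to return a uniform (0,1) sample, as in SRFI-27.)

;; Gillespie's "first reaction" method, roughly: draw a putative firing time
;; for every enabled reaction from an exponential distribution whose rate is
;; that reaction's propensity, then fire whichever reaction comes first.
(define (first-reaction enabled-reactions)
  ;; enabled-reactions: a list of (reaction . propensity) pairs, propensity > 0
  (let loop ((rs enabled-reactions) (winner #f) (winning-tau #f))
    (if (null? rs)
        (values winner winning-tau)     ; reaction to fire, and the time advance
        (let* ((r   (car rs))
               (a   (cdr r))
               (tau (/ (- (log (random-real))) a)))    ; exponential(a) sample
          (if (or (not winning-tau) (< tau winning-tau))
              (loop (cdr rs) r tau)
              (loop (cdr rs) winner winning-tau))))))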

I used OCaml as the implementation language — the parsing tools are quite smooth and good to work with — and gnuplot to graph the output of each simulation run. This is my first OCaml project, and I’m finding it quite pleasant despite the syntax.

Here’s an example model in StoJ. I duplicated the rates and the overall design of the model from the Clock.zip code archive at the BioSPI website. This model is discussed in Aviv Regev’s paper “Representation and simulation of molecular pathways in the stochastic π-calculus.” The implementation of the model below gives the same output graphs as Regev’s model gives, which is a nice confirmation of the model and of the StoJ system itself.

// Circadian clock model
// Based on Regev et al

new
        DNA_a, DNA_r,
        !MRNA_a, !MRNA_r,
        !protein_a, !protein_r,
        DNA_a_promoted, DNA_r_promoted,
        !complex_ar . (

// Start state.
DNA_a<> | DNA_r<>

// Basal rate transcription.
| rec L. DNA_a() ->[4]       (DNA_a<> | MRNA_a<> | L)
| rec L. DNA_r() ->[0.001]   (DNA_r<> | MRNA_r<> | L)

// MRNA degradation.
| rec L. MRNA_a() ->[1.0]    (L)
| rec L. MRNA_r() ->[0.02]   (L)

// Translation.
| rec L. MRNA_a() ->[1.0]    (MRNA_a<> | protein_a<> | L)
| rec L. MRNA_r() ->[0.1]    (MRNA_r<> | protein_r<> | L)

// Protein degradation.
| rec L. protein_a() ->[0.1] (L)
| rec L. protein_r() ->[0.01]        (L)

// A/R Complex formation.
| rec L. protein_a() & protein_r() ->[100.0]     (complex_ar<> | L)

// A/R Complex degradation.
| rec L. complex_ar() ->[0.1]        (protein_r<> | L)
| rec L. complex_ar() ->[0.01]       (protein_a<> | L)

// Activator binding/unbinding to A promoter, and enhanced transcription
| rec L. protein_a() & DNA_a() ->[10]    (DNA_a_promoted<> | L)
| rec L. DNA_a_promoted() ->[10]         (protein_a<> | DNA_a<> | L)
| rec L. DNA_a_promoted() ->[40]         (DNA_a_promoted<> | MRNA_a<> | L)

// Activator binding/unbinding to R promoter, and enhanced transcription
| rec L. protein_a() & DNA_r() ->[10]    (DNA_r_promoted<> | L)
| rec L. DNA_r_promoted() ->[100]        (protein_a<> | DNA_r<> | L)
| rec L. DNA_r_promoted() ->[2]          (DNA_r_promoted<> | MRNA_r<> | L)

)

Here are a couple of graphs produced by the simulator from the program above — first a graph of reagent concentrations varying with time, and then a graph of Repressor (R) molecule concentration against Activator (A) molecule concentration, showing the bistable nature of the whole system. If you take a look at the graphs in Regev’s paper, you’ll see that the spikes in reagent concentration occur in the same places with the same frequency in our model as in Regev’s.

Concentrations against time

R vs A

I also implemented a simple Java applet/application that uses a (human-assisted) mass-spring layout to draw a graph of the (statically apparent) dataflow in the StoJ program it takes as input. It still needs a bit of refining — the graphs/programs tend to be quite densely connected in places, and not only is mass-spring layout poor at densely-connected graphs, especially with cycles, but it doesn’t identify any potential clustering in the graph either. I get the feeling clustering (subgraphs) will be important for visualising dataflow in larger models.

Here’s a simple chemical equilibrium model in StoJ:

// A simple H + Cl <--> HCl model,
// similar to that in Andrew Phillips' slides
// with numbers taken from his slides.
// See http://www.doc.ic.ac.uk/~anp/spim/Slides.pdf
// page 16.

new !h, !cl, !hcl . (
  rec ASSOC.  h() & cl() ->[100] (hcl<> | ASSOC)
| rec DISSOC. hcl() ->[10] (h<> | cl<> | DISSOC)
| h<> * 100 | cl<> * 100
)

And here’s the display generated by the Java program-visualisation application:

Simple chemical equilibrium visualisation

As you can imagine, for larger models the display gets cluttered quite quickly. A note on implementation though - I found Java2D to be surprisingly pleasant to work with. The built-in support for double-buffering of JComponents along with the various classes of Shape helped me quickly develop the application.

Stochastic Join?

mikeb and I were discussing the papers given at BioConcur the other day, and we were puzzling out the differences between the standard way of modelling biochemical and biological reactions (including genetic interference reactions) and our “alternative” modelling style. I suppose the “alternative” style must have been tried before, and that it must display some obvious weakness we haven’t spotted yet.

Here’s a comparison of the two styles:

Biological Systems Element | Standard Model              | "Alternative" Model
Molecule                   | Process                     | Message
Molecule species           |                             | Channel
Interaction Capability     | Channel                     | Process
Interaction                | Communication               | Movement (and computation)
Modification               | State and/or channel change | State and/or channel change
Reaction Rate              | attached to channel         | attached to prefix
Join and Rates

We’ve been thinking about using a stochastic π-variant with a join operator. Having join available allows the expression of state machines to be much more direct than in a joinless stochastic π. We started modelling the lambda-switch to fit the model from this BioConcur paper, and saw that it seems more natural to attach the reaction rate to each individual join rather than to each channel as BioSPI does.

Bisimulation

The standard model has comm meaning interaction; we have comm meaning movement in some sense: atoms (mass) move from one locus (molecular species) to another, or a molecule undergoes conformational change, which is a kind of motion from one state or class or species to another. This means that bisimulation in the alternative model has to do with movement rather than interaction — which seems more fine-grained, but perhaps less natural.

Uncertainty in Biological Simulation

We also thought about a few properties biological simulation systems ought to have. One important one is a notion of uncertainty. Few of the current systems are honest about their limits — only one of the papers at BioConcur displayed error bars on any of its graphs! The systems in use all generate probable trajectories through state-space; this is, I assume, implicitly understood by the reader of the various program outputs, but it would be useful to have the uncertainty in the results reflected in the various displays that these systems produce.

Besides the uncertainties that come from stochastic modelling and Monte-Carlo-type approaches in general, all the inputs to the systems (eg. rates) are purely empirical data, with associated uncertainties, and these uncertainties are not propagated through the system. Not only do the outputs not show the variation in results due to random sampling, but the outputs do not explore the limits of the accuracy of the inputs, either.

Reagent concentration vs. Gillespie vs. Modelling reagent state

The alternative model deals well with high reagent concentrations, since a high concentration is modelled as a large number of messages outstanding on a channel. The Gillespie scheduling algorithms are O(n)-time in the number of enabled reactions, not in the concentrations of the reagents. The standard model suffers here because each molecule is modelled as a process — that is, as an enabled reaction. In a naïve implementation, this makes the scheduling complexity O(n)-time (or at best, O(log(n)) time if the Gibson-Bruck algorithm is used) in the number of molecules in the system.

However, modelling molecules as processes seems to provide one advantage that doesn’t easily carry over into the alternative model: molecule-processes can involve internal state that any interacting molecule need not be concerned with. For example, enzyme-private channels with different rates associated with them might be used to model the different reactivities of phosphorylated and unphosphorylated states of the enzyme.

Perhaps a hybrid of the two models will be workable; after all, even in the alternative model, you have the freedom of the restriction operator available for implementing complex behaviours. It seems, though, that the alternative model allows some common patterns to be expressed more simply than the standard joinless model does.

We’re still thinking about ways of keeping the molecules-as-messages approach while allowing reactions to specialise on molecule-message state or to ignore it, as appropriate.

Compartments

BioSPI 3.0 took the standard stochastic-π implementation and added BioAmbients to it (see the 2003 paper by Regev, Panina, Silverman, Cardelli and Shapiro). Any stochastic-join will probably benefit from being able to model compartments similarly.

The simple, ambientless approach suffers from some of the same problems SBML is facing with regard to recognising the similarity of species across compartments. Some way of factoring out reactions/processes/etc. common across all compartments would be useful.

Summary of my Concur '04 experience

Last week I was privileged to be able to attend some of the workshops associated with Concur 2004, a conference on Concurrency Theory. Thanks, LShift!

(Don’t forget that, blog-style, the most recent posts are at the top, so the notes should be read upward from the bottom of each page.)

The BioConcur talks really provided some food for thought. I’m looking forward to letting some of the ideas from the conference take root over the next few months, and I’ll be tracking some of the various projects with interest. It’s an exciting time to be in systems biology.

An Approach towards Stochastic Ambients

(I also caught a bit of the previous talk on semantic subtyping for π. It looks great! Also very codenameesque… anyway, to the talk at hand:)

The WWW is the application considered - probabilities of event occurrences are important there (probabilities of what, though? packet loss? server error?), as in BioAmbients.

The exponential distribution function is 1 - e^(-rt)

  • if you use the exp. dist. fn, you have a CTMC
  • and it’s memoryless

Ambients get decorated with a parameter η, a scaling factor controlling the rate of reaction inside the ambient. Actions (in, out, etc.) also get decorated with a rate, just as in stochastic-π.

[How does that work with bioambients? Would the η be like volume there? What happens if you try to stuff a huge ambient inside a tiny one (in bio)?]

There’s no implementation yet. This is still very preliminary work.

Model Checking Functional and Performability Properties of Stochastic Fluid Models

Precise model-checking of models with continuous state variables is difficult. DTMCs and CTMCs (discrete- and continuous-time Markov chains respectively) can model discrete or continuous time, but only a discrete state space, not a continuous one (cf. PEPA). Markovian Reward Models have a strongly limited capability for modelling continuous state — limitations include, for instance, that the behaviour may not depend on the continuous state variable!

Stochastic Fluid Models do better [could it be applied to the RKIP/ERK pathway problem?] at the expense of significantly more difficult analysis. Only a very small number of continuous state variables can be handled.

I missed the end — had to leave to see the stochastic-ambients talk.

Towards a unifying CSP approach for hierarchical verification of asynchronous hardware

This talk was about using CSP within the Balsa toolkit to model asynchronous (clock-free) hardware. Asynchronous hardware replaces the clock with explicit synchronisation using handshakes/request + acknowledgement signals.

Clockless architectures are more efficient, cheaper in transistors, and they give more opportunity for verification. They’re harder to design and designs tend to be error-prone, though. This work is focussed on the idea that verification ought to be able to offset the difficulties of working with asynchronous systems.

At sub-micron technologies, the gates are so small the manufacturing process can be inaccurate enough to cause significant differences in timing delays within the chip! Hence per-chip clock-rate limits and testing… :-/

They’re compiling a CSPish language into silicon! Cool!

  • syntax-directed compilation
  • doesn’t pull extra parallelism out of the code
  • so writing the parallelism in is very important
  • which motivates the choice of CSP

CSPish language -> “handshake network” -> gate-level network -> …

Balsa has until now been a simulation-only tool, with no verification tool available. This work is building one.

  • Each interface between handshake-network components has a more-or-less session type

  • Different properties are verified at different levels - some at the whole-program Balsa/CSP level, some at the handshake-net level, and some at the gate-network level

  • the top levels are synchronous models (ie. output has a continuation); the lower levels are async, however! So a new approach is used to verify async variants of CSP: “protocol-based closed-circuit testing” - which reminds me of behavioural type-system zipper checkers! (you supply the environment and make sure it goes to zero…)

  • the case study reproduced a deadlock seen in the balsa simulator, and showed positively that the fix for the problem genuinely did remove the possibility of deadlock.

  • protocol at gate level is a subset of the traces of the gate net — because they have to cope with async sends at that level (?)

  • standard CSP semantics, standard CSP model-checkers. That is the contribution - no special tools required (cf. Dill, cf. Josephs).