Partial Evaluation

This afternoon I implemented a simple partial evaluator for a Scheme-like language, based on the paper “Partial Evaluator as a Compiler for Reflective Languages” (Asai, Masuhara, Matsuoka and Yonezawa, 1995). So far I’ve restricted myself to an IO-free, side-effect-free language, but by using the notion of preactions I ought to be able to add support for IO fairly easily. I’d like to explore extending the simple system I have now to cope with the Composable Memory Transactions of Harris et al. instead of supporting Scheme’s destructive updates.

The addition of pattern-matching data-destructuring will help the partial evaluator out, too, in the same way it makes dealing with primitive data easier for the Spineless Tagless G-Machine (see section 4.7). For example, using the partial evaluator to specialize the expression

(fold + 0 a)

I currently get

(letrec ((g17 (lambda (x acc)
                (if (null? x)
                  acc
                  (GOTO g17
                        (if (pair? x)
                          (PRIMcdr x)
                          (error '"Not a pair in cdr" x))
                        (+ (if (pair? x)
                             (PRIMcar x)
                             (error '"Not a pair in car" x))
                           acc))))))
  (g17 a '0))

(Ignore the strange GOTO syntax — I’m also experimenting with identifying the places where I can transform recursive procedures into iterative loops.) Note that the type test (pair? x) is performed twice. If some kind of match clause were used instead, we would see the test lifted to wrap the (GOTO g17 ...) expression, leaving us with something more like

(letrec ((g17 (lambda (x acc)
                (if (null? x)
                  acc
                  (if (pair? x)
                      (GOTO g17 (PRIMcdr x) (+ (PRIMcar x) acc))
                      (error '"Not a pair in match" x))))))
  (g17 a '0))

which, when compiled, is about as efficient as you will get with a statically compiled Scheme. Lots of the optimizations developed in the Self system apply to dynamically compiled variants, of course, potentially leading to further efficiencies.
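
For comparison, here is how fold itself might be written once such a destructuring form is available (the match syntax below is the Wright-style match that Chicken ships, but treat it as hypothetical for the evaluator); because the pair test and the car/cdr accesses live in a single clause, the specializer only needs to emit the test once:

(define (fold f acc xs)
  (match xs
    (()         acc)                        ; empty list: return the accumulator
    ((x . rest) (fold f (f x acc) rest))))  ; one clause owns the pair test, car and cdr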

Syntactic Closures vs. syntax-case

On this page about syntactic closures, we see the definition of an anaphoric-if macro that doesn’t transmit the “it” binding to the alternative clause — just to the consequent. This is pretty sophisticated. I immediately began to think about how you could do the same with syntax-case (as clearly others had before me), and it turns out to be easier, actually. The only real downside is the introduction of another binding — but of course the mythical Sufficiently Smart Compiler will optimise that away immediately anyway.
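
Roughly, the syntax-case version can look like this (just a sketch, for an R6RS-style syntax-case system; datum->syntax is the R6RS spelling, and some older systems call it datum->syntax-object):

(define-syntax aif
  (lambda (stx)
    (syntax-case stx ()
      ((k test consequent alternative)
       ;; Give `it' the lexical context of the macro use, so the caller's
       ;; consequent can see it.
       (with-syntax ((it (datum->syntax #'k 'it)))
         #'(let ((t test))                  ; the extra binding mentioned above
             (if t
                 (let ((it t)) consequent)  ; `it' is visible here only
                 alternative)))))))         ; ...and not in the alternative

;; (aif (assq 'b '((a 1) (b 2))) (cadr it) 'not-found)  =>  2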

Chicken Cairo 0.1.2 alpha

"Release early, release often," they say. In this spirit, another release of Chicken Scheme bindings for Cairo is available in our 2005 archive.

The egg file is available in releases.

2005-01-06 18:15:03 GMT Michael Bridgen <mikeb@lshift.net>      patch-8

    Summary:
      Added text and font extents
    Revision:
      chicken-cairo--dev--0.1--patch-8

    - cairo_text_extents_t and cairo_font_extents_t types
      These are not returned from a function, so we have to allocate
      them as a (GC-able) byte vector and make C think it's a spot of
      memory.
    - Functions for determining the extents of text and fonts.
    - Updated example



2004-12-04 17:52:21 GMT Tony Garnock-Jones <tonyg@lshift.net>   patch-7

    Summary:
      Adapt to chicken-sdl's recent case changes, and upcase the cairo binding too
    Revision:
      chicken-cairo--dev--0.1--patch-7

    A few constants changed to UPPER_CASE, both references to SDL
    constants, and the flags defined in cairo.scm. Also removed test-cairo
    from the Makefile - it doesn't need to be compiled now that we have
    sdl-csi.


2004-11-30 00:40:37 GMT Michael Bridgen <mikeb@squaremobius.net>        patch-6

    Summary:
      Font and text rendering
    Revision:
      chicken-cairo--dev--0.1--patch-6

    Just a start on these -- enough to see something painted, which is exciting.


2004-11-17 19:40:06 GMT Tony Garnock-Jones <tonyg@kcbbs.gen.nz> patch-5

    Summary:
      cairo.setup file, to make 'make install' (and the egg) work
    Revision:
      chicken-cairo--dev--0.1--patch-5



2004-11-15 22:33:03 GMT Michael Bridgen <mikeb@squaremobius.net>        patch-4

    Summary:
      Query functions
    Revision:
      chicken-cairo--dev--0.1--patch-4

    Might come in handy ('specially alpha)


2004-11-15 14:02:26 GMT Michael Bridgen <mikeb@squaremobius.net>        patch-3

    Summary:
      Matrix operations (untested)
    Revision:
      chicken-cairo--dev--0.1--patch-3

    Added the cairo_matrix_* functions -- untested, except for the
    compilation unit test.

    Just a bunch of typing really.


2004-11-11 00:19:12 GMT Michael Bridgen <mikeb@squaremobius.net>        patch-2

    Summary:
      Some documentation and decorations
    Revision:
      chicken-cairo--dev--0.1--patch-2

    Added some documentation regarding obscure/non-obvious things;
    divided into sets of functions with headers for navigation.

GNU/Arch archives moved

I’ve cycled our GNU/arch archives to 2005, meaning that the previous archive link is wrong. To correct:

tla register-archive http://www.eighty-twenty.org/archives/2004
tla register-archive http://www.eighty-twenty.org/archives/2005

A Web view for zowie

The aim of Fuschia was to write a clean, Python-based, blosxom-like blogging tool. As soon as I’d started though, I got grand plans and lost my way. Now I’ve turned back to it as a testing ground for some zowie ideas.

I branched Fuschia to try making a web site by projecting an RDF model along semantic relations. This is very zowie-ish, so I’m going to write about my experiments here.

The view

Here I ran into the dilemma of either having a single model and constructing views using a projection and a subject (a subject-centric view), or having a graph of views with the models distributed among them (a state-centric view).

A zooming user interface may have a locus of attention (either explicit, as in an insertion point, or guessed from the input device state, like mouse position), but it may not always have an explicit subject aside from the world. Hence it’s difficult to take the subject-centric view; you can, on the other hand, let the state of the view decide this for you, so that rendering the view becomes:

  1. start at the top-level view, and find the top-level subject
  2. project the semantic relations (to the current subject) onto spatial relations
  3. for each related subject, make an appropriate view and go to 1.
  4. composite the responses according to the spatial relations

Naturally, the state of a view along with the spatial relations in the projection will mean that some related subjects are outside the clipping area, too small, &c., so these won’t be rendered.

A Web site always has an explicit subject, given by the URL. Our algorithm is very similar to the state-centric case:

  1. make the current subject the one requested
  2. project the semantic relations to the current subject onto Web page relations
  3. for each related subject, make an appropriate view and go to 1.
  4. compose the response HTML according to the Web page relations

It looks like a Web site (in this experiment) approximates the smoothly interactive interface with discrete, subject-sized steps. The state of the view is implicit in the explicit subject, rather than the view state interpolating between subjects.
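
A toy, self-contained sketch of the Web-side algorithm (the triples, the linked-to relation and the sxml-ish output are all made up for illustration; the depth bound plays the role that clipping plays in the ZUI):

(use srfi-1)   ; Chicken's import form, for filter

(define web-relations
  '((front-page linked-to articles)
    (front-page linked-to about)
    (articles   linked-to article-1)))

;; Step 2: the Web page relations leading out of the current subject.
(define (related-subjects db subject)
  (map caddr (filter (lambda (t) (eq? (car t) subject)) db)))

;; Steps 1, 3 and 4: render the requested subject, recursing into each
;; related subject and composing the results into one nested structure.
(define (render-subject db subject depth)
  (if (zero? depth)
      subject                        ; too deep: just name it, don't expand it
      `(page ,subject
             ,@(map (lambda (s) (render-subject db s (- depth 1)))
                    (related-subjects db subject)))))

;; (render-subject web-relations 'front-page 2)
;;   => (page front-page (page articles article-1) (page about))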

The model

What do we want from the model? In the MVC sense, a model fulfils two roles: being a value cell, and being observable (letting the view know when the value changes). Since I don’t initially have a controller changing the model, I needn’t worry so much about observability, and can just hold a value. Primarily, I need to know the semantic relations between subjects (resources) and objects (values). These are naturally expressed (because it suits my purpose) as an RDF triples database. My model for a particular subject is the set of relation-arcs leading outwards; for this we really only need a reference to the triples database, the subject itself, and a way to query the database.

I started by representing some facts about (say) a part of the filesystem. We have to know what form these facts will take if we’re to reason with them later, so I use a standard set of facts about containment and the names of things. From this basic set of explicit relations and an inference procedure, I can infer other relations such as ancestry, transitive containment, and so on. We can also use these to project our filesystem facts into semantic relations (‘classified-as’), and our semantic relations into view-specific relations (‘linked-to’).
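
A toy illustration of the idea (not Fuschia's actual representation): facts as (subject predicate object) triples, with transitive containment inferred rather than stored:

(use srfi-1)   ; Chicken's import form, for filter and append-map

(define facts
  '((dir:home   contains dir:photos)
    (dir:home   contains file:notes)
    (dir:photos contains file:cat.png)
    (file:notes named-as "notes.txt")))

;; The relation-arcs leading outwards from a subject: our "model" for it.
(define (arcs-from db subject)
  (filter (lambda (t) (eq? (car t) subject)) db))

;; Objects directly contained in a subject.
(define (contents db subject)
  (map caddr
       (filter (lambda (t) (eq? (cadr t) 'contains))
               (arcs-from db subject))))

;; Transitive containment, inferred from the explicit facts.
(define (contents* db subject)
  (let ((direct (contents db subject)))
    (append direct (append-map (lambda (s) (contents* db s)) direct))))

;; (contents* facts 'dir:home)  =>  (dir:photos file:notes file:cat.png)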

Style and templates

Assumed in the above algorithms is a procedure for determining the appropriate view for a subject. Here I have a number of options, including:

Have a statically configurable map from subject type to view. This is essentially a view factory, and conceptually straightforward as far as it goes. It leaves the matter of templates alone — conceivably some views would use some kind of template, which leads to coupling between the template language and the view projection.

Use context and selector-matching to determine the view (or style) to use, in the manner of CSS or GSS stylesheets. This is really a refinement of the above; however, since we have context, we can stick to styles and simply have some styles that rule out rendering a subject. We don’t have to resort to control structures or even have templates.

Direct the view rendering through templates entirely, using selectors and control structures, in the manner of XSLT. Although this is a nice pun on XSLT, and could even be workable using XSLT, it is more or less what I’m trying to avoid. I want to extract the relations implied in such templates and keep them explicitly in one place — the projection.

Transactional Memory

I just read “Composable memory transactions” (Harris, Marlow, Peyton Jones and Herlihy) on Friday. It’s really very similar to what Henry Baker was talking about and what I had in mind for the transactionality of mutable state in ThiNG, but it’s different enough to make me consider changing ThiNG to be more like the language they discuss in the paper. The advantage is that you can build communication primitives out of transactional shared memory - see their discussion of MVars, which are essentially identical to the cells I’ve been considering for ThiNG - whereas it’s more difficult to do things the other way around.

Unrolling Recursion, the Golden Ratio, and Partial Evaluation

The little language shootout I ran a few weeks ago showed that Java had a significant edge over the other contestants. I speculated at the time that it might have been because Java was unrolling the fib loop once, inlining one of the recursive invocations, which would have removed one layer from the call tree - essentially halving the runtime. (More accurately, dividing the runtime by φ = 1.618…, since the number of calls in the tree grows roughly as φ^n, so stripping a layer off the bottom divides the count by about φ.)

I just now ran the same program unrolled once by hand, and with the unrolled version the C code has almost identical (well within the margin of error!) runtime to the Java code. Here’s the faster C code:

/* fib with the recursion unrolled once: n == 2 is handled as an extra
   base case instead of recursing, pruning a layer off the call tree. */
static unsigned long fib(int n) {
  if (n < 2) {
    return 1;            /* fib(0) = fib(1) = 1 */
  } else {
    if (n < 3) {
      return 2;          /* fib(2) = fib(1) + fib(0), precomputed */
    } else {
      return fib(n - 1) + fib(n - 2);
    }
  }
}

Essentially, it’s inlining that recurses once, allowing a self-recursive procedure to inline one layer of self-invocation. It’s a neat trick - it’s discussed in section 7.1.2.2 of Craig Chambers’s thesis on the earlier optimizing Self compiler. Self only inlines once (i.e. nonrecursively) - clearly, Java inlines twice occasionally.

Further unrolling the recursion gives commensurate speedups, as you would expect: each unrolling divides the runtime by φ, roughly, so you can easily speed up the C code by about a factor of ten compared to the completely non-unrolled version by unrolling until you return 13 constantly after an explicit test that n < 7.

All this inlining/unrolling/constant-propagation seems really closely related to partial evaluation, too… Hidehiko Masuhara’s PhD thesis (U. Tokyo, 1999) is an interesting read, applying partial evaluation to reflective interpreters to give vastly improved performance for behaviourally-reflective languages. Integrating a partial evaluation system with a Self-like dynamic optimising compiler seems like a worthwhile project - something that is likely to be necessary for the implementation of ThiNG, perhaps, given how reflective I imagine it will end up being.

Clipping in Cairo

While doing some hacking on Zowie, I noticed I was getting drastic slowdown when drawing large, clipped objects — typically when something is almost in the zooming-center and skids off to the side while very ‘close’ to the lens.

It turns out that clipping can be quite expensive in Cairo, under certain circumstances. When the clipping is complex or not aligned on pixel boundaries, Cairo resorts to using an intermediate buffer and compositing back to the surface in the context.

I have a couple of solutions on my plate:

  1. Make sure clipping paths are rectangular and pixel-aligned
  2. Don’t bother even trying to draw things that would be fully clipped

I’ve implemented the latter, just by testing whether bounding boxes overlap the clipping region. Just this makes a huge difference! Cairo seems to keep up with painting plenty of shapes (rectangles, admittedly), so long as the large ones end up mostly off-screen. It is possible that there could be large, partially clipped shapes that stay in view and spoil things (it doesn’t tend to happen in my example code at the minute); for that reason, I might need to experiment with the former solution as well. It’s a bit tricky because clipping regions can be at the bottom of a stack of transforms, so to get pixel co-ordinates I need to transform all the way up to world co-ordinates, apply the clip rectangle, then go back to where I was.
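
The culling test itself is just an axis-aligned rectangle intersection. A self-contained sketch (rectangles here are plain (x y width height) lists, nothing to do with Cairo's own types, and both boxes are assumed to be in the same co-ordinate space):

(define (rect-x r) (car r))
(define (rect-y r) (cadr r))
(define (rect-w r) (caddr r))
(define (rect-h r) (cadddr r))

;; #t when two axis-aligned rectangles overlap at all.
(define (rects-overlap? a b)
  (and (< (rect-x a) (+ (rect-x b) (rect-w b)))
       (< (rect-x b) (+ (rect-x a) (rect-w a)))
       (< (rect-y a) (+ (rect-y b) (rect-h b)))
       (< (rect-y b) (+ (rect-y a) (rect-h a)))))

;; Skip the draw entirely when the bounding box misses the clip region.
(define (draw-if-visible bounding-box clip-rect draw!)
  (if (rects-overlap? bounding-box clip-rect)
      (draw!)))

;; (rects-overlap? '(0 0 10 10) '(5 5 100 100))    =>  #t
;; (rects-overlap? '(0 0 10 10) '(50 50 100 100))  =>  #f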

(BTW I do plan to make some of this hacking available eventually — my first target is to have zooming with the mouse and an editable text widget, and both are within reach.)

Reflection and Message-Oriented Programming

My thinking about a next-generation Smalltalk-like system and language has been shifting a bit over recent weeks.

To start with, I decided that the objects in the language would be immutable: in order to replace a field value, an entirely new object would be constructed, just like records in ML. Objects would no longer have any identity, and object equivalence would be decided by recursive comparison (a la Henry Baker’s egal predicate [Baker93]). This immutability would extend even to the object’s map (similar to its class) - adding a method to an object results in a fresh object, with a fresh map.

Mutable state isn’t entirely absent, though - it’s just kept strictly separate from other parts of the system, just as it is in ML. It would be possible to construct mutable reference cells. Synchronisation, communication and state access would be merged: when you retrieve a value from a cell, the value is removed from the cell and handed to the retrieving process. If no value is present, the receiving process blocks until one is sent by another process. If a sending process tries to place a value in a cell already occupied by a value, the sending process dies with an exception. Cells are the locus of object identity in the system, again as per [Baker93].
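
An executable sketch of those cell semantics, using SRFI-18 threads purely for illustration (the real system would implement cells natively; (use srfi-18) is Chicken's import form):

(use srfi-18)

;; A cell is a vector: #(mutex condition-variable full? value)
(define (make-cell) (vector (make-mutex) (make-condition-variable) #f #f))

;; Retrieving removes the value; if the cell is empty, block until a
;; value arrives.
(define (cell-take! c)
  (mutex-lock! (vector-ref c 0))
  (let loop ()
    (if (vector-ref c 2)
        (let ((v (vector-ref c 3)))
          (vector-set! c 2 #f)
          (vector-set! c 3 #f)
          (mutex-unlock! (vector-ref c 0))
          v)
        (begin
          (mutex-unlock! (vector-ref c 0) (vector-ref c 1)) ; wait on the condvar
          (mutex-lock! (vector-ref c 0))
          (loop)))))

;; Sending into an occupied cell kills the sender with an exception;
;; sending into an empty cell wakes anyone blocked in cell-take!.
(define (cell-put! c v)
  (mutex-lock! (vector-ref c 0))
  (if (vector-ref c 2)
      (begin (mutex-unlock! (vector-ref c 0))
             (error "cell already contains a value" c))
      (begin (vector-set! c 2 #t)
             (vector-set! c 3 v)
             (condition-variable-signal! (vector-ref c 1))
             (mutex-unlock! (vector-ref c 0)))))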

Metaprogramming and reflection would be enabled via locations, which group together a set of related processes into a location tree. Locations are responsible for the semantics of message dispatch and exception handling. They’re the basic unit of reflection, too - a location can be reified, which pauses and reifies all contained processes and sublocations. A reified location can be used for debugging, for mobile code, for become:-like operations, and many other things. A user can install user code at a new sublocation, which allows refinement or replacement of the default message dispatch behaviour in the style of [Malenfant92].

Code itself in the system is a distinct entity - the instruction stream contained in a method is a different kind of thing from all of the categories discussed so far. It’s the role of the location to interpret the code stream.

The dynamic state of a computation is held at the metalevel as a process. Processes correspond to the state of a particular interpreter: the registers, the stack, the current continuation etc. They’re only accessible by reflecting on a location.

To sum up, then:

  • objects are immutable, and have no identity;
  • cells are mutable, have identity, and are the means of communication and synchronisation in the system;
  • locations are metalevel constructs that serve as interpreters for code, that specify message dispatch and exception-handling algorithms, and that are the loci of reflection in the system;
  • code is a stream of instructions intended for interpretation by locations;
  • processes are computations in the system running some code at a location, manipulating cells and constructing and transmitting objects.

Shifting from Object-Oriented to Message-Oriented

The way locations and code are laid out strongly suggests the infinite tower of reflective interpreters discussed in [Jefferson92]. This tower of interpretation, taken together with the immutability of objects and the similarity of cells to π-calculus ports, starts to make the system look more like a message-oriented system than an object-oriented one.

The object-orientation is still present since we still late-bind code to message sends, but the emphasis has changed: not only is there no longer any behaviour necessarily attached to the objects - all the behaviour is external, in the code resolved by the message-dispatch algorithm in use - but there is no longer necessarily any state associated with the objects either!

Objects in the system start to look more like messages than objects. A collection of messages is bundled up with a selector and sent to the metaobject for message dispatch, and a piece of code specialised for handling that combination of arguments is selected and invoked.

Comparing the systems along four axes (language entities; transfer of control; kind of lookup; reflective ability):

  • Smalltalk: Objects, Classes, Code, Method Contexts and Block Contexts; transfer of control by lookup/apply; single dispatch; full structural reflection, partial behavioural reflection.
  • Self: Objects, Code, Method Contexts and Block Contexts; transfer of control by lookup/apply; single dispatch; full structural reflection, partial behavioural reflection (?).
  • Slate: Objects, Code, Method Contexts and Block Contexts; transfer of control by lookup/apply; multiple dispatch; full structural reflection, partial behavioural reflection.
  • ML (SML, OCaml): Tuples, Reference Cells, Functions and Evaluation Contexts; transfer of control by apply; no reflection.
  • π-calculus: Messages, Channels and Processes; transfer of control by message-send; no reflection.
  • ThiNG: Messages, Channels, Processes/Code and Locations; transfer of control by lookup/message-send; multiple dispatch; full structural, behavioural and lexical reflection.

(Aside: I find it interesting that a lot of OO thinking seems to implicitly assume that in an OO system everything is an object when that’s clearly not the case. Not only are there other entities in the language - code, method contexts, and block contexts, for instance - but they are metalevel entities, and tend not to be first-class citizens. Their reified representations may be objects, but the entities themselves are not. If you write down expressions in an object calculus, you end up with things in the expressions that aren’t objects.)

I can’t decide whether to stick with the evaluate-every-argument-in-parallel model or not; it seems that there are three obvious things that could work:

  1. Evaluate every argument in parallel (just like the current prototype). This is very inefficient on current CPUs. It means that the system automatically exposes a lot of fine-grained concurrency, though, which is nice.

  2. Evaluate only annotated arguments in parallel (just like Slate). This gives the programmer control over how much concurrency they want in their program. Not so much fine-grained concurrency is exposed, but on the other hand the code generated could be quite efficient. (A small sketch of this option follows the list.)

  3. Tell the programmer to expect that every argument will be evaluated in parallel, but secretly evaluate most of them in serial. Some finessing will be required to avoid deadlocks caused by overeager serialization of intercommunicating branches; one rough guideline could be to serialize provably noncommunicating parallel branches only up to the end of inlining. Once a call proceeds with a real out-of-line call frame, act as for the every-argument-in-parallel case. (I don’t know how to prove noncommunication, yet, either. I haven’t really thought about it yet.)
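
To give a flavour of option 2, here is what an explicit annotation could look like if it were bolted onto Scheme with SRFI-18 threads; the parallel form and its name are made up:

(use srfi-18)

;; (parallel e1 e2 ...) evaluates each expression on its own thread and
;; returns the list of results; anything not wrapped stays sequential.
(define-syntax parallel
  (syntax-rules ()
    ((_ e ...)
     (map thread-join!
          (list (thread-start! (make-thread (lambda () e))) ...)))))

;; e.g. both recursive calls run concurrently, everything else in serial:
;; (apply + (parallel (fib 30) (fib 31)))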

This concurrency business is looking more and more like the Next Big Thing: here’s an interesting article spelling out the trends and the coming end of the clock-speed-increase “free lunch”. (That link via LTU).

Message-Oriented Programming

It turns out that the term “Message-Oriented Programming” isn’t new - there’s an existing body of work using the term, sometimes in the way I’d like to use it, sometimes in a related but different sense:

References

Chicken SDL v0.3p2

A new release of chicken-sdl is available from our arch archive. You can download the egg file here.

Interesting changes include:

  • Case changes: constant definitions are now in uppercase, matching the C API.
  • The SDL egg now supports SDL_net networking.
  • The sdl-csi program is now installed in the bin directory as part of the normal installation of the egg.
  • Added the missing sdl-quit-sub-system parameter; added sdl-set-color-key! (thanks to Felix Winkelmann).
  • Added an sdl-pixel-format-bytes-per-pixel accessor (thanks to Sunnan).
  • Added getter/setter for SDL errors.
  • The sdl-rect-nil global has been replaced by #f.
  • The problem on MacOS X (and probably also Windows) regarding sdl-init has been better documented (in the README).