At present, it is a “real mode” machine, and there is no support for running virtual uxn machines under an uxn hypervisor.
It’d be neat to have a virtualizable uxn system: you could write a full operating system, with memory protection and multi-tasking, entirely in uxntal, and existing ROMs wouldn’t even know the difference! Or you could write a debugger that made good use of the native uxn core where possible. You could serialize a program, suspending it to be resumed later, or transfer it across the network to a different host, all without having to modify the base uxn emulator.
So: Can we make uxn virtualizable, so that an uxn hypervisor can manage one or more child virtual machines, each running a program presented with a seamless illusion that it’s the only software on the machine? Let’s find out!
What is virtualizability?
In 1974, Gerald J. Popek and Robert P. Goldberg wrote a seminal paper laying out a simple and robust approach to determining whether a particular machine architecture is virtualizable. Quoting from the paper’s wikipage,
There are three properties of interest when analyzing the environment created by a VMM [a Virtual Machine Monitor, a.k.a. a Hypervisor]:
Equivalence / Fidelity. A program running under the VMM should exhibit a behavior essentially identical to that demonstrated when running on an equivalent machine directly.
Resource control / Safety. The VMM must be in complete control of the virtualized resources.
Efficiency / Performance. A statistically dominant fraction of machine instructions must be executed without VMM intervention.
Is real-mode uxn virtualizable?
Uxn as specified doesn’t have a supervisor-mode/user-mode split, so we will have to adapt Popek and Goldberg’s development.
Sometimes the uxn machine takes a trap, but its behavior is defined to transition to a metamachine, not to some supervisor mode. In case of errors, the metamachine will often reactivate the machine at some error handler vector, but again, it’s in normal operation mode, not some supervisor mode. This means that there are no privileged instructions in uxn.
There are no base-and-bounds registers, and the range of pointers exactly covers accessible RAM
(64k!), so there are no out-of-bounds RAM accesses possible. However, it is possible to
underflow or overflow the stack, which falls in the spirit of Popek and Goldberg’s
memorytrap. When this happens, the metamachine takes over.
DIV are the only
instructions which trap or fault otherwise than
There is no way to change the amount of available memory, and no mode changes possible, so there are no control sensitive instructions.
While there is no memory virtualization in uxn, consideration of behavior sensitive instructions requires us to posit an address translation facility. But we can easily make sure this doesn’t fall foul of Popek and Goldberg’s first (relocation) criterion of behavior sensitivity; and as there are no modes, their second criterion fails also. So there are no behavior sensitive instructions.
Thus, because all of the (nonexistent) control- and/or behavior-sensitive instructions fall in the (empty) set of privileged instructions, we can rely on Popek and Goldberg’s theorem 1, and say that construction of a VMM for uxn must be feasible. In fact, it’s almost there as it stands.
Reaching out further, to recursive virtualizability, is in my opinion the simplest path to virtualization. Execution of uxn is deterministic, given the devices in play today! Thus, by Popek and Goldberg’s theorem 2, uxn must be able to be recursively virtualizable.
Let’s explore one possible approach to actually virtualizing uxn.
At the foundation of a virtualization tree lies a real, physical machine, with real RAM and real I/O devices. That physical base machine has a physical RAM pointer. Every virtual machine in the tree implicitly accesses the physical RAM.
Each virtual machine also has private registers, and private memory. Recall we’re considering recursive virtualization, where the program a hypervisor supervises may in turn be a hypervisor.
- 32-bit base, relative to its parent’s memory allocation
- 32-bit bound (length of accessible memory)
- 16-bit PC
- a trapCode register
- Memory, all of which MUST be disjoint from the allowed base-and-bound region of RAM
- 256 bytes of device memory
- two 255-byte stacks, plus two 1-byte stack pointers
- two 32-byte device mask bitsets and one 16-entry × 16-bits device version vector
- a 16-byte trap description area
The base and bound are new. Machines do not have access to them; they are only able to access
the registers of “child” VMs. An exception is made to allow a VM to read its own bound
register. It does this via the system device “expansion” port, using an operation
The outermost VM has a base of 0 and a bound of however large the physical RAM is.
All memory-accessing operations (including loading an instruction) take the base and bound into account. Accesses falling outside physical RAM or outside the base-and-bound cause a memory fault.
The device mask bitsets still control device effects, but instead of calling a
handler routine as a subroutine of the VM’s evaluator, the evaluator traps, yielding control to
the parent VM, which performs the requested action.
Every time a VM stops execution, its trapCode and trap description area are updated to describe the cause of the stop.
System device expansion operations
Adding the following operations is sufficient to allow writing a hypervisor in uxntal:
|getBound||#10||&boundHi $2 &boundLo $2|
The handler retrieves the 32-bit active bound on memory access, overwriting the four bytes following the operation code.
The handler expects a VM Control Block (VMCB) to exist at
|field name||offset||size (bytes)||description|
|parentLink||0||4||scratch space for control transfer|
|baseHi||4||2||high short of base register|
|baseLo||6||2||low short of base register|
|boundHi||8||2||high short of bound register|
|boundLo||10||2||low short of bound register|
|trapDescription||16||16||trap detail record|
|deiMask||32||32||device input mask bitset|
|deoMask||64||32||device output mask bitset|
|devVers||96||32||device version vector|
|workStack||256||255||work stack base|
|workStackPtr||511||1||work stack pointer|
|returnStack||512||255||return stack base|
|returnStackPtr||767||1||return stack pointer|
The operation performs a context switch to the described VM. In the following,
the VMCB pointed to by
curr the running VM’s VMCB.
The absolute base
absBase is computed from the running VM’s base register and the base
absBase plus the
next’s bound lies outside the
curr takes a memory fault. If any part of the
next VMCB overlaps with the
region addressed by
curr takes a memory fault.1
If these checks all pass,
curr’s VMCB address is placed in
parentLink (it will not be
visible to any VM; see below) and execution resumes in
Traps and faults
When a running VM takes a trap or a fault, it transfers control to its parent VM. The parent
VM’s address is retrieved from the
parentLink field of the running VMCB (
curr). If it is 0,
there is no parent, and this is the outermost VM; the pre-virtualization behavior applies.
trapDescription are set appropriately, its
field is cleared (ensuring that the VM cannot learn anything about its absolute location), and
the parent VM is resumed.
The effect of
BRK is altered: it takes a trap with
BRK and no description.
The effects of
DEO are also altered. Instead of invoking a
inside the evaluator, the machine takes a trap with
trapCode set to
IO. The suspended VM’s
PC is left pointing after the
DEO instruction; the arguments to
included in the
Memory and stack faults
When an out-of-bounds access occurs, the stack underflows, or the stack overflows, the machine
takes a fault with an appropriate
trapCode, and with
trapDescription including details of
the fault. The machine’s PC and stack are left alone so that if the machine is immediately
restarted, it will precisely repeat the fault. This could be used to e.g. provide a (small)
virtual memory, or an (unbounded) virtual stack.
I have implemented emulator support for virtualization, as sketched above,
here. I’ve also written a simple hypervisor
xun (like Xen, see?). To see
the code, check out the
virtuxn3 branch of the
https://gitlab.com/tonyg/uxn git repository.
The implementation is done in two stages:
the first stage (diffs from
virtuxn3-novirt) splits out “MMU” and IO “processors” into a “system-on-chip”,
uxnsoc.[ch], updating the various core implementations to use the slightly-different interface to memory and device IO.
the second stage (diffs from
virtuxn3) is simpler, providing an alternative compatible “system-on-chip” implementation supporting the new system extension commands and providing the necessary machinery to swap in and out of nested VMs as required.
xun, is a little
program that is currently hardcoded to load a particular ROM,
bin/xuninner.rom. It cannot yet
recursively run itself running another program, and lacks configurability. It also doesn’t do
anything about device multiplexing, like a real multi-tasking operating system would have to
do. But the basics are there, enough to show it working: it can, for example, run
Differences between direct and virtualized execution
Because the devices take virtual addresses, rather than physical addresses, a supervisor
has to copy down media such as sprites and audio samples before asking the hardware to use
them. This leads to noticeable differences in behavior: running
piano.rom, sounding a note,
and then scribbling on the waveform causes the playing note to change its timbre. When running
the same program under
xun, however, the note stays the same; the edit to the timbre doesn’t
take effect until the next note is played.
One change that could ameliorate this would be to make the various addresses used in device memory into 32-bit physical addresses (16 bit page / 16-bit offset). That way, translated addresses from nested VMs could be used without having to copy them. (A second benefit would be the ability for ROMs to use media outside the bottom 64k without having to copy them first.)
It’d be nice to adapt the devices to support 32-bit addresses for sprite and audio sample data
etc. It can be done backward-compatibly; I need to write down the approach I have in mind.
There’s also a bit of implementation work remaining in
xun to allow full recursive
virtualization; it’s, in some sense, a Trivial Matter of Programming, but the details are
actually a little more fiddly than they have to be and perhaps some changes to the device
interfaces could make things easier for hypervisors without complicating life for simple
real-mode ROMs. There’s nascent support in baseline
uxn for self-describing devices: I’d like
to see this fleshed out a bit so that nested VMs could reliably discover the features made
available to them by their hypervisor. Ultimately, it’d be amazing to build something a little
System-7 like using virtualized uxn for running programs. The nested VM interface could even be
widened to include a “fuel counter” of instructions before a “fuel exhausted” trap, allowing
All in all, uxn virtualizes very nicely. Writing a hypervisor entirely in uxntal works well with minimal extensions to the VM specification. I’m looking forward to seeing how things develop!
A corollary of this is that a nested VM’s addressable memory must always be smaller (at least by the size of one VMCB) than its parent’s addressable memory. ↩