eighty-twenty news feed for tag "systems-biology"2024-01-24T14:37:46+00:00http://eighty-twenty.org/tag/systems-biology/index.atomtonygmikebStoJ gets a home of its own2005-04-14T10:04:32+00:00http://eighty-twenty.org/2005/04/14/stojtonyg<p>We’ve just set up a <a href="http://www.lshift.net/stoj.html">page for
StoJ at <tt>lshift.net</tt></a>, subsuming the <a href="/tech/systems-biology/join-20040904.html">previous</a>
<a href="/tech/systems-biology/stoj-20040914.html">two</a>
stories about StoJ and providing a link to a running implementation
that can be tried out over the web.</p>
StoJ, an implementation of Stochastic Join2004-09-14T23:02:37+00:00http://eighty-twenty.org/2004/09/15/stojtonyg<p>Well, since I <a href="/tech/systems-biology/join-20040904.html">wrote</a>
about some of the issues in stochastic-π based biological modelling,
I’ve had a little spare time, a little inspiration, and a supportive
workplace environment (thanks again, <a href="http://www.lshift.net">LShift</a>!), and I’ve implemented StoJ,
a polyadic, asynchronous stochastic π calculus with input join and no
summation. Rates are associated with a join, not with a channel. I
implemented Gillespie’s <a href="http://www.impg.prd.fr/~IHP/Kierzek/stoch_kin.ppt">“first
reaction” method</a> (it seemed simplest — and despite its
inefficiency it’s performing reasonably well, although I suspect I
have Moore’s “law” to thank for that more than anything else).</p>
<p>I used <a href="http://www.ocaml.org/">OCaml</a> as the
implementation language — the parsing tools are quite smooth and
good to work with — and <a href="http://www.gnuplot.info/">gnuplot</a> to graph the output of
each simulation run. This is my first OCaml project, and I’m finding
it quite pleasant despite the syntax.</p>
<p>Here’s an example model in StoJ. I duplicated the rates and the
overall design of the model from the <tt>Clock.zip</tt> code archive
at the <a href="http://www.wisdom.weizmann.ac.il/~biospi/index_main.html">BioSPI</a>
website. This model is discussed in Aviv Regev’s paper “Representation
and simulation of molecular pathways in the stochastic π-calculus.”
The implementation of the model below gives the same output graphs as
Regev’s model gives, which is a nice confirmation of the model and of
the StoJ system itself.</p>
<pre class="code">
// Circadian clock model
// Based on Regev et al
new
DNA_a, DNA_r,
!MRNA_a, !MRNA_r,
!protein_a, !protein_r,
DNA_a_promoted, DNA_r_promoted,
!complex_ar . (
// Start state.
DNA_a<> | DNA_r<>
// Basal rate transcription.
| rec L. DNA_a() ->[4] (DNA_a<> | MRNA_a<> | L)
| rec L. DNA_r() ->[0.001] (DNA_r<> | MRNA_r<> | L)
// MRNA degradation.
| rec L. MRNA_a() ->[1.0] (L)
| rec L. MRNA_r() ->[0.02] (L)
// Translation.
| rec L. MRNA_a() ->[1.0] (MRNA_a<> | protein_a<> | L)
| rec L. MRNA_r() ->[0.1] (MRNA_r<> | protein_r<> | L)
// Protein degradation.
| rec L. protein_a() ->[0.1] (L)
| rec L. protein_r() ->[0.01] (L)
// A/R Complex formation.
| rec L. protein_a() & protein_r() ->[100.0] (complex_ar<> | L)
// A/R Complex degradation.
| rec L. complex_ar() ->[0.1] (protein_r<> | L)
| rec L. complex_ar() ->[0.01] (protein_a<> | L)
// Activator binding/unbinding to A promoter, and enhanced transcription
| rec L. protein_a() & DNA_a() ->[10] (DNA_a_promoted<> | L)
| rec L. DNA_a_promoted() ->[10] (protein_a<> | DNA_a<> | L)
| rec L. DNA_a_promoted() ->[40] (DNA_a_promoted<> | MRNA_a<> | L)
// Activator binding/unbinding to R promoter, and enhanced transcription
| rec L. protein_a() & DNA_r() ->[10] (DNA_r_promoted<> | L)
| rec L. DNA_r_promoted() ->[100] (protein_a<> | DNA_r<> | L)
| rec L. DNA_r_promoted() ->[2] (DNA_r_promoted<> | MRNA_r<> | L)
)
</pre>
<p>Here are a couple of graphs produced by the simulator from the
program above — first a graph of reagent concentrations varying
with time, and then a graph of Repressor (R) molecule concentration
against Activator (A) molecule concentration, showing the bistable
nature of the whole system. If you take a look at the graphs in
Regev’s paper, you’ll see that the spikes in reagent concentration
occur in the same places with the same frequency in our model as in
Regev’s.</p>
<p><img alt="Concentrations against time" src="/images/tech/systems-biology/stoj-20040914/vstime.png" /></p>
<p><img alt="R vs A" src="/images/tech/systems-biology/stoj-20040914/hysteresis.png" /></p>
<p>I also implemented a simple Java applet/application that uses a
(human-assisted) mass-spring layout to draw a graph of the (statically
apparent) dataflow in the StoJ program it takes as input. It still
needs a bit of refining — the graphs/programs tend to be quite
densely connected in places, and not only is mass-spring layout poor
at densely-connected graphs, especially with cycles, but it doesn’t
identify any potential clustering in the graph either. I get the
feeling clustering (subgraphs) will be important for visualising
dataflow in larger models.</p>
<p>Here’s a simple chemical equilibrium model in StoJ:</p>
<pre class="code">
// A simple H + Cl <--> HCl model,
// similar to that in Andrew Phillips' slides
// with numbers taken from his slides.
// See http://www.doc.ic.ac.uk/~anp/spim/Slides.pdf
// page 16.
new !h, !cl, !hcl . (
rec ASSOC. h() & cl() ->[100] (hcl<> | ASSOC)
| rec DISSOC. hcl() ->[10] (h<> | cl<> | DISSOC)
| h<> * 100 | cl<> * 100
)
</pre>
<p>And here’s the display generated by the Java program visualisation
application:</p>
<p><img alt="Simple chemical equilibrium visualisation" src="/images/tech/systems-biology/stoj-20040914/StojVis-cropped.png" /></p>
<p>As you can imagine, for larger models the display gets cluttered
quite quickly. A note on implementation though - I found Java2D to be
surprisingly pleasant to work with. The built-in support for
double-buffering of <tt>JComponent</tt>s along with the various
classes of <tt>Shape</tt> helped me quickly develop the application.</p>
Stochastic Join?2004-09-14T21:49:10+00:00http://eighty-twenty.org/2004/09/14/jointonyg<p><tt>mikeb</tt> and I were discussing the papers given at BioConcur the
other day, and we were puzzling out the differences between the
standard way of modelling biochemical and biological reactions
(including genetic interference reactions) and our “alternative”
modelling style. I suppose the “alternative” style must have been
tried before, and that it must display some obvious weakness we
haven’t spotted yet.</p>
<p>Here’s a comparison of the two styles:</p>
<table class="columnar">
<tr>
<th>Biological Systems Element</th>
<th>Standard Model</th>
<th>"Alternative" Model</th>
</tr>
<tr><td>Molecule</td><td>Process</td><td>Message</td></tr>
<tr><td>Molecule species</td><td>—</td><td>Channel</td></tr>
<tr><td>Interaction Capability</td><td>Channel</td><td>Process</td></tr>
<tr><td>Interaction</td><td>Communication</td><td>Movement (and computation)</td></tr>
<tr>
<td>Modification</td>
<td>State and/or channel change</td>
<td>State and/or channel change</td>
</tr>
<tr>
<td>Reaction Rate</td>
<td>attached to channel</td>
<td>attached to prefix</td>
</tr>
</table>
<h5 id="join-and-rates">Join and Rates</h5>
<p>We’ve been thinking about using a stochastic π-variant with
a <em>join</em> operator. Having join available allows the expression
of state machines to be much more direct than in a joinless stochastic
π. We started modelling the lambda-switch to fit the model from <a href="/tech/concur2004/bioconcur/lambda-switch.html">this</a>
BioConcur paper, and saw that it seems more natural to attach the
reaction rate to each individual join rather than to each channel as
BioSPI does.</p>
<h5 id="bisimulation">Bisimulation</h5>
<p>The standard model has comm meaning interaction; we have comm
meaning <em>movement</em> in some sense: atoms (mass) move from one
locus (molecular species) to another, or a molecule undergoes
conformational change, which is a kind of motion from one state or
class or species to another. This means that bisimulation in the
alternative model has to do with movement rather than interaction
— which seems more fine-grained, but perhaps less natural.</p>
<h5 id="uncertainty-in-biological-simulation">Uncertainty in Biological Simulation</h5>
<p>We also thought about a few properties biological simulation
systems ought to have. One important one is a notion
of <em>uncertainty</em>. Few of the current systems are honest about
their limits — only one of the papers at BioConcur displayed
error bars on any of its graphs! The systems in use all generate
probable trajectories through state-space, and this is I assume
implicitly to be understood by the reader of the various program
outputs, but it would be useful to have the uncertainty in the results
reflected in the various displays that these systems produce.</p>
<p>Besides the uncertainties that come from stochastic modelling and
Monte-Carlo-type approaches in general, all the inputs to the systems
(eg. rates) are purely empirical data, with associated uncertainties,
and these uncertainties are not propagated through the system. Not
only do the outputs not show the variation in results due to random
sampling, but the outputs do not explore the limits of the accuracy of
the inputs, either.</p>
<h5 id="reagent-concentration-vs-gillespie-vs-modelling-reagent-state">Reagent concentration vs. Gillespie vs. Modelling reagent state</h5>
<p>The alternative model deals well with high reagent concentrations,
since a high concentration is modelled as a large number of messages
outstanding on a channel. The Gillespie scheduling algorithms are
<i>O(n)</i>-time in the number of enabled reactions, not in the
concentrations of the reagents. The standard model suffers here
because each molecule is modelled as a process — that is, as an
enabled reaction. In a naïve implementation, this makes the scheduling
complexity <i>O(n)</i>-time (or at best, <i>O(log(n))</i> time if the
Gibson-Bruck algorithm is used) in the number of molecules in the
system.</p>
<p>However, modelling molecules as processes seems to provide one
advantage that doesn’t easily carry over into the alternative model:
molecule-processes can involve internal state that any interacting
molecule need not be concerned with. For example, enzyme-private
channels with different rates associated with them might be used to
model the different reactivities of phosphorylated and
unphosphorylated states of the enzyme.</p>
<p>Perhaps a hybrid of the two models will be workable; after all,
even in the alternative model, you have the freedom of the restriction
operator available for implementing complex behaviours. The
alternative model allows some common patterns to be expressed more
simply than the standard joinless model though, it seems.</p>
<p>We’re still thinking about ways of keeping the
molecules-as-messages approach while allowing reactions to specialise
on molecule-message state or to ignore it, as appropriate.</p>
<h5 id="compartments">Compartments</h5>
<p>BioSPI 3.0 took the standard stochastic-π implementation and added
BioAmbients to it (see the 2003 paper by Regev, Panina, Silverman,
Cardelli and Shapiro). Any stochastic-join will probably benefit from
being able to model compartments similarly.</p>
<p>The simple, ambientless approach suffers some of the same problems
as SBML is facing with regard to recognising the similarity of species
across compartments. Some way of factoring out reactions/processes/etc
common across all compartments should be useful.</p>