eighty-twenty news
2024-01-24T14:37:46+00:00
http://eighty-twenty.org
tonyg
mikeb
More pitfalls regarding JavaScript's non-monadic promises
2024-01-24T14:09:24+00:00
http://eighty-twenty.org/2024/01/24/more-pitfalls-of-js-promises
tonyg
<p>As is
<a href="https://github.com/promises-aplus/promises-spec/issues/94#issuecomment-16176966">well-known</a>,
JavaScript’s <code class="language-plaintext highlighter-rouge">Promise</code> is not a monad. It will happily treat <code class="language-plaintext highlighter-rouge">Promise<Promise<T>></code> as if it was
<code class="language-plaintext highlighter-rouge">Promise<T></code>:</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="o">></span> <span class="p">[</span><span class="mi">123</span><span class="p">,</span> <span class="k">await</span> <span class="nb">Promise</span><span class="p">.</span><span class="nx">resolve</span><span class="p">(</span><span class="mi">123</span><span class="p">),</span> <span class="k">await</span> <span class="nb">Promise</span><span class="p">.</span><span class="nx">resolve</span><span class="p">(</span><span class="nb">Promise</span><span class="p">.</span><span class="nx">resolve</span><span class="p">(</span><span class="mi">123</span><span class="p">))]</span>
<span class="p">[</span> <span class="mi">123</span><span class="p">,</span> <span class="mi">123</span><span class="p">,</span> <span class="mi">123</span> <span class="p">]</span></code></pre></figure>
<p>This can bite you in unexpected ways. Imagine you have a CSP-like <code class="language-plaintext highlighter-rouge">Channel<T></code> class for
sending <code class="language-plaintext highlighter-rouge">T</code>s back and forth. <code class="language-plaintext highlighter-rouge">Channel<T></code> might have a method like this:</p>
<figure class="highlight"><pre><code class="language-typescript" data-lang="typescript"><span class="k">async</span> <span class="nx">pop</span><span class="p">():</span> <span class="nb">Promise</span><span class="o"><</span><span class="nx">T</span> <span class="o">|</span> <span class="kc">undefined</span><span class="o">></span> <span class="p">{</span> <span class="p">...</span> <span class="p">}</span></code></pre></figure>
<p>There’s an obvious problem here: what if <code class="language-plaintext highlighter-rouge">undefined</code> ∈ <code class="language-plaintext highlighter-rouge">T</code>? So you make sure to note, in the
comment attached to <code class="language-plaintext highlighter-rouge">Channel<T></code>, that <code class="language-plaintext highlighter-rouge">T</code> is not allowed to include <code class="language-plaintext highlighter-rouge">undefined</code>.</p>
<p>But the less obvious problem is that <code class="language-plaintext highlighter-rouge">T</code> is not allowed to contain <code class="language-plaintext highlighter-rouge">Promise<undefined></code> either,
even though in other contexts a promise of undefined cannot be confused with undefined:</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="o">></span> <span class="k">typeof</span> <span class="kc">undefined</span>
<span class="dl">'</span><span class="s1">undefined</span><span class="dl">'</span>
<span class="o">></span> <span class="k">typeof</span> <span class="nb">Promise</span><span class="p">.</span><span class="nx">resolve</span><span class="p">(</span><span class="kc">undefined</span><span class="p">)</span>
<span class="dl">'</span><span class="s1">object</span><span class="dl">'</span></code></pre></figure>
<p>To see why this is a problem, instantiate <code class="language-plaintext highlighter-rouge">T</code> with <code class="language-plaintext highlighter-rouge">Promise<undefined></code>, and look at the type
of <code class="language-plaintext highlighter-rouge">pop()</code>:</p>
<figure class="highlight"><pre><code class="language-typescript" data-lang="typescript"><span class="nb">Promise</span><span class="o"><</span><span class="nb">Promise</span><span class="o"><</span><span class="kc">undefined</span><span class="o">></span> <span class="o">|</span> <span class="kc">undefined</span><span class="o">></span></code></pre></figure>
<p>Because JavaScript collapses promises-of-promises to just promises, this is equivalent to just</p>
<figure class="highlight"><pre><code class="language-typescript" data-lang="typescript"><span class="nb">Promise</span><span class="o"><</span><span class="kc">undefined</span><span class="o">></span></code></pre></figure>
<p>and you’ve lost the ability to tell whether <code class="language-plaintext highlighter-rouge">pop()</code> yielded a <code class="language-plaintext highlighter-rouge">T</code> or an <code class="language-plaintext highlighter-rouge">undefined</code>.</p>
<p>TypeScript does not warn you about this, incidentally. (Ask me how I know.)</p>
<h2 id="workaround">Workaround</h2>
<p>Instead of accepting this loss of structure and adding another caveat to <code class="language-plaintext highlighter-rouge">Channel<T></code> to work
around JavaScript’s broken design—“<code class="language-plaintext highlighter-rouge">T</code> must not include either <code class="language-plaintext highlighter-rouge">undefined</code> or
<code class="language-plaintext highlighter-rouge">Promise<undefined></code> or <code class="language-plaintext highlighter-rouge">Promise<Promise<undefined>></code> etc.”—I decided to change the signature
of <code class="language-plaintext highlighter-rouge">pop()</code>:</p>
<figure class="highlight"><pre><code class="language-typescript" data-lang="typescript"><span class="k">async</span> <span class="nx">pop</span><span class="p">():</span> <span class="nb">Promise</span><span class="o"><</span><span class="nx">Maybe</span><span class="o"><</span><span class="nx">T</span><span class="o">>></span> <span class="p">{</span> <span class="p">...</span> <span class="p">}</span>
<span class="kd">type</span> <span class="nx">Maybe</span><span class="o"><</span><span class="nx">T</span><span class="o">></span> <span class="o">=</span> <span class="nx">Just</span><span class="o"><</span><span class="nx">T</span><span class="o">></span> <span class="o">|</span> <span class="kc">undefined</span><span class="p">;</span>
<span class="kd">type</span> <span class="nx">Just</span><span class="o"><</span><span class="nx">T</span><span class="o">></span> <span class="o">=</span> <span class="p">{</span> <span class="na">item</span><span class="p">:</span> <span class="nx">T</span> <span class="p">};</span></code></pre></figure>
<p>Now both <code class="language-plaintext highlighter-rouge">Channel<undefined></code> and <code class="language-plaintext highlighter-rouge">Channel<Promise<undefined>></code> are sensible and work as
expected. No more exceptions regarding what <code class="language-plaintext highlighter-rouge">T</code>s a <code class="language-plaintext highlighter-rouge">Channel</code> may carry.</p>
<p>When <code class="language-plaintext highlighter-rouge">T</code> is <code class="language-plaintext highlighter-rouge">Promise<undefined></code>, in particular, we see that the type of <code class="language-plaintext highlighter-rouge">pop()</code> is</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Promise<{ item: Promise<undefined> } | undefined>
</code></pre></div></div>
<p>Because the <code class="language-plaintext highlighter-rouge">Promise</code>s aren’t immediately nested, JavaScript won’t erase our structure.</p>
<p>(Ironically, we’ve introduced a monad (<code class="language-plaintext highlighter-rouge">Maybe<T></code>) to fix the bad behaviour of something that
<em>should</em> have been a monad…)</p>
Python3 is removing crypt.crypt and not replacing it with anything ¯\_(ツ)_/¯
2024-01-13T15:53:14+00:00
http://eighty-twenty.org/2024/01/13/python-crypt-shacrypt
tonyg
<p>Python 3.13 will, for inscrutable reasons, remove the <code class="language-plaintext highlighter-rouge">crypt</code> module from
the standard library. The <a href="https://peps.python.org/pep-0594/#crypt">excuses given in PEP
0594</a> boil down to “here are some
good reasons why new code shouldn’t use this module.” What about existing
code? Ah well.</p>
<p>So anyway, for those of us who need some way of generating <code class="language-plaintext highlighter-rouge">$6$</code> SHAcrypt
SHA-512 shadow-password-database entries from Python, stick the following
module into your codebase (you can also download it, <a href="/files/shacrypt512.py">shacrypt512.py</a>) and replace code like</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">crypt</span><span class="p">.</span><span class="n">crypt</span><span class="p">(</span><span class="n">password</span><span class="p">,</span> <span class="n">salt</span><span class="o">=</span><span class="n">crypt</span><span class="p">.</span><span class="n">METHOD_SHA512</span><span class="p">)</span>
<span class="n">crypt</span><span class="p">.</span><span class="n">crypt</span><span class="p">(</span><span class="n">password</span><span class="p">,</span> <span class="s">'$6$salt$...'</span><span class="p">)</span>
<span class="n">crypt</span><span class="p">.</span><span class="n">crypt</span><span class="p">(</span><span class="n">password</span><span class="p">,</span> <span class="s">'$6$salt$...'</span><span class="p">)</span> <span class="o">==</span> <span class="s">'$6$salt$...'</span></code></pre></figure>
<p>with</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">shacrypt512</span><span class="p">.</span><span class="n">shacrypt</span><span class="p">(</span><span class="n">password</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="s">'utf-8'</span><span class="p">))</span>
<span class="n">shacrypt512</span><span class="p">.</span><span class="n">shacrypt</span><span class="p">(</span><span class="n">password</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="s">'utf-8'</span><span class="p">),</span> <span class="sa">b</span><span class="s">'salt'</span><span class="p">)</span>
<span class="n">shacrypt512</span><span class="p">.</span><span class="n">password_ok</span><span class="p">(</span><span class="n">password</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="s">'utf-8'</span><span class="p">),</span> <span class="s">'$6$salt$...'</span><span class="p">)</span></code></pre></figure>
<p>respectively.</p>
<p>Without further ado, here’s <a href="/files/shacrypt512.py">shacrypt512.py</a>:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="c1"># SHAcrypt using SHA-512, after https://akkadia.org/drepper/SHA-crypt.txt.
#
# Copyright © 2024 Tony Garnock-Jones.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
</span>
<span class="kn">import</span> <span class="nn">hashlib</span>
<span class="kn">import</span> <span class="nn">secrets</span>
<span class="n">alphabet</span> <span class="o">=</span> \
<span class="p">[</span><span class="nb">ord</span><span class="p">(</span><span class="n">c</span><span class="p">)</span> <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="s">'./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'</span><span class="p">]</span>
<span class="n">permutation</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">21</span><span class="p">,</span> <span class="mi">42</span><span class="p">],</span> <span class="p">[</span><span class="mi">22</span><span class="p">,</span> <span class="mi">43</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span> <span class="p">[</span><span class="mi">44</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">23</span><span class="p">],</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">24</span><span class="p">,</span> <span class="mi">45</span><span class="p">],</span>
<span class="p">[</span><span class="mi">25</span><span class="p">,</span> <span class="mi">46</span><span class="p">,</span> <span class="mi">4</span><span class="p">],</span> <span class="p">[</span><span class="mi">47</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">26</span><span class="p">],</span> <span class="p">[</span><span class="mi">6</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">48</span><span class="p">],</span> <span class="p">[</span><span class="mi">28</span><span class="p">,</span> <span class="mi">49</span><span class="p">,</span> <span class="mi">7</span><span class="p">],</span>
<span class="p">[</span><span class="mi">50</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">29</span><span class="p">],</span> <span class="p">[</span><span class="mi">9</span><span class="p">,</span> <span class="mi">30</span><span class="p">,</span> <span class="mi">51</span><span class="p">],</span> <span class="p">[</span><span class="mi">31</span><span class="p">,</span> <span class="mi">52</span><span class="p">,</span> <span class="mi">10</span><span class="p">],</span> <span class="p">[</span><span class="mi">53</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="mi">32</span><span class="p">],</span>
<span class="p">[</span><span class="mi">12</span><span class="p">,</span> <span class="mi">33</span><span class="p">,</span> <span class="mi">54</span><span class="p">],</span> <span class="p">[</span><span class="mi">34</span><span class="p">,</span> <span class="mi">55</span><span class="p">,</span> <span class="mi">13</span><span class="p">],</span> <span class="p">[</span><span class="mi">56</span><span class="p">,</span> <span class="mi">14</span><span class="p">,</span> <span class="mi">35</span><span class="p">],</span> <span class="p">[</span><span class="mi">15</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">57</span><span class="p">],</span>
<span class="p">[</span><span class="mi">37</span><span class="p">,</span> <span class="mi">58</span><span class="p">,</span> <span class="mi">16</span><span class="p">],</span> <span class="p">[</span><span class="mi">59</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">38</span><span class="p">],</span> <span class="p">[</span><span class="mi">18</span><span class="p">,</span> <span class="mi">39</span><span class="p">,</span> <span class="mi">60</span><span class="p">],</span> <span class="p">[</span><span class="mi">40</span><span class="p">,</span> <span class="mi">61</span><span class="p">,</span> <span class="mi">19</span><span class="p">],</span>
<span class="p">[</span><span class="mi">62</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">41</span><span class="p">],</span> <span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">63</span><span class="p">],</span>
<span class="p">]</span>
<span class="k">def</span> <span class="nf">encode</span><span class="p">(</span><span class="n">bs64</span><span class="p">):</span>
<span class="n">result</span> <span class="o">=</span> <span class="nb">bytearray</span><span class="p">(</span><span class="mi">4</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">permutation</span><span class="p">))</span>
<span class="n">i</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">for</span> <span class="n">group</span> <span class="ow">in</span> <span class="n">permutation</span><span class="p">:</span>
<span class="n">g</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">j</span><span class="p">:</span> <span class="n">bs64</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="k">if</span> <span class="n">j</span> <span class="o">!=</span> <span class="o">-</span><span class="mi">1</span> <span class="k">else</span> <span class="mi">0</span>
<span class="n">bits</span> <span class="o">=</span> <span class="n">g</span><span class="p">(</span><span class="n">group</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span> <span class="o"><<</span> <span class="mi">16</span> <span class="o">|</span> <span class="n">g</span><span class="p">(</span><span class="n">group</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span> <span class="o"><<</span> <span class="mi">8</span> <span class="o">|</span> <span class="n">g</span><span class="p">(</span><span class="n">group</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span>
<span class="n">result</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">alphabet</span><span class="p">[</span><span class="n">bits</span> <span class="o">&</span> <span class="mi">63</span><span class="p">]</span>
<span class="n">result</span><span class="p">[</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">alphabet</span><span class="p">[(</span><span class="n">bits</span> <span class="o">>></span> <span class="mi">6</span><span class="p">)</span> <span class="o">&</span> <span class="mi">63</span><span class="p">]</span>
<span class="n">result</span><span class="p">[</span><span class="n">i</span><span class="o">+</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="n">alphabet</span><span class="p">[(</span><span class="n">bits</span> <span class="o">>></span> <span class="mi">12</span><span class="p">)</span> <span class="o">&</span> <span class="mi">63</span><span class="p">]</span>
<span class="n">result</span><span class="p">[</span><span class="n">i</span><span class="o">+</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">alphabet</span><span class="p">[(</span><span class="n">bits</span> <span class="o">>></span> <span class="mi">18</span><span class="p">)</span> <span class="o">&</span> <span class="mi">63</span><span class="p">]</span>
<span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">4</span>
<span class="k">return</span> <span class="nb">bytes</span><span class="p">(</span><span class="n">result</span><span class="p">).</span><span class="n">decode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">)[:</span><span class="o">-</span><span class="mi">2</span><span class="p">]</span>
<span class="k">def</span> <span class="nf">repeats_of</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">bs</span><span class="p">):</span> <span class="k">return</span> <span class="n">bs</span> <span class="o">*</span> <span class="nb">int</span><span class="p">(</span><span class="n">n</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">bs</span><span class="p">))</span> <span class="o">+</span> <span class="n">bs</span><span class="p">[:</span><span class="n">n</span> <span class="o">%</span> <span class="nb">len</span><span class="p">(</span><span class="n">bs</span><span class="p">)]</span>
<span class="k">def</span> <span class="nf">digest</span><span class="p">(</span><span class="n">bs</span><span class="p">):</span> <span class="k">return</span> <span class="n">hashlib</span><span class="p">.</span><span class="n">sha512</span><span class="p">(</span><span class="n">bs</span><span class="p">).</span><span class="n">digest</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">shacrypt</span><span class="p">(</span><span class="n">password</span><span class="p">,</span> <span class="n">salt</span> <span class="o">=</span> <span class="bp">None</span><span class="p">,</span> <span class="n">rounds</span> <span class="o">=</span> <span class="mi">5000</span><span class="p">):</span>
<span class="k">if</span> <span class="n">salt</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span> <span class="n">salt</span> <span class="o">=</span> <span class="n">encode</span><span class="p">(</span><span class="n">secrets</span><span class="p">.</span><span class="n">token_bytes</span><span class="p">(</span><span class="mi">64</span><span class="p">))[:</span><span class="mi">16</span><span class="p">].</span><span class="n">encode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">)</span>
<span class="n">salt</span> <span class="o">=</span> <span class="n">salt</span><span class="p">[:</span><span class="mi">16</span><span class="p">]</span>
<span class="n">B</span> <span class="o">=</span> <span class="n">digest</span><span class="p">(</span><span class="n">password</span> <span class="o">+</span> <span class="n">salt</span> <span class="o">+</span> <span class="n">password</span><span class="p">)</span>
<span class="n">Ainput</span> <span class="o">=</span> <span class="n">password</span> <span class="o">+</span> <span class="n">salt</span> <span class="o">+</span> <span class="n">repeats_of</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">password</span><span class="p">),</span> <span class="n">B</span><span class="p">)</span>
<span class="n">v</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">password</span><span class="p">)</span>
<span class="k">while</span> <span class="n">v</span> <span class="o">></span> <span class="mi">0</span><span class="p">:</span>
<span class="n">Ainput</span> <span class="o">=</span> <span class="n">Ainput</span> <span class="o">+</span> <span class="p">(</span><span class="n">B</span> <span class="k">if</span> <span class="n">v</span> <span class="o">&</span> <span class="mi">1</span> <span class="k">else</span> <span class="n">password</span><span class="p">)</span>
<span class="n">v</span> <span class="o">=</span> <span class="n">v</span> <span class="o">>></span> <span class="mi">1</span>
<span class="n">A</span> <span class="o">=</span> <span class="n">digest</span><span class="p">(</span><span class="n">Ainput</span><span class="p">)</span>
<span class="n">DP</span> <span class="o">=</span> <span class="n">digest</span><span class="p">(</span><span class="n">password</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">password</span><span class="p">))</span>
<span class="n">P</span> <span class="o">=</span> <span class="n">repeats_of</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">password</span><span class="p">),</span> <span class="n">DP</span><span class="p">)</span>
<span class="n">DS</span> <span class="o">=</span> <span class="n">digest</span><span class="p">(</span><span class="n">salt</span> <span class="o">*</span> <span class="p">(</span><span class="mi">16</span><span class="o">+</span><span class="n">A</span><span class="p">[</span><span class="mi">0</span><span class="p">]))</span>
<span class="n">S</span> <span class="o">=</span> <span class="n">repeats_of</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">salt</span><span class="p">),</span> <span class="n">DS</span><span class="p">)</span>
<span class="n">C</span> <span class="o">=</span> <span class="n">A</span>
<span class="k">for</span> <span class="nb">round</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">rounds</span><span class="p">):</span>
<span class="n">Cinput</span> <span class="o">=</span> <span class="sa">b</span><span class="s">''</span>
<span class="n">Cinput</span> <span class="o">=</span> <span class="n">Cinput</span> <span class="o">+</span> <span class="p">(</span><span class="n">P</span> <span class="k">if</span> <span class="nb">round</span> <span class="o">&</span> <span class="mi">1</span> <span class="k">else</span> <span class="n">C</span><span class="p">)</span>
<span class="k">if</span> <span class="nb">round</span> <span class="o">%</span> <span class="mi">3</span><span class="p">:</span> <span class="n">Cinput</span> <span class="o">=</span> <span class="n">Cinput</span> <span class="o">+</span> <span class="n">S</span>
<span class="k">if</span> <span class="nb">round</span> <span class="o">%</span> <span class="mi">7</span><span class="p">:</span> <span class="n">Cinput</span> <span class="o">=</span> <span class="n">Cinput</span> <span class="o">+</span> <span class="n">P</span>
<span class="n">Cinput</span> <span class="o">=</span> <span class="n">Cinput</span> <span class="o">+</span> <span class="p">(</span><span class="n">C</span> <span class="k">if</span> <span class="nb">round</span> <span class="o">&</span> <span class="mi">1</span> <span class="k">else</span> <span class="n">P</span><span class="p">)</span>
<span class="n">C</span> <span class="o">=</span> <span class="n">digest</span><span class="p">(</span><span class="n">Cinput</span><span class="p">)</span>
<span class="k">if</span> <span class="n">rounds</span> <span class="o">==</span> <span class="mi">5000</span><span class="p">:</span>
<span class="k">return</span> <span class="s">'$6$'</span> <span class="o">+</span> <span class="n">salt</span><span class="p">.</span><span class="n">decode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">)</span> <span class="o">+</span> <span class="s">'$'</span> <span class="o">+</span> <span class="n">encode</span><span class="p">(</span><span class="n">C</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">return</span> <span class="s">'$6$rounds='</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">rounds</span><span class="p">)</span> <span class="o">+</span> <span class="s">'$'</span> <span class="o">+</span> <span class="n">salt</span><span class="p">.</span><span class="n">decode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">)</span> <span class="o">+</span> <span class="s">'$'</span> <span class="o">+</span> <span class="n">encode</span><span class="p">(</span><span class="n">C</span><span class="p">)</span>
<span class="c1">#---------------------------------------------------------------------------
</span>
<span class="k">def</span> <span class="nf">extract_salt_and_rounds</span><span class="p">(</span><span class="n">i</span><span class="p">):</span> <span class="c1"># i must be '$6$...'
</span> <span class="n">pieces</span> <span class="o">=</span> <span class="n">i</span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="s">'$'</span><span class="p">)</span>
<span class="k">if</span> <span class="n">pieces</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">!=</span> <span class="s">'6'</span><span class="p">:</span> <span class="k">raise</span> <span class="nb">TypeError</span><span class="p">(</span><span class="s">'shacrypt512 only supports $6$ hashes'</span><span class="p">)</span>
<span class="k">if</span> <span class="n">pieces</span><span class="p">[</span><span class="mi">2</span><span class="p">].</span><span class="n">startswith</span><span class="p">(</span><span class="s">'rounds='</span><span class="p">):</span>
<span class="n">rounds</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">pieces</span><span class="p">[</span><span class="mi">2</span><span class="p">][</span><span class="mi">7</span><span class="p">:])</span>
<span class="k">if</span> <span class="n">rounds</span> <span class="o"><</span> <span class="mi">1000</span><span class="p">:</span> <span class="n">rounds</span> <span class="o">=</span> <span class="mi">1000</span>
<span class="k">if</span> <span class="n">rounds</span> <span class="o">></span> <span class="mi">999999999</span><span class="p">:</span> <span class="n">rounds</span> <span class="o">=</span> <span class="mi">999999999</span>
<span class="k">return</span> <span class="p">(</span><span class="n">pieces</span><span class="p">[</span><span class="mi">3</span><span class="p">].</span><span class="n">encode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">),</span> <span class="n">rounds</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">return</span> <span class="p">(</span><span class="n">pieces</span><span class="p">[</span><span class="mi">2</span><span class="p">].</span><span class="n">encode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">),</span> <span class="mi">5000</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">password_ok</span><span class="p">(</span><span class="n">input_password</span><span class="p">,</span> <span class="n">existing_crypted_password</span><span class="p">):</span>
<span class="p">(</span><span class="n">salt</span><span class="p">,</span> <span class="n">rounds</span><span class="p">)</span> <span class="o">=</span> <span class="n">extract_salt_and_rounds</span><span class="p">(</span><span class="n">existing_crypted_password</span><span class="p">)</span>
<span class="k">return</span> <span class="n">existing_crypted_password</span> <span class="o">==</span> <span class="n">shacrypt</span><span class="p">(</span><span class="n">input_password</span><span class="p">,</span> <span class="n">salt</span><span class="p">,</span> <span class="n">rounds</span><span class="p">)</span>
<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">'__main__'</span><span class="p">:</span>
<span class="n">_test_password</span> <span class="o">=</span> <span class="s">'Hello world!'</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">)</span>
<span class="n">_test_salt</span> <span class="o">=</span> <span class="s">'saltstring'</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">)</span>
<span class="n">_test_rounds</span> <span class="o">=</span> <span class="mi">5000</span>
<span class="n">_test_crypted_password</span> <span class="o">=</span> <span class="s">'$6$saltstring$svn8UoSVapNtMuq1ukKS4tPQd8iKwSMHWjl/O817G3uBnIFNjnQJuesI68u4OTLiBFdcbYEdFCoEOfaS35inz1'</span>
<span class="k">assert</span> <span class="n">shacrypt</span><span class="p">(</span><span class="n">_test_password</span><span class="p">,</span> <span class="n">_test_salt</span><span class="p">,</span> <span class="n">_test_rounds</span><span class="p">)</span> <span class="o">==</span> <span class="n">_test_crypted_password</span>
<span class="k">assert</span> <span class="n">password_ok</span><span class="p">(</span><span class="n">_test_password</span><span class="p">,</span> <span class="n">_test_crypted_password</span><span class="p">)</span>
<span class="n">_test_password</span> <span class="o">=</span> <span class="s">'Hello world!'</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">)</span>
<span class="n">_test_salt</span> <span class="o">=</span> <span class="s">'saltstringsaltstring'</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">)</span>
<span class="n">_test_rounds</span> <span class="o">=</span> <span class="mi">10000</span>
<span class="n">_test_crypted_password</span> <span class="o">=</span> <span class="s">'$6$rounds=10000$saltstringsaltst$OW1/O6BYHV6BcXZu8QVeXbDWra3Oeqh0sbHbbMCVNSnCM/UrjmM0Dp8vOuZeHBy/YTBmSK6H9qs/y3RnOaw5v.'</span>
<span class="k">assert</span> <span class="n">shacrypt</span><span class="p">(</span><span class="n">_test_password</span><span class="p">,</span> <span class="n">_test_salt</span><span class="p">,</span> <span class="n">_test_rounds</span><span class="p">)</span> <span class="o">==</span> <span class="n">_test_crypted_password</span>
<span class="k">assert</span> <span class="n">password_ok</span><span class="p">(</span><span class="n">_test_password</span><span class="p">,</span> <span class="n">_test_crypted_password</span><span class="p">)</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="n">salt</span> <span class="o">=</span> <span class="bp">None</span> <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">)</span> <span class="o"><</span> <span class="mi">2</span> <span class="k">else</span> <span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="n">encode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">shacrypt</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">stdin</span><span class="p">.</span><span class="n">readline</span><span class="p">().</span><span class="n">strip</span><span class="p">().</span><span class="n">encode</span><span class="p">(</span><span class="s">'utf-8'</span><span class="p">),</span> <span class="n">salt</span><span class="p">))</span></code></pre></figure>
Unmangling paths set by direnv on Windows 11
2024-01-04T10:25:05+00:00
http://eighty-twenty.org/2024/01/04/direnv-path-unmangling-on-windows
tonyg
<p><code class="language-plaintext highlighter-rouge">direnv</code> works fine on Windows 11, but if an <code class="language-plaintext highlighter-rouge">.envrc</code> tries to set the <code class="language-plaintext highlighter-rouge">PATH</code>, the result will
be a path in <em>Windows</em> format, not <em>Unix</em> format.<sup id="fnref:related-direnv-issues" role="doc-noteref"><a href="#fn:related-direnv-issues" class="footnote" rel="footnote">1</a></sup></p>
<p>Instead of adding <code class="language-plaintext highlighter-rouge">eval $(direnv hook bash)</code> to your <code class="language-plaintext highlighter-rouge">.bashrc</code>, try the following snippet:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nb">export </span><span class="nv">_unmangle_direnv_names</span><span class="o">=</span><span class="s1">'PATH'</span>
_unmangle_direnv_paths<span class="o">()</span> <span class="o">{</span>
<span class="k">for </span>k <span class="k">in</span> <span class="nv">$_unmangle_direnv_names</span><span class="p">;</span> <span class="k">do
</span><span class="nb">eval</span> <span class="s2">"</span><span class="nv">$k</span><span class="s2">=</span><span class="se">\"\$</span><span class="s2">(/usr/bin/cygpath -p </span><span class="se">\"\$</span><span class="nv">$k</span><span class="se">\"</span><span class="s2">)</span><span class="se">\"</span><span class="s2">"</span>
<span class="k">done</span>
<span class="o">}</span>
<span class="nb">eval</span> <span class="s2">"</span><span class="si">$(</span>direnv hook bash | <span class="nb">sed</span> <span class="nt">-e</span> <span class="s1">'s@export bash)@export bash)\
_unmangle_direnv_paths@'</span><span class="si">)</span><span class="s2">"</span></code></pre></figure>
<p>This modifies the output of <code class="language-plaintext highlighter-rouge">direnv hook bash</code> slightly, adding code to fix path-like variables
after <code class="language-plaintext highlighter-rouge">direnv</code> sets the environment up.<sup id="fnref:export-json" role="doc-noteref"><a href="#fn:export-json" class="footnote" rel="footnote">2</a></sup></p>
<p>The variable names to unmangle are drawn from a new variable, <code class="language-plaintext highlighter-rouge">_unmangle_direnv_names</code>,
initially set to <code class="language-plaintext highlighter-rouge">PATH</code>, which should contain a space-separated list of variable names.</p>
<p>If, in a particular <code class="language-plaintext highlighter-rouge">.envrc</code>, you need path-unmangling for an additional variable, you can add
that variable’s name to <code class="language-plaintext highlighter-rouge">_unmangle_direnv_names</code>. For example,</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nv">_unmangle_direnv_names</span><span class="o">=</span><span class="s2">"</span><span class="nv">$_unmangle_direnv_names</span><span class="s2"> XPATH"</span>
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span><span class="s2">"</span><span class="nv">$PATH</span><span class="s2">:some_addition"</span>
<span class="nb">export </span><span class="nv">XPATH</span><span class="o">=</span><span class="s2">"</span><span class="nv">$PATH</span><span class="s2">:some_addition"</span></code></pre></figure>
<p>will unmangle both <code class="language-plaintext highlighter-rouge">PATH</code> and <code class="language-plaintext highlighter-rouge">XPATH</code>.</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:related-direnv-issues" role="doc-endnote">
<p>See <code class="language-plaintext highlighter-rouge">direnv</code> issues
<a href="https://github.com/direnv/direnv/issues/253">253</a> (“PATH gets mangled when using direnv
from git-bash on Windows”) and <a href="https://github.com/direnv/direnv/issues/796">796</a>
(“Incorrect path format is exported on Windows 10 with mintty / git bash, breaking the PATH
and command resolution”). <a href="#fnref:related-direnv-issues" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:export-json" role="doc-endnote">
<p>While experimenting, I discovered <code class="language-plaintext highlighter-rouge">direnv export json</code>! Very nice. It’s great
to see more and more tools using structured data for their inputs and outputs. <a href="#fnref:export-json" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
On the harm caused by missing basic (basic!) functionality in Signal, WhatsApp, Android and iOS
2023-12-17T00:33:38+00:00
http://eighty-twenty.org/2023/12/17/transferring-android-to-iphone
tonyg
<p>Trusting Signal and/or WhatsApp and/or Android (google) and/or iOS (apple) with your precious
photos, videos, and chats is a huge mistake.</p>
<p>All the photos, videos and chat history on my mother’s phone are now completely inaccessible to
us, despite having recent backups and all the necessary keys and passphrases.</p>
<h2 id="signal">Signal</h2>
<p>Can I transfer Signal backups directly from android to iphone? No.</p>
<p>Can I make a backup file and transfer that wirelessly without uploading it to the cloud? No.</p>
<p>Can I transfer it using a USB cable? Also no.</p>
<p>If I transfer it using the cloud, can I then restore from it on the iphone? … No.</p>
<p>It is not possible to transfer Signal chat history and media from an android phone to an iphone.</p>
<h2 id="whatsapp">WhatsApp</h2>
<p>What about Whatsapp?</p>
<p>Can I transfer Whatsapp backups directly from android to iphone? No.</p>
<p>Can I make a backup file and transfer that? No.</p>
<p>Can I back up to google drive and use that? No.</p>
<p>Even if I could get the backup file, would I be able to import it on the iphone? Also no.</p>
<p>It is not possible to transfer WhatsApp chat history and media from an android phone to an iphone.</p>
<h2 id="the-entire-ecosystem-is-sick">The entire ecosystem is sick</h2>
<p>Signal is to blame: they do not make it possible to import backup data on iphone. They do not
offer tools for working with backup data.</p>
<p>WhatsApp is to blame: they do not make it possible to retrieve or work with android backup
data. They do not make it possible to migrate data from android to ios without fully wiping and
resetting the iphone (and it didn’t work even when we did fully wipe it).</p>
<p>Apple is to blame: there is no way to transfer files from android without going through the
cloud. Even using a usb cable doesn’t work.</p>
<p>Google is to blame: there is no way to transfer files to iphone without going through the
cloud. There is no way to access a whatsapp backup blob in google drive containing my own data.</p>
<p>We all are to blame: we have accepted and continue to make excuses for an industry that acts in
such a user-hostile way.</p>
Simpler Preserves Binary Syntax
2023-10-15T13:21:47+00:00
http://eighty-twenty.org/2023/10/15/simpler-preserves-binary-syntax
tonyg
<p>I’ve just updated the <a href="https://preserves.dev/">Preserves</a> spec to version 0.990.0. I feel like
a 1.0-rc is approaching!</p>
<p>The main change since spec version 0.7.1 has been to simplify the binary syntax for Preserves
<a href="https://preserves.dev/preserves.html#semantics"><code class="language-plaintext highlighter-rouge">Value</code>s</a>:</p>
<ul>
<li>
<p>Both “short” and “medium” <code class="language-plaintext highlighter-rouge">SignedInteger</code> representations (starting with tags <code class="language-plaintext highlighter-rouge">0x9x</code>/<code class="language-plaintext highlighter-rouge">0xAx</code>)
were removed. They weren’t pulling their weight. Every <code class="language-plaintext highlighter-rouge">SignedInteger</code> now has tag <code class="language-plaintext highlighter-rouge">0xB0</code>.</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">Float</code> and <code class="language-plaintext highlighter-rouge">Double</code> are now encoded with tag <code class="language-plaintext highlighter-rouge">0x87</code> and a length, rather than with fixed
tags <code class="language-plaintext highlighter-rouge">0x82</code> and <code class="language-plaintext highlighter-rouge">0x83</code>, opening the door to other IEEE754 formats in future.</p>
</li>
</ul>
<hr />
<p>Here’s the 0.990.0 syntax in “<a href="https://preserves.dev/cheatsheet.html">reference card</a>” format,
where we write <code class="language-plaintext highlighter-rouge">«V»</code> for the binary encoding of some value <code class="language-plaintext highlighter-rouge">V</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> «#f» = [0x80]
«#t» = [0x81]
«@W V» = [0x85] ++ «W» ++ «V»
«#!V» = [0x86] ++ «V»
«V» if V ∈ Float = [0x87, 0x04] ++ binary32(V)
«V» if V ∈ Double = [0x87, 0x08] ++ binary64(V)
«V» if V ∈ SignedInteger = [0xB0] ++ varint(|intbytes(V)|) ++ intbytes(V)
«V» if V ∈ String = [0xB1] ++ varint(|utf8(V)|) ++ utf8(V)
«V» if V ∈ ByteString = [0xB2] ++ varint(|V|) ++ V
«V» if V ∈ Symbol = [0xB3] ++ varint(|utf8(V)|) ++ utf8(V)
«<L F_1...F_m>» = [0xB4] ++ «L» ++ «F_1» ++...++ «F_m» ++ [0x84]
«[X_1...X_m]» = [0xB5] ++ «X_1» ++...++ «X_m» ++ [0x84]
«#{E_1...E_m}» = [0xB6] ++ «E_1» ++...++ «E_m» ++ [0x84]
«{K_1:V_1...K_m:V_m}» = [0xB7] ++ «K_1» ++ «V_1» ++...++ «K_m» ++ «V_m» ++ [0x84]
varint(v) = [v] if v < 128
[(v & 0x7F) + 128] ++ varint(v >> 7) if v ≥ 128
</code></pre></div></div>
Joining Markdown tables
2023-09-06T21:20:18+00:00
http://eighty-twenty.org/2023/09/06/joining-markdown-tables
tonyg
<p>I’ve been working with Markdown tables a lot recently. The lovely
<a href="https://jblevins.org/projects/markdown-mode/">markdown-mode</a> for Emacs makes this easy and
pleasant. However, I find myself treating the tables a little like CSV, and started wanting to
run <em>joins</em> on tables often enough that I hacked together my first little Emacs package,
<a href="https://gist.github.com/tonyg/c951be93656b45027ed7d15e79f07c01"><code class="language-plaintext highlighter-rouge">markdown-join</code></a>.</p>
<p>Here’s a demo:</p>
<script async="" id="asciicast-606594" src="https://asciinema.org/a/606594.js"></script>
<p>The code is a single file,
<a href="https://gist.github.com/tonyg/c951be93656b45027ed7d15e79f07c01#file-markdown-join-el"><code class="language-plaintext highlighter-rouge">markdown-join.el</code></a>.
I’ve submitted it to MELPA.</p>
Virtualizing uxn
2023-08-11T16:07:24+00:00
http://eighty-twenty.org/2023/08/11/virtualizing-uxn
tonyg
<p><a href="https://100r.co/site/uxn.html">uxn</a> is a delightful virtual-machine <a href="https://wiki.xxiivv.com/site/uxntal.html">specification</a> (with lots of
<a href="https://github.com/hundredrabbits/awesome-uxn#emulators">implementations</a>) supporting 16-bit
operation.</p>
<p>At present, it is a “real mode” machine, and there is no support for running virtual uxn
machines under an uxn hypervisor.</p>
<p>It’d be neat to have a virtualizable uxn system: you could write a full operating system, with
memory protection and multi-tasking, entirely in uxntal, and existing ROMs wouldn’t even know
the difference! Or you could write a debugger that made good use of the native uxn core where
possible. You could serialize a program, suspending it to be resumed later, or transfer it
across the network to a different host, all without having to modify the base uxn emulator.</p>
<p>So: <strong>Can we make uxn virtualizable,</strong> so that an uxn hypervisor can manage one or more child
virtual machines, each running a program presented with a seamless illusion that it’s the only
software on the machine? Let’s find out!</p>
<h2 id="what-is-virtualizability">What is virtualizability?</h2>
<p>In 1974, Gerald J. Popek and Robert P. Goldberg wrote a <a href="https://en.wikipedia.org/wiki/Popek_and_Goldberg_virtualization_requirements">seminal paper</a> laying out a
simple and robust approach to determining whether a particular machine architecture is
virtualizable. Quoting from <a href="https://en.wikipedia.org/wiki/Popek_and_Goldberg_virtualization_requirements">the paper’s wikipage</a>,</p>
<blockquote>
<p>There are three properties of interest when analyzing the environment created by a VMM [a
Virtual Machine Monitor, a.k.a. a Hypervisor]:</p>
<p><strong>Equivalence / Fidelity.</strong> A program running under the VMM should exhibit a behavior essentially identical to that demonstrated when running on an equivalent machine directly.</p>
<p><strong>Resource control / Safety.</strong> The VMM must be in complete control of the virtualized resources.</p>
<p><strong>Efficiency / Performance.</strong> A statistically dominant fraction of machine instructions must be executed without VMM intervention.</p>
</blockquote>
<h2 id="is-real-mode-uxn-virtualizable">Is real-mode uxn virtualizable?</h2>
<p>Uxn as specified doesn’t have a supervisor-mode/user-mode split, so we will have to adapt Popek
and Goldberg’s development.</p>
<p>Sometimes the uxn machine takes a trap, but its behavior is defined to transition to a
<em>metamachine</em>, not to some supervisor mode. In case of errors, the metamachine will often
reactivate the machine at some error handler vector, but again, it’s in normal operation mode,
not some supervisor mode. This means that there are <em>no privileged instructions</em> in uxn.</p>
<p>There are no base-and-bounds registers, and the range of pointers exactly covers accessible RAM
(64k!), so there are no out-of-bounds RAM accesses possible. However, it is possible to
underflow or overflow the stack, which falls in the spirit of Popek and Goldberg’s
<code class="language-plaintext highlighter-rouge">memorytrap</code>. When this happens, the metamachine takes over. <code class="language-plaintext highlighter-rouge">BRK</code> and <code class="language-plaintext highlighter-rouge">DIV</code> are the only
instructions which trap or fault otherwise than <code class="language-plaintext highlighter-rouge">memorytrap</code>.</p>
<p>There is no way to change the amount of available memory, and no mode changes possible, so
there are no control sensitive instructions.</p>
<p>While there is no memory virtualization in uxn, consideration of behavior sensitive
instructions requires us to posit an address translation facility. But we can easily make sure
this doesn’t fall foul of Popek and Goldberg’s first (relocation) criterion of behavior
sensitivity; and as there are no modes, their second criterion fails also. So there are no
behavior sensitive instructions.</p>
<p>Thus, because all of the (nonexistent) control- and/or behavior-sensitive instructions fall in
the (empty) set of privileged instructions, we can rely on Popek and Goldberg’s theorem 1, and
say that <strong>construction of a VMM for uxn must be feasible.</strong> In fact, it’s almost there as it
stands.</p>
<p>Reaching out further, to recursive virtualizability, is in my opinion the simplest path to
virtualization. Execution of uxn is deterministic, given the devices in play today! Thus, by
Popek and Goldberg’s theorem 2, uxn must be able to be recursively virtualizable.</p>
<h2 id="virtualized-uxn">Virtualized uxn</h2>
<p>Let’s explore one possible approach to actually virtualizing uxn.</p>
<p>At the foundation of a virtualization tree lies a real, physical machine, with real RAM and
real I/O devices. That physical base machine has a physical RAM pointer. Every virtual machine
in the tree implicitly accesses the physical RAM.</p>
<p>Each virtual machine also has private registers, and private memory. Recall we’re considering
<em>recursive</em> virtualization, where the program a hypervisor supervises may in turn be a
hypervisor.</p>
<ul>
<li>Registers
<ul>
<li>32-bit base, relative to its parent’s memory allocation</li>
<li>32-bit bound (length of accessible memory)</li>
<li>16-bit PC</li>
<li>a trapCode register</li>
</ul>
</li>
<li>Memory, all of which MUST be disjoint from the allowed base-and-bound region of RAM
<ul>
<li>256 bytes of device memory</li>
<li>two 255-byte stacks, plus two 1-byte stack pointers</li>
<li>two 32-byte device mask bitsets and one 16-entry × 16-bits device version vector</li>
<li>a 16-byte trap description area</li>
</ul>
</li>
</ul>
<p>The base and bound are new. Machines do not have access to them; they are only able to access
the registers of “child” VMs. An exception is made to allow a VM to read its own bound
register. It does this via the system device “expansion” port, using an operation <code class="language-plaintext highlighter-rouge">getBound</code>,
described below.</p>
<p>The outermost VM has a base of 0 and a bound of however large the physical RAM is.</p>
<p>All memory-accessing operations (including loading an instruction) take the base and bound into
account. Accesses falling outside physical RAM or outside the base-and-bound cause a memory
fault.</p>
<p>The device mask bitsets still control device effects, but instead of calling a <code class="language-plaintext highlighter-rouge">DEI</code>/<code class="language-plaintext highlighter-rouge">DEO</code>
handler routine as a subroutine of the VM’s evaluator, the evaluator traps, yielding control to
the parent VM, which performs the requested action.</p>
<p>Every time a VM stops execution, its trapCode and trap description area are updated to describe
the cause of the stop.</p>
<h2 id="system-device-expansion-operations">System device expansion operations</h2>
<p>Adding the following operations is sufficient to allow writing a hypervisor in <a href="https://wiki.xxiivv.com/site/uxntal.html">uxntal</a>:</p>
<table>
<thead>
<tr>
<th>name</th>
<th>operation</th>
<th>fields</th>
</tr>
</thead>
<tbody>
<tr>
<td>getBound</td>
<td>#10</td>
<td>&boundHi $2 &boundLo $2</td>
</tr>
<tr>
<td>vmExec</td>
<td>#11</td>
<td>&vmAddr $2</td>
</tr>
</tbody>
</table>
<h3 id="getbound">getBound</h3>
<p>The handler retrieves the 32-bit active bound on memory access, overwriting the four bytes
following the operation code.</p>
<h3 id="vmexec">vmExec</h3>
<p>The handler expects a VM Control Block (VMCB) to exist at <code class="language-plaintext highlighter-rouge">vmAddr</code>:</p>
<table>
<thead>
<tr>
<th>field name</th>
<th>offset</th>
<th>size (bytes)</th>
<th>description</th>
</tr>
</thead>
<tbody>
<tr>
<td>parentLink</td>
<td>0</td>
<td>4</td>
<td>scratch space for control transfer</td>
</tr>
<tr>
<td>baseHi</td>
<td>4</td>
<td>2</td>
<td>high short of base register</td>
</tr>
<tr>
<td>baseLo</td>
<td>6</td>
<td>2</td>
<td>low short of base register</td>
</tr>
<tr>
<td>boundHi</td>
<td>8</td>
<td>2</td>
<td>high short of bound register</td>
</tr>
<tr>
<td>boundLo</td>
<td>10</td>
<td>2</td>
<td>low short of bound register</td>
</tr>
<tr>
<td>pc</td>
<td>12</td>
<td>2</td>
<td>program counter</td>
</tr>
<tr>
<td>trapCode</td>
<td>14</td>
<td>2</td>
<td>trap code</td>
</tr>
<tr>
<td>trapDescription</td>
<td>16</td>
<td>16</td>
<td>trap detail record</td>
</tr>
<tr>
<td>deiMask</td>
<td>32</td>
<td>32</td>
<td>device input mask bitset</td>
</tr>
<tr>
<td>deoMask</td>
<td>64</td>
<td>32</td>
<td>device output mask bitset</td>
</tr>
<tr>
<td>devVers</td>
<td>96</td>
<td>32</td>
<td>device version vector</td>
</tr>
<tr>
<td>(reserved)</td>
<td>128</td>
<td>128</td>
<td>–</td>
</tr>
<tr>
<td>workStack</td>
<td>256</td>
<td>255</td>
<td>work stack base</td>
</tr>
<tr>
<td>workStackPtr</td>
<td>511</td>
<td>1</td>
<td>work stack pointer</td>
</tr>
<tr>
<td>returnStack</td>
<td>512</td>
<td>255</td>
<td>return stack base</td>
</tr>
<tr>
<td>returnStackPtr</td>
<td>767</td>
<td>1</td>
<td>return stack pointer</td>
</tr>
<tr>
<td>devMemory</td>
<td>768</td>
<td>256</td>
<td>device memory</td>
</tr>
<tr>
<td> </td>
<td>1024</td>
<td> </td>
<td> </td>
</tr>
</tbody>
</table>
<p>The operation performs a context switch to the described VM. In the following, <code class="language-plaintext highlighter-rouge">next</code> denotes
the VMCB pointed to by <code class="language-plaintext highlighter-rouge">vmAddr</code>, and <code class="language-plaintext highlighter-rouge">curr</code> the running VM’s VMCB.</p>
<p>The absolute base <code class="language-plaintext highlighter-rouge">absBase</code> is computed from the running VM’s base register and the base
register in <code class="language-plaintext highlighter-rouge">next</code>. If <code class="language-plaintext highlighter-rouge">absBase</code> or <code class="language-plaintext highlighter-rouge">absBase</code> plus the <code class="language-plaintext highlighter-rouge">next</code>’s bound lies outside the <code class="language-plaintext highlighter-rouge">curr</code>’s
base-and-bound, <code class="language-plaintext highlighter-rouge">curr</code> takes a memory fault. If any part of the <code class="language-plaintext highlighter-rouge">next</code> VMCB overlaps with the
region addressed by <code class="language-plaintext highlighter-rouge">absBase</code> and <code class="language-plaintext highlighter-rouge">next</code>’s bound, <code class="language-plaintext highlighter-rouge">curr</code> takes a memory fault.<sup id="fnref:vmcb-exclusion" role="doc-noteref"><a href="#fn:vmcb-exclusion" class="footnote" rel="footnote">1</a></sup></p>
<p>If these checks all pass, <code class="language-plaintext highlighter-rouge">curr</code>’s VMCB address is placed in <code class="language-plaintext highlighter-rouge">parentLink</code> (it will not be
visible to any VM; see below) and execution resumes in <code class="language-plaintext highlighter-rouge">next</code>.</p>
<h2 id="traps-and-faults">Traps and faults</h2>
<p>When a running VM takes a trap or a fault, it transfers control to its parent VM. The parent
VM’s address is retrieved from the <code class="language-plaintext highlighter-rouge">parentLink</code> field of the running VMCB (<code class="language-plaintext highlighter-rouge">curr</code>). If it is 0,
there is no parent, and this is the outermost VM; the pre-virtualization behavior applies.
Otherwise, <code class="language-plaintext highlighter-rouge">curr</code>’s <code class="language-plaintext highlighter-rouge">trapCode</code> and <code class="language-plaintext highlighter-rouge">trapDescription</code> are set appropriately, its <code class="language-plaintext highlighter-rouge">parentLink</code>
field is cleared (ensuring that the VM cannot learn anything about its absolute location), and
the parent VM is resumed.</p>
<h2 id="modified-instructions">Modified instructions</h2>
<p>The effect of <code class="language-plaintext highlighter-rouge">BRK</code> is altered: it takes a trap with <code class="language-plaintext highlighter-rouge">trapCode</code> of <code class="language-plaintext highlighter-rouge">BRK</code> and no description.</p>
<p>The effects of <code class="language-plaintext highlighter-rouge">DEI</code> and <code class="language-plaintext highlighter-rouge">DEO</code> are also altered. Instead of invoking a <code class="language-plaintext highlighter-rouge">dei</code>/<code class="language-plaintext highlighter-rouge">deo</code> function
inside the evaluator, the machine takes a trap with <code class="language-plaintext highlighter-rouge">trapCode</code> set to <code class="language-plaintext highlighter-rouge">IO</code>. The suspended VM’s
PC is left pointing <em>after</em> the <code class="language-plaintext highlighter-rouge">DEI</code>/<code class="language-plaintext highlighter-rouge">DEO</code> instruction; the arguments to <code class="language-plaintext highlighter-rouge">DEI</code>/<code class="language-plaintext highlighter-rouge">DEO</code> are
included in the <code class="language-plaintext highlighter-rouge">trapDescription</code>.</p>
<h2 id="memory-and-stack-faults">Memory and stack faults</h2>
<p>When an out-of-bounds access occurs, the stack underflows, or the stack overflows, the machine
takes a fault with an appropriate <code class="language-plaintext highlighter-rouge">trapCode</code>, and with <code class="language-plaintext highlighter-rouge">trapDescription</code> including details of
the fault. The machine’s PC and stack are left alone so that if the machine is immediately
restarted, it will precisely repeat the fault. This could be used to e.g. provide a (small)
virtual memory, or an (unbounded) virtual stack.</p>
<h2 id="implementation">Implementation</h2>
<p>I have implemented emulator support for virtualization, as sketched above,
<a href="https://gitlab.com/tonyg/uxn/-/tree/virtuxn3">here</a>. I’ve also written a simple <a href="https://gitlab.com/tonyg/uxn/-/blob/virtuxn3/xun.tal">hypervisor
called <code class="language-plaintext highlighter-rouge">xun</code></a> (like Xen, see?). To see
the code, check out the <code class="language-plaintext highlighter-rouge">virtuxn3</code> branch of the <code class="language-plaintext highlighter-rouge">https://gitlab.com/tonyg/uxn</code> git repository.</p>
<p>The implementation is done in two stages:</p>
<ul>
<li>
<p>the first stage (<a href="https://gitlab.com/tonyg/uxn/-/compare/virtuxn3-base...virtuxn3-novirt">diffs from <code class="language-plaintext highlighter-rouge">virtuxn3-base</code> to
<code class="language-plaintext highlighter-rouge">virtuxn3-novirt</code></a>)
splits out “MMU” and IO “processors” into a “system-on-chip”, <code class="language-plaintext highlighter-rouge">uxnsoc.[ch]</code>, updating the
various core implementations to use the slightly-different interface to memory and device
IO.</p>
</li>
<li>
<p>the second stage (<a href="https://gitlab.com/tonyg/uxn/-/compare/virtuxn3-novirt...virtuxn3">diffs from <code class="language-plaintext highlighter-rouge">virtuxn3-novirt</code> to
<code class="language-plaintext highlighter-rouge">virtuxn3</code></a>) is simpler,
providing an alternative compatible “system-on-chip” implementation supporting the new
system extension commands and providing the necessary machinery to swap in and out of nested
VMs as required.</p>
</li>
</ul>
<p>The <a href="https://gitlab.com/tonyg/uxn/-/blob/virtuxn3/xun.tal">hypervisor, <code class="language-plaintext highlighter-rouge">xun</code></a>, is a little
program that is currently hardcoded to load a particular ROM, <code class="language-plaintext highlighter-rouge">bin/xuninner.rom</code>. It cannot yet
recursively run itself running another program, and lacks configurability. It also doesn’t do
anything about device multiplexing, like a real multi-tasking operating system would have to
do. But the basics are there, enough to show it working: it can, for example, run <code class="language-plaintext highlighter-rouge">piano.rom</code>
without problems.</p>
<h2 id="differences-between-direct-and-virtualized-execution">Differences between direct and virtualized execution</h2>
<p>Because the devices take <em>virtual</em> addresses, rather than <em>physical</em> addresses, a supervisor
has to copy down media such as sprites and audio samples before asking the hardware to use
them. This leads to noticeable differences in behavior: running <code class="language-plaintext highlighter-rouge">piano.rom</code>, sounding a note,
and then scribbling on the waveform causes the playing note to change its timbre. When running
the same program under <code class="language-plaintext highlighter-rouge">xun</code>, however, the note stays the same; the edit to the timbre doesn’t
take effect until the next note is played.</p>
<p>One change that could ameliorate this would be to make the various addresses used in device
memory into 32-bit physical addresses (16 bit page / 16-bit offset). That way, translated
addresses from nested VMs could be used without having to copy them. (A second benefit would be
the ability for ROMs to use media outside the bottom 64k without having to copy them first.)</p>
<h2 id="next-steps">Next steps</h2>
<p>It’d be nice to adapt the devices to support 32-bit addresses for sprite and audio sample data
etc. It can be done backward-compatibly; I need to write down the approach I have in mind.
There’s also a bit of implementation work remaining in <code class="language-plaintext highlighter-rouge">xun</code> to allow full recursive
virtualization; it’s, in some sense, a Trivial Matter of Programming, but the details are
actually a little more fiddly than they have to be and perhaps some changes to the device
interfaces could make things easier for hypervisors without complicating life for simple
real-mode ROMs. There’s nascent support in baseline <code class="language-plaintext highlighter-rouge">uxn</code> for self-describing devices: I’d like
to see this fleshed out a bit so that nested VMs could reliably discover the features made
available to them by their hypervisor. Ultimately, it’d be amazing to build something a little
System-7 like using virtualized uxn for running programs. The nested VM interface could even be
widened to include a “fuel counter” of instructions before a “fuel exhausted” trap, allowing
preemptive multitasking.</p>
<h2 id="the-end">The end</h2>
<p>All in all, uxn virtualizes very nicely. Writing a hypervisor entirely in uxntal works well
with minimal extensions to the VM specification. I’m looking forward to seeing how things
develop!</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:vmcb-exclusion" role="doc-endnote">
<p>A corollary of this is that a nested VM’s addressable memory must always be
smaller (at least by the size of one VMCB) than its parent’s addressable memory. <a href="#fnref:vmcb-exclusion" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
File distribution over DNS: (ab)using DNS as a CDN
2023-07-31T16:32:06+00:00
http://eighty-twenty.org/2023/07/31/dns-as-a-cdn
tonyg
<p>This is the story of a one-afternoon hack that turned into a one-weekend hack.</p>
<p>I woke up on Saturday with a silly idea: what would it be like to use <a href="https://en.wikipedia.org/wiki/Chunking_(computing)#In_data_deduplication,_data_synchronization_and_remote_data_compression">content-defined chunking
(CDC)</a>
and serve the chunks over DNS? DNS caching could make scaleout and incremental updates quite
efficient, at least in theory. Strong hashing gives robust download integrity. DNS replication
gives you a kind of high availability, even!</p>
<p>After a coffee, I figured I may as well try it out.</p>
<p><strong>TL;DR.</strong> It works, more or less, so long as your resolver properly upgrades to DNS-over-TCP
when it gets a truncated UDP response. The immutable, strongly-named chunks are served in TXT
records (!) and are cached by resolvers for a long time. This lowers load on the authoritative
DNS server. Each stored “file” gets a user-friendly name for its root chunk in the form of a
CNAME with a much shorter TTL.</p>
<p>You can try it out with:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">docker run <span class="nt">-i</span> <span class="nt">--rm</span> leastfixedpoint/nscdn <span class="se">\</span>
nscdn get SEKIENAKASHITA.DEMO.NSCDN.ORG. <span class="o">></span> SekienAkashita.jpg</code></pre></figure>
<p>which downloads <a href="https://en.wikipedia.org/wiki/Akashita#/media/File:SekienAkashita.jpg">this
image</a> using nothing but
DNS.</p>
<p>A demo server is running on the domain <code class="language-plaintext highlighter-rouge">demo.nscdn.org</code> and the code is available at
<a href="https://gitlab.com/tonyg/nscdn">https://gitlab.com/tonyg/nscdn</a>.</p>
<h2 id="how-it-works">How it works</h2>
<p>The <a href="https://github.com/ronomon/deduplication/blob/master/README.md">Ronomon variant</a> of
<a href="https://www.usenix.org/system/files/conference/atc16/atc16-paper-xia.pdf">FastCDC</a>, due to
<a href="https://github.com/jorangreef">Joran Greef</a>, is a JavaScript-friendly, 32-bit content-defined
chunking algorithm that splits big files into chunks with a distribution of sizes having an
average, minimum and maximum length.<sup id="fnref:distribution-is-weird" role="doc-noteref"><a href="#fn:distribution-is-weird" class="footnote" rel="footnote">1</a></sup></p>
<p>For this project I chose an 8k minimum size, a 16k average, and a 48k upper limit, because of
the limitations involved in serving large amounts of data via DNS.</p>
<p>The core idea is to use the CDC algorithm to slice up a data file, and then construct a broad,
shallow <a href="https://en.wikipedia.org/wiki/Merkle_tree">Merkle tree</a> from it, serving each leaf and
inner node of the tree as a separate DNS TXT record associated with a domain label including
the Base32 encoding of the <a href="https://www.blake2.net/">BLAKE2b</a> hash of the data.</p>
<p>The example file from above, <code class="language-plaintext highlighter-rouge">SekienAkashita.jpg</code>, is split up into five chunks and one inner
node that lists the chunks:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SEKIENAKASHITA.DEMO.NSCDN.ORG (CNAME)
└─⯈ 2-NPNQZ4OQGZCXTLGFT6SGXO5TFLVZWMCFOXAV6GEYNDPYL67QSMPQ.DEMO.NSCDN.ORG (TXT, 320 bytes)
├─⯈ 1-3NZRAZ4RTLWEFFSE27EBEVNKFOM6SFI4ISQEH7NGNPCG2CS4SIUA.··· (TXT, 22366 bytes)
├─⯈ 1-ZYNVPQYLBJYT7EOWRUDNY6XSURS3KWOMTQUODKUCCWXXQLKZFF3Q.··· (TXT, 8282 bytes)
├─⯈ 1-VPJYS4PUGX275YDFEKZF6BC3SXYZDMNTANV5LXMDWMB2PDSKWJPQ.··· (TXT, 16303 bytes)
├─⯈ 1-IBGDKGR2IXRIISKASJIP5CLPLHCUSGE5V6SVRWKHFHJSIAZVXOHQ.··· (TXT, 18696 bytes)
└─⯈ 1-QT7EHDMMEVKJF77MOL4T4PXU3FGSCBXRVNFMYJK4NOQ4BJ6I7YCA.··· (TXT, 43819 bytes)
</code></pre></div></div>
<p>For a larger file—say, the Linux kernel—the index node itself
would be longer than permitted for a chunk, so it would be split up itself and another level
would be added to the tree to index the chunks of the lower-level node. This recursion can be
repeated as required. I chose a 64-bit file size limit.</p>
<h2 id="storing-a-tree-in-dns">Storing a tree in DNS</h2>
<p>A CNAME record from the human-usable name of each file (<code class="language-plaintext highlighter-rouge">SEKIENAKASHITA.DEMO.NSCDN.ORG</code>) points
to the DNS record of the root node of the file’s Merkle tree.</p>
<p>Each tree node is stored as raw binary (!) in a DNS TXT record. Each record’s DNS label is
formed from the type code of the node and the base32 encoding of the BLAKE2b hash of the binary
content.</p>
<p>The content of leaf nodes (type 1) is just the binary data associated with the chunk. Each
inner node (type 2) is a sequence of 64-byte “pointer” records containing a binary form of
child nodes’ DNS labels, along with a length for each child. This allows random-access
retrieval of subranges of each file.</p>
<!--
Here's the 320-byte content of the node labeled
`2-NPNQZ4OQGZCXTLGFT6SGXO5TFLVZWMCFOXAV6GEYNDPYL67QSMPQ` above, the root index node of our
`SekienAkashita.jpg` example:
```
0000000000000001000000000000575E00000000000000000000000000000000
DB731067919AEC429644D7C81255AA2B99E9151C44A043FDA66BC46D0A5C9228
0000000000000001000000000000205A00000000000000000000000000000000
CE1B57C30B0A713F91D68D06DC7AF2A465B559CC9C28E1AA8215AF782D592977
00000000000000010000000000003FAF00000000000000000000000000000000
ABD38971F435F5FEE06522B25F045B95F191B1B3036BD5DD83B303A78E4AB25F
0000000000000001000000000000490800000000000000000000000000000000
404C351A3A45E28449409250FE896F59C549189DAFA558D94729D3240335BB8F
0000000000000001000000000000AB2B00000000000000000000000000000000
84FE438D8C255492FFEC72F93E3EF4D94D2106F1AB4ACC255C6BA1C0A7C8FE04
```
-->
<p>Weirdly, this is all completely standards-compliant use of DNS. The only place I’m pushing the
limits is the large-ish TXT records, effectively mandating use of DNS’s fallback TCP mode.</p>
<h2 id="serving-the-data-stick-it-in-sqlite-tiny-server-job-done">Serving the data: stick it in SQLite, tiny server, job done</h2>
<p>Any ordinary DNS server can be used to serve the records from a domain under one’s control.</p>
<p>However, I decided to hack together my own little one that could serve records straight out of
a SQLite database. I used it as an excuse to experiment with Golang again for the first time in
more than a decade.<sup id="fnref:verdict-on-go" role="doc-noteref"><a href="#fn:verdict-on-go" class="footnote" rel="footnote">2</a></sup></p>
<p>I cobbled together a program called
<a href="https://gitlab.com/tonyg/nscdn/-/blob/main/cmd/nscdnd/nscdnd.go"><code class="language-plaintext highlighter-rouge">nscdnd</code></a> which uses</p>
<ul>
<li><a href="https://github.com/miekg/dns">github.com/miekg/dns</a> as a DNS codec and server,</li>
<li><a href="https://github.com/mattn/go-sqlite3">github.com/mattn/go-sqlite3</a> to access the SQLite database, and</li>
<li><a href="https://github.com/sirupsen/logrus">github.com/sirupsen/logrus</a> for logging.</li>
</ul>
<p>Similarly, a little tool called
<a href="https://gitlab.com/tonyg/nscdn/-/blob/main/cmd/nscdn/nscdn.go"><code class="language-plaintext highlighter-rouge">nscdn</code></a> allows insertion
(<code class="language-plaintext highlighter-rouge">nscdn add</code>) and deletion (<code class="language-plaintext highlighter-rouge">nscdn del</code>) of files from a database, plus retrieval and
reassembly of a file from a given domain name (<code class="language-plaintext highlighter-rouge">nscdn get</code>).</p>
<p>The bulk of the interesting code is in</p>
<ul>
<li><a href="https://gitlab.com/tonyg/nscdn/-/blob/main/pkg/hashtree/hashtree.go">leastfixedpoint.com/nscdn/pkg/hashtree</a>
(Merkle tree),</li>
<li><a href="https://gitlab.com/tonyg/nscdn/-/blob/main/pkg/chunking/ronomon/ronomon.go">leastfixedpoint.com/nscdn/pkg/chunking/ronomon</a>,
(CDC chunking), and</li>
<li><a href="https://gitlab.com/tonyg/nscdn/-/blob/main/pkg/nscdn/nscdn.go">leastfixedpoint.com/nscdn/pkg/nscdn</a>,
(retrieval, integrity-verification, and reassembly of records from DNS).</li>
</ul>
<h2 id="okay-but-is-this-a-good-idea">Okay, but is this a good idea?</h2>
<p>¯\_(ツ)_/¯ It works surprisingly well in the limited testing I’ve done. I doubt it’s the
most efficient way to transfer files, but it’s not wildly unreasonable. The idea of getting the
chunks cached by caching resolvers between clients and the authoritative server seems to work
well: when I re-download something, it only hits the authoritative server for the short-lived
CNAME and the root-node TXT record. The other nodes in the tree seem to be cached somewhat
locally to my client.</p>
<h2 id="try-it-out-yourself">Try it out yourself!</h2>
<p>You can retrieve files from the demo server on <code class="language-plaintext highlighter-rouge">demo.nscdn.org</code>, as previously mentioned:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">docker run <span class="nt">-i</span> <span class="nt">--rm</span> leastfixedpoint/nscdn <span class="se">\</span>
nscdn get SEKIENAKASHITA.DEMO.NSCDN.ORG. <span class="o">></span> SekienAkashita.jpg</code></pre></figure>
<p>You can also run an <code class="language-plaintext highlighter-rouge">nscdnd</code> instance for a domain you control. All the following examples use
docker, but you can just check out <a href="https://gitlab.com/tonyg/nscdn">the repository</a> and build
it yourself too.</p>
<p>To run a server:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">docker run <span class="nt">-d</span> <span class="nt">-p</span> 53:53 <span class="nt">-p</span> 53:53/udp <span class="se">\</span>
<span class="nt">-v</span> <span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>/store.sqlite3:/data/store.sqlite3 <span class="se">\</span>
<span class="nt">--env</span> <span class="nv">NSCDN_ROOT</span><span class="o">=</span>your.domain.example.com <span class="se">\</span>
<span class="nt">--name</span> nscdnd <span class="se">\</span>
leastfixedpoint/nscdn</code></pre></figure>
<p>and add files to the store:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">docker run <span class="nt">-i</span> <span class="nt">--rm</span> <span class="se">\</span>
<span class="nt">-v</span> <span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>/store.sqlite3:/data/store.sqlite3 <span class="se">\</span>
leastfixedpoint/nscdn <span class="se">\</span>
nscdn add /data/store.sqlite3 SOMEFILENAME < SomeFilename.bin</code></pre></figure>
<p>Then add an <code class="language-plaintext highlighter-rouge">NS</code> record pointing to it:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>your.domain.example.com. 86400 IN NS your.nscdnd.server.example.com.
</code></pre></div></div>
<p>and retrieve your files:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">docker run <span class="nt">-i</span> <span class="nt">--rm</span> leastfixedpoint/nscdn <span class="se">\</span>
nscdn get SOMEFILENAME.your.domain.example.com</code></pre></figure>
<p>You can also <code class="language-plaintext highlighter-rouge">dig +short -t txt your.domain.example.com</code> to get information about the running
server.</p>
<p>For the <code class="language-plaintext highlighter-rouge">demo.nscdn.org</code> server, it looks like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ dig +short -t txt demo.nscdn.org
"server=nscdnd" "version=v0.3.1" "SPDX-License-Identifier=AGPL-3.0-or-later" "source=https://gitlab.com/tonyg/nscdn"
</code></pre></div></div>
<p>Finally, you can use <code class="language-plaintext highlighter-rouge">dig</code> to retrieve CNAME and TXT records without using the <code class="language-plaintext highlighter-rouge">nscdn</code> tool:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ dig +short -t any SEKIENAKASHITA.DEMO.NSCDN.ORG
2-NPNQZ4OQGZCXTLGFT6SGXO5TFLVZWMCFOXAV6GEYNDPYL67QSMPQ.DEMO.NSCDN.ORG.
"\000\000\000\000\000\000\000\001\000\000\000\000\000\000W^\000\000\000...
$ dig +short -t txt 1-ZYNVPQYLBJYT7EOWRUDNY6XSURS3KWOMTQUODKUCCWXXQLKZFF3Q.DEMO.NSCDN.ORG.
"\143q\254\239\254\247\253\211\131\199\152t\205b\238\207J\190\212\159INP...
</code></pre></div></div>
<p>(Dig presents the TXT records using the slightly peculiar decimal escaping syntax from <a href="https://datatracker.ietf.org/doc/html/rfc1035#section-5.1">RFC
1035</a>.)</p>
<h2 id="future-directions">Future directions</h2>
<p><strong>Compression of individual chunks.</strong> At the moment, chunks are served uncompressed. It’d be a
friendly thing to do to compress the contents of each TXT record.</p>
<p><strong>Experimenting with chunk sizes.</strong> Is the distribution of chunk sizes with the current
parameters reasonable? Are smaller chunks required, operationally, given we’re kind of pushing
DNS to its limits here? Could larger chunk sizes work?</p>
<p><strong>How does it perform?</strong> How could performance be improved?</p>
<p><strong>Garbage-collection of chunks in a store.</strong> At present, running <code class="language-plaintext highlighter-rouge">nscdn del</code> just removes the
CNAME link to a root chunk. It doesn’t traverse the graph to remove unreferenced chunks from
the store.</p>
<p><strong>Incremental downloads, partial downloads, recovery/repair of files.</strong></p>
<p><strong>Statistics on sharing of chunks in a store.</strong> Say you used a store to distribute multiple
releases of a piece of software. It’d be interesting to know how much sharing of chunks exists
among the different release files.</p>
<p><strong>Download and assemble files in the browser?</strong> So DNS-over-HTTPS is a thing. What happens if
we use, for example, the <a href="https://developers.cloudflare.com/1.1.1.1/encryption/dns-over-https/make-api-requests/dns-json/">Cloudflare DoH
API</a>
to retrieve our chunks?</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="nx">JSON</span><span class="p">.</span><span class="nx">stringify</span><span class="p">(</span><span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span>
<span class="dl">"</span><span class="s2">https://cloudflare-dns.com/dns-query?name=demo.nscdn.org&type=TXT</span><span class="dl">"</span><span class="p">,</span>
<span class="p">{</span> <span class="na">headers</span><span class="p">:</span> <span class="p">{</span> <span class="dl">"</span><span class="s2">accept</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">application/dns-json</span><span class="dl">"</span> <span class="p">}</span> <span class="p">})).</span><span class="nx">json</span><span class="p">(),</span> <span class="kc">null</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
<span class="err">⟶</span> <span class="p">{</span>
<span class="dl">"</span><span class="s2">Status</span><span class="dl">"</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">TC</span><span class="dl">"</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span> <span class="dl">"</span><span class="s2">RD</span><span class="dl">"</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span> <span class="dl">"</span><span class="s2">RA</span><span class="dl">"</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span> <span class="dl">"</span><span class="s2">AD</span><span class="dl">"</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span> <span class="dl">"</span><span class="s2">CD</span><span class="dl">"</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">Question</span><span class="dl">"</span><span class="p">:</span> <span class="p">[{</span>
<span class="dl">"</span><span class="s2">name</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">demo.nscdn.org</span><span class="dl">"</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">type</span><span class="dl">"</span><span class="p">:</span> <span class="mi">16</span>
<span class="p">}],</span>
<span class="dl">"</span><span class="s2">Answer</span><span class="dl">"</span><span class="p">:</span> <span class="p">[{</span>
<span class="dl">"</span><span class="s2">name</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">demo.nscdn.org</span><span class="dl">"</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">type</span><span class="dl">"</span><span class="p">:</span> <span class="mi">16</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">TTL</span><span class="dl">"</span><span class="p">:</span> <span class="mi">284</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">data</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="se">\"</span><span class="s2">server=nscdnd</span><span class="se">\"\"</span><span class="s2">version=v0.3.1</span><span class="se">\"\"</span><span class="s2">SPDX-License-Identifier=AGPL-3.0-or-later</span><span class="se">\"\"</span><span class="s2">source=https://gitlab.com/tonyg/nscdn</span><span class="se">\"</span><span class="dl">"</span>
<span class="p">}]</span>
<span class="p">}</span></code></pre></figure>
<p>Hmm. Promising. Unfortunately, it doesn’t seem to like the TXT records having binary data in
them (bug? certainly TXT records are allowed to hold binary data, see
<a href="https://datatracker.ietf.org/doc/html/rfc1035#section-3.3.14">here</a> and then
<a href="https://datatracker.ietf.org/doc/html/rfc1035#section-3.3">here</a>), so it might not Just Work.
Perhaps a little redesign to use Base64 in the TXT record bodies is required; or perhaps
the <code class="language-plaintext highlighter-rouge">application/dns-message</code> binary-format API would work.</p>
<p><strong>Encryption of stored files.</strong> Perhaps something simple like
<a href="https://github.com/FiloSottile/age">age</a>.</p>
<p><strong>Insertion of data across the network.</strong> While <a href="https://en.wikipedia.org/wiki/Dynamic_DNS">Dynamic
DNS</a> is a thing, it’s not quite suitable for this
purpose. Perhaps some way of inserting or deleting files other than the current
ssh-to-the-server-and-run-a-command would be nice.</p>
<p><strong>Content-defined chunking of index nodes.</strong> For this quick prototype, I split non-leaf nodes
in a tree into 512-pointer chunks, but it’d be nicer to split them using some variation on the
CDC algorithm already used to split the raw binary data in the file. That way, incremental
updates would be more resilient to insertions and deletions.</p>
<p><strong>Think more about Named Data Networking.</strong> Remember <a href="https://named-data.net/">Named Data Networking
(NDN)</a>? It’s an alternative Internet architecture, initially kicked
off by Van Jacobson and colleagues. (It used to be known as <a href="https://en.wikipedia.org/wiki/Content_centric_networking">CCN, Content-Centric
Networking</a>.) This DNS hack is a
cheap-and-nasty system that has quite a bit in common with the design ideas of NDN, though
obviously it’s desperately unsophisticated by comparison.</p>
<h2 id="the-end">The End</h2>
<p>Anyway, I had a lot of fun hacking this together and relearning Go, though I was a little
embarrassed when I found myself spending a lot of time at the beginning browsing for a domain
to buy to show it off…</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:distribution-is-weird" role="doc-endnote">
<p>The distribution of chunk sizes is pretty strange though. I don’t
know in what sense the “average” actually <em>is</em> an average; the minimum and maximum cut-offs
are enforced by the algorithm though. For more detail, see <a href="https://www.usenix.org/system/files/conference/atc16/atc16-paper-xia.pdf">the FastCDC
paper</a>. <a href="#fnref:distribution-is-weird" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:verdict-on-go" role="doc-endnote">
<p>So I have, uh, <em>criticisms</em> of go. But it’s nothing that hasn’t been said
before. Let’s just say that the experience reminded me of programming JavaScript in the bad
old days before TC39 really got going. Overall the experience was “ok, I guess”. <a href="#fnref:verdict-on-go" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Preserves Python Documentation
2023-03-17T14:31:45+00:00
http://eighty-twenty.org/2023/03/17/preserves-python-documentation
tonyg
<p>I’ve just spent a few hours writing documentation for the <a href="https://pypi.org/project/preserves/">Python <code class="language-plaintext highlighter-rouge">preserves</code>
package</a>, which implements
<a href="https://preserves.dev/">Preserves</a> for Python 3.x.</p>
<p>You can find the latest version of the docs at <a href="https://preserves.dev/python/latest/">https://preserves.dev/python/latest/</a>, or click
the screenshot below:</p>
<p><a href="https://preserves.dev/python/latest/"><img src="/images/Screenshot 2023-03-17 at 15-29-11 Python Preserves.png" alt="Screenshot of the new Python Preserves documentation" /></a></p>
SirTunnel, a personal ngrok alternative
2023-01-27T10:35:44+00:00
http://eighty-twenty.org/2023/01/27/sirtunnel-personal-ngrok
tonyg
<p>Happy New Year!</p>
<p>From time to time I need to expose a development web site or web service to the world. In the
past, I’ve used <a href="https://ngrok.com/">ngrok</a> for that, and of course long ago I built
<a href="http://reversehttp.net/">ReverseHTTP</a> which is somewhere in the same ballpark, but I recently
got fed up with the state of affairs and decided to see whether there was something simple I
could run myself to do the job.</p>
<p>I found Anders Pitman’s <a href="https://github.com/anderspitman/SirTunnel">SirTunnel</a>:</p>
<blockquote>
<p>Minimal, self-hosted, 0-config alternative to ngrok. Caddy+OpenSSH+50 lines of Python.</p>
</blockquote>
<p>It really is desperately simple. A beautiful bit of engineering. At its heart, it scripts
<a href="https://caddyserver.com/">Caddy</a>’s API to add and remove tunnels on the fly. When you SSH into
your server, you invoke the script, and for the duration of the SSH connection, a subdomain of
your server’s domain forwards traffic across the SSH link.</p>
<p>I’ve <a href="https://github.com/tonyg/SirTunnel">forked</a> the code for myself. So far, I haven’t
changed much: the script cleans up stale registrations at startup, as well as at exit, in case
a previous connection was interrupted somehow; and I’ve added support for forwarding to local
TLS services, with optional “insecure-mode” for avoiding certificate identity checks.</p>
<p>To get it running on a VM in the cloud, install Caddy (there’s a
<a href="https://packages.debian.org/sid/caddy"><code class="language-plaintext highlighter-rouge">caddy</code></a> package for Debian bookworm and sid), then
<em>disable</em> the systemd <code class="language-plaintext highlighter-rouge">caddy</code> service and <em>enable</em> the <code class="language-plaintext highlighter-rouge">caddy-api</code> service:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">apt <span class="nb">install </span>caddy
systemctl disable caddy
systemctl <span class="nb">enable </span>caddy-api
systemctl stop caddy
systemctl start caddy-api</code></pre></figure>
<p>Set up a wildcard DNS record for your server - something like <code class="language-plaintext highlighter-rouge">*.demo.example.com</code>. Each tunnel
will be made available on a subdomain of <code class="language-plaintext highlighter-rouge">demo.example.com</code>.</p>
<p>Then use the API to upload a simple “global” config. Here’s mine:</p>
<figure class="highlight"><pre><code class="language-json" data-lang="json"><span class="p">{</span><span class="w">
</span><span class="nl">"apps"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"http"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"servers"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"default"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"logs"</span><span class="p">:</span><span class="w"> </span><span class="p">{},</span><span class="w">
</span><span class="nl">"listen"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">":443"</span><span class="p">],</span><span class="w">
</span><span class="nl">"routes"</span><span class="p">:</span><span class="w"> </span><span class="p">[]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span></code></pre></figure>
<p>Upload it by putting it in a file <code class="language-plaintext highlighter-rouge">caddy_global.json</code> and run</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">curl <span class="nt">-L</span> localhost:2019/load <span class="nt">-H</span> <span class="s1">'Content-Type: application/json'</span> <span class="nt">-d</span> @caddy_global.json</code></pre></figure>
<p>Then, make sure SirTunnel’s <code class="language-plaintext highlighter-rouge">sirtunnel.py</code> script is available somewhere on the server to your
SSH user account.</p>
<p>At that point, to expose a local development service running on port 8443 to the world:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">ssh <span class="nt">-t</span> <span class="nt">-R</span> 8443:localhost:8443 YOURSERVER path/to/sirtunnel.py YOURAPP.demo.example.com 8443</code></pre></figure>
<p>I wrapped that up in a tiny script so that I didn’t have to remember the details of that
incantation, but it’s simple enough that you could easily just type it in the terminal each
time.</p>
<p>Many thanks to Anders Pitman for a really nice piece of software!</p>