<p><em>eighty-twenty news feed for tag "distsys" · authors tonyg, mikeb · updated 2024-01-24</em></p>
<h1>File distribution over DNS: (ab)using DNS as a CDN</h1>
<p>tonyg · 2023-07-31 · <a href="http://eighty-twenty.org/2023/07/31/dns-as-a-cdn">http://eighty-twenty.org/2023/07/31/dns-as-a-cdn</a></p>
<p>This is the story of a one-afternoon hack that turned into a one-weekend hack.</p>
<p>I woke up on Saturday with a silly idea: what would it be like to use <a href="https://en.wikipedia.org/wiki/Chunking_(computing)#In_data_deduplication,_data_synchronization_and_remote_data_compression">content-defined chunking
(CDC)</a>
and serve the chunks over DNS? DNS caching could make scaleout and incremental updates quite
efficient, at least in theory. Strong hashing gives robust download integrity. DNS replication
gives you a kind of high availability, even!</p>
<p>After a coffee, I figured I may as well try it out.</p>
<p><strong>TL;DR.</strong> It works, more or less, so long as your resolver properly upgrades to DNS-over-TCP
when it gets a truncated UDP response. The immutable, strongly-named chunks are served in TXT
records (!) and are cached by resolvers for a long time. This lowers load on the authoritative
DNS server. Each stored “file” gets a user-friendly name for its root chunk in the form of a
CNAME with a much shorter TTL.</p>
<p>You can try it out with:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">docker run <span class="nt">-i</span> <span class="nt">--rm</span> leastfixedpoint/nscdn <span class="se">\</span>
nscdn get SEKIENAKASHITA.DEMO.NSCDN.ORG. <span class="o">></span> SekienAkashita.jpg</code></pre></figure>
<p>which downloads <a href="https://en.wikipedia.org/wiki/Akashita#/media/File:SekienAkashita.jpg">this
image</a> using nothing but
DNS.</p>
<p>A demo server is running on the domain <code class="language-plaintext highlighter-rouge">demo.nscdn.org</code> and the code is available at
<a href="https://gitlab.com/tonyg/nscdn">https://gitlab.com/tonyg/nscdn</a>.</p>
<h2 id="how-it-works">How it works</h2>
<p>The <a href="https://github.com/ronomon/deduplication/blob/master/README.md">Ronomon variant</a> of
<a href="https://www.usenix.org/system/files/conference/atc16/atc16-paper-xia.pdf">FastCDC</a>, due to
<a href="https://github.com/jorangreef">Joran Greef</a>, is a JavaScript-friendly, 32-bit content-defined
chunking algorithm that splits big files into chunks with a distribution of sizes having an
average, minimum and maximum length.<sup id="fnref:distribution-is-weird" role="doc-noteref"><a href="#fn:distribution-is-weird" class="footnote" rel="footnote">1</a></sup></p>
<p>For this project I chose an 8k minimum size, a 16k average, and a 48k upper limit, because of
the limitations involved in serving large amounts of data via DNS.</p>
<p>The core idea is to use the CDC algorithm to slice up a data file, and then construct a broad,
shallow <a href="https://en.wikipedia.org/wiki/Merkle_tree">Merkle tree</a> from it, serving each leaf and
inner node of the tree as a separate DNS TXT record associated with a domain label including
the Base32 encoding of the <a href="https://www.blake2.net/">BLAKE2b</a> hash of the data.</p>
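<p>As a rough sketch of that naming scheme (the 32-byte digest size, the unpadded Base32, and the exact <code class="language-plaintext highlighter-rouge">TYPE-HASH</code> label shape are my guesses from the labels shown below, not taken from the nscdn source):</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">```python
# Sketch: derive a DNS label for a tree node, as described above.
# Assumptions (inferred, not from the nscdn source): a 32-byte BLAKE2b
# digest, unpadded RFC 4648 Base32, and a "TYPE-HASH" label format.
import base64
import hashlib

LEAF = 1   # type code for leaf nodes
INDEX = 2  # type code for inner (index) nodes

def chunk_label(node_type: int, content: bytes) -> str:
    digest = hashlib.blake2b(content, digest_size=32).digest()
    b32 = base64.b32encode(digest).decode("ascii").rstrip("=")
    return f"{node_type}-{b32}"

label = chunk_label(LEAF, b"example chunk data")
print(label)  # "1-" followed by 52 Base32 characters
```</code></pre></figure>
<p>A 32-byte digest Base32-encodes (padding stripped) to 52 characters, which matches the label lengths in the listing below.</p>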
<p>The example file from above, <code class="language-plaintext highlighter-rouge">SekienAkashita.jpg</code>, is split up into five chunks and one inner
node that lists the chunks:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SEKIENAKASHITA.DEMO.NSCDN.ORG (CNAME)
└─⯈ 2-NPNQZ4OQGZCXTLGFT6SGXO5TFLVZWMCFOXAV6GEYNDPYL67QSMPQ.DEMO.NSCDN.ORG (TXT, 320 bytes)
├─⯈ 1-3NZRAZ4RTLWEFFSE27EBEVNKFOM6SFI4ISQEH7NGNPCG2CS4SIUA.··· (TXT, 22366 bytes)
├─⯈ 1-ZYNVPQYLBJYT7EOWRUDNY6XSURS3KWOMTQUODKUCCWXXQLKZFF3Q.··· (TXT, 8282 bytes)
├─⯈ 1-VPJYS4PUGX275YDFEKZF6BC3SXYZDMNTANV5LXMDWMB2PDSKWJPQ.··· (TXT, 16303 bytes)
├─⯈ 1-IBGDKGR2IXRIISKASJIP5CLPLHCUSGE5V6SVRWKHFHJSIAZVXOHQ.··· (TXT, 18696 bytes)
└─⯈ 1-QT7EHDMMEVKJF77MOL4T4PXU3FGSCBXRVNFMYJK4NOQ4BJ6I7YCA.··· (TXT, 43819 bytes)
</code></pre></div></div>
<p>For a larger file—say, the Linux kernel—the index node itself
would be longer than permitted for a chunk, so it would itself be split up, and another level
would be added to the tree to index the chunks of the lower-level node. This recursion can be
repeated as required. I chose a 64-bit file size limit.</p>
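<p>The level-by-level construction can be sketched like this (a toy illustration only: the fanout of 512 matches the prototype's 512-pointer index nodes mentioned later, but the real pointer serialization and the step that publishes each node as a TXT record are elided):</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">```python
# Sketch of the tree construction: hash each chunk, then repeatedly
# group hashes into index nodes (up to FANOUT pointers each) until a
# single root remains. Names and serialization are illustrative only.
import hashlib

FANOUT = 512

def label_of(content: bytes) -> bytes:
    return hashlib.blake2b(content, digest_size=32).digest()

def build_tree(chunks: list[bytes]) -> bytes:
    """Return the root node's hash; a real version would also store
    every leaf and index node as a TXT record."""
    level = [label_of(c) for c in chunks]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), FANOUT):
            index_node = b"".join(level[i:i + FANOUT])  # concatenated pointers
            parents.append(label_of(index_node))
        level = parents
    return level[0]
```</code></pre></figure>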
<h2 id="storing-a-tree-in-dns">Storing a tree in DNS</h2>
<p>A CNAME record from the human-usable name of each file (<code class="language-plaintext highlighter-rouge">SEKIENAKASHITA.DEMO.NSCDN.ORG</code>) points
to the DNS record of the root node of the file’s Merkle tree.</p>
<p>Each tree node is stored as raw binary (!) in a DNS TXT record. Each record’s DNS label is
formed from the type code of the node and the base32 encoding of the BLAKE2b hash of the binary
content.</p>
<p>The content of leaf nodes (type 1) is just the binary data associated with the chunk. Each
inner node (type 2) is a sequence of 64-byte “pointer” records containing a binary form of
child nodes’ DNS labels, along with a length for each child. This allows random-access
retrieval of subranges of each file.</p>
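<p>Packing and unpacking one such 64-byte pointer might look like this. The field layout here—an 8-byte big-endian type code, an 8-byte big-endian child length, 16 reserved zero bytes, then the child's 32-byte hash—is my reading of the example data, so treat it as an assumption rather than a specification:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">```python
# Sketch of one 64-byte pointer record inside a type-2 (inner) node.
# Layout is inferred, not authoritative: 8-byte big-endian type code,
# 8-byte big-endian child length, 16 reserved/zero bytes, and the
# child's 32-byte BLAKE2b hash.
import struct

POINTER = struct.Struct(">QQ16x32s")  # 8 + 8 + 16 + 32 = 64 bytes

def pack_pointer(node_type: int, length: int, child_hash: bytes) -> bytes:
    return POINTER.pack(node_type, length, child_hash)

def unpack_pointer(record: bytes) -> tuple[int, int, bytes]:
    return POINTER.unpack(record)
```</code></pre></figure>
<p>With the type and length up front, a client can compute which children cover a requested byte range before fetching any of them.</p>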
<!--
Here's the 320-byte content of the node labeled
`2-NPNQZ4OQGZCXTLGFT6SGXO5TFLVZWMCFOXAV6GEYNDPYL67QSMPQ` above, the root index node of our
`SekienAkashita.jpg` example:
```
0000000000000001000000000000575E00000000000000000000000000000000
DB731067919AEC429644D7C81255AA2B99E9151C44A043FDA66BC46D0A5C9228
0000000000000001000000000000205A00000000000000000000000000000000
CE1B57C30B0A713F91D68D06DC7AF2A465B559CC9C28E1AA8215AF782D592977
00000000000000010000000000003FAF00000000000000000000000000000000
ABD38971F435F5FEE06522B25F045B95F191B1B3036BD5DD83B303A78E4AB25F
0000000000000001000000000000490800000000000000000000000000000000
404C351A3A45E28449409250FE896F59C549189DAFA558D94729D3240335BB8F
0000000000000001000000000000AB2B00000000000000000000000000000000
84FE438D8C255492FFEC72F93E3EF4D94D2106F1AB4ACC255C6BA1C0A7C8FE04
```
-->
<p>Weirdly, this is all completely standards-compliant use of DNS. The only place I’m pushing the
limits is the large-ish TXT records, effectively mandating use of DNS’s fallback TCP mode.</p>
<h2 id="serving-the-data-stick-it-in-sqlite-tiny-server-job-done">Serving the data: stick it in SQLite, tiny server, job done</h2>
<p>Any ordinary DNS server can be used to serve the records from a domain under one’s control.</p>
<p>However, I decided to hack together my own little one that could serve records straight out of
a SQLite database. I used it as an excuse to experiment with Golang again for the first time in
more than a decade.<sup id="fnref:verdict-on-go" role="doc-noteref"><a href="#fn:verdict-on-go" class="footnote" rel="footnote">2</a></sup></p>
<p>I cobbled together a program called
<a href="https://gitlab.com/tonyg/nscdn/-/blob/main/cmd/nscdnd/nscdnd.go"><code class="language-plaintext highlighter-rouge">nscdnd</code></a> which uses</p>
<ul>
<li><a href="https://github.com/miekg/dns">github.com/miekg/dns</a> as a DNS codec and server,</li>
<li><a href="https://github.com/mattn/go-sqlite3">github.com/mattn/go-sqlite3</a> to access the SQLite database, and</li>
<li><a href="https://github.com/sirupsen/logrus">github.com/sirupsen/logrus</a> for logging.</li>
</ul>
<p>Similarly, a little tool called
<a href="https://gitlab.com/tonyg/nscdn/-/blob/main/cmd/nscdn/nscdn.go"><code class="language-plaintext highlighter-rouge">nscdn</code></a> allows insertion
(<code class="language-plaintext highlighter-rouge">nscdn add</code>) and deletion (<code class="language-plaintext highlighter-rouge">nscdn del</code>) of files from a database, plus retrieval and
reassembly of a file from a given domain name (<code class="language-plaintext highlighter-rouge">nscdn get</code>).</p>
<p>The bulk of the interesting code is in</p>
<ul>
<li><a href="https://gitlab.com/tonyg/nscdn/-/blob/main/pkg/hashtree/hashtree.go">leastfixedpoint.com/nscdn/pkg/hashtree</a>
(Merkle tree),</li>
<li><a href="https://gitlab.com/tonyg/nscdn/-/blob/main/pkg/chunking/ronomon/ronomon.go">leastfixedpoint.com/nscdn/pkg/chunking/ronomon</a>
(CDC chunking), and</li>
<li><a href="https://gitlab.com/tonyg/nscdn/-/blob/main/pkg/nscdn/nscdn.go">leastfixedpoint.com/nscdn/pkg/nscdn</a>
(retrieval, integrity-verification, and reassembly of records from DNS).</li>
</ul>
<h2 id="okay-but-is-this-a-good-idea">Okay, but is this a good idea?</h2>
<p>¯\_(ツ)_/¯ It works surprisingly well in the limited testing I’ve done. I doubt it’s the
most efficient way to transfer files, but it’s not wildly unreasonable. The idea of getting the
chunks cached by caching resolvers between clients and the authoritative server seems to work
well: when I re-download something, it only hits the authoritative server for the short-lived
CNAME and the root-node TXT record. The other nodes in the tree seem to be cached somewhat
locally to my client.</p>
<h2 id="try-it-out-yourself">Try it out yourself!</h2>
<p>You can retrieve files from the demo server on <code class="language-plaintext highlighter-rouge">demo.nscdn.org</code>, as previously mentioned:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">docker run <span class="nt">-i</span> <span class="nt">--rm</span> leastfixedpoint/nscdn <span class="se">\</span>
nscdn get SEKIENAKASHITA.DEMO.NSCDN.ORG. <span class="o">></span> SekienAkashita.jpg</code></pre></figure>
<p>You can also run an <code class="language-plaintext highlighter-rouge">nscdnd</code> instance for a domain you control. All the following examples use
docker, but you can just check out <a href="https://gitlab.com/tonyg/nscdn">the repository</a> and build
it yourself too.</p>
<p>To run a server:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">docker run <span class="nt">-d</span> <span class="nt">-p</span> 53:53 <span class="nt">-p</span> 53:53/udp <span class="se">\</span>
<span class="nt">-v</span> <span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>/store.sqlite3:/data/store.sqlite3 <span class="se">\</span>
<span class="nt">--env</span> <span class="nv">NSCDN_ROOT</span><span class="o">=</span>your.domain.example.com <span class="se">\</span>
<span class="nt">--name</span> nscdnd <span class="se">\</span>
leastfixedpoint/nscdn</code></pre></figure>
<p>and add files to the store:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">docker run <span class="nt">-i</span> <span class="nt">--rm</span> <span class="se">\</span>
<span class="nt">-v</span> <span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>/store.sqlite3:/data/store.sqlite3 <span class="se">\</span>
leastfixedpoint/nscdn <span class="se">\</span>
nscdn add /data/store.sqlite3 SOMEFILENAME < SomeFilename.bin</code></pre></figure>
<p>Then add an <code class="language-plaintext highlighter-rouge">NS</code> record pointing to it:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>your.domain.example.com. 86400 IN NS your.nscdnd.server.example.com.
</code></pre></div></div>
<p>and retrieve your files:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">docker run <span class="nt">-i</span> <span class="nt">--rm</span> leastfixedpoint/nscdn <span class="se">\</span>
nscdn get SOMEFILENAME.your.domain.example.com</code></pre></figure>
<p>You can also <code class="language-plaintext highlighter-rouge">dig +short -t txt your.domain.example.com</code> to get information about the running
server.</p>
<p>For the <code class="language-plaintext highlighter-rouge">demo.nscdn.org</code> server, it looks like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ dig +short -t txt demo.nscdn.org
"server=nscdnd" "version=v0.3.1" "SPDX-License-Identifier=AGPL-3.0-or-later" "source=https://gitlab.com/tonyg/nscdn"
</code></pre></div></div>
<p>Finally, you can use <code class="language-plaintext highlighter-rouge">dig</code> to retrieve CNAME and TXT records without using the <code class="language-plaintext highlighter-rouge">nscdn</code> tool:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ dig +short -t any SEKIENAKASHITA.DEMO.NSCDN.ORG
2-NPNQZ4OQGZCXTLGFT6SGXO5TFLVZWMCFOXAV6GEYNDPYL67QSMPQ.DEMO.NSCDN.ORG.
"\000\000\000\000\000\000\000\001\000\000\000\000\000\000W^\000\000\000...
$ dig +short -t txt 1-ZYNVPQYLBJYT7EOWRUDNY6XSURS3KWOMTQUODKUCCWXXQLKZFF3Q.DEMO.NSCDN.ORG.
"\143q\254\239\254\247\253\211\131\199\152t\205b\238\207J\190\212\159INP...
</code></pre></div></div>
<p>(Dig presents the TXT records using the slightly peculiar decimal escaping syntax from <a href="https://datatracker.ietf.org/doc/html/rfc1035#section-5.1">RFC
1035</a>.)</p>
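<p>If you want the raw bytes back, a few lines of Python (mine, written for this note, not part of nscdn) suffice to undo that escaping:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">```python
# Decode dig's RFC 1035 presentation format for TXT character-strings:
# a backslash followed by three decimal digits is one raw byte, and a
# backslash followed by any other character is that character literally.
def decode_rfc1035(s: str) -> bytes:
    out = bytearray()
    i = 0
    while i < len(s):
        if s[i] == "\\":
            if s[i + 1:i + 4].isdigit():
                out.append(int(s[i + 1:i + 4]))  # \DDD escape
                i += 4
            else:
                out.append(ord(s[i + 1]))        # \X literal escape
                i += 2
        else:
            out.append(ord(s[i]))
            i += 1
    return bytes(out)

print(decode_rfc1035(r"\143q\254").hex())  # -> 8f71fe
```</code></pre></figure>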
<h2 id="future-directions">Future directions</h2>
<p><strong>Compression of individual chunks.</strong> At the moment, chunks are served uncompressed. It’d be a
friendly thing to do to compress the contents of each TXT record.</p>
<p><strong>Experimenting with chunk sizes.</strong> Is the distribution of chunk sizes with the current
parameters reasonable? Are smaller chunks required, operationally, given we’re kind of pushing
DNS to its limits here? Could larger chunk sizes work?</p>
<p><strong>How does it perform?</strong> How could performance be improved?</p>
<p><strong>Garbage-collection of chunks in a store.</strong> At present, running <code class="language-plaintext highlighter-rouge">nscdn del</code> just removes the
CNAME link to a root chunk. It doesn’t traverse the graph to remove unreferenced chunks from
the store.</p>
<p><strong>Incremental downloads, partial downloads, recovery/repair of files.</strong></p>
<p><strong>Statistics on sharing of chunks in a store.</strong> Say you used a store to distribute multiple
releases of a piece of software. It’d be interesting to know how much sharing of chunks exists
among the different release files.</p>
<p><strong>Download and assemble files in the browser?</strong> So DNS-over-HTTPS is a thing. What happens if
we use, for example, the <a href="https://developers.cloudflare.com/1.1.1.1/encryption/dns-over-https/make-api-requests/dns-json/">Cloudflare DoH
API</a>
to retrieve our chunks?</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="nx">JSON</span><span class="p">.</span><span class="nx">stringify</span><span class="p">(</span><span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span>
<span class="dl">"</span><span class="s2">https://cloudflare-dns.com/dns-query?name=demo.nscdn.org&type=TXT</span><span class="dl">"</span><span class="p">,</span>
<span class="p">{</span> <span class="na">headers</span><span class="p">:</span> <span class="p">{</span> <span class="dl">"</span><span class="s2">accept</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">application/dns-json</span><span class="dl">"</span> <span class="p">}</span> <span class="p">})).</span><span class="nx">json</span><span class="p">(),</span> <span class="kc">null</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
<span class="err">⟶</span> <span class="p">{</span>
<span class="dl">"</span><span class="s2">Status</span><span class="dl">"</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">TC</span><span class="dl">"</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span> <span class="dl">"</span><span class="s2">RD</span><span class="dl">"</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span> <span class="dl">"</span><span class="s2">RA</span><span class="dl">"</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span> <span class="dl">"</span><span class="s2">AD</span><span class="dl">"</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span> <span class="dl">"</span><span class="s2">CD</span><span class="dl">"</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">Question</span><span class="dl">"</span><span class="p">:</span> <span class="p">[{</span>
<span class="dl">"</span><span class="s2">name</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">demo.nscdn.org</span><span class="dl">"</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">type</span><span class="dl">"</span><span class="p">:</span> <span class="mi">16</span>
<span class="p">}],</span>
<span class="dl">"</span><span class="s2">Answer</span><span class="dl">"</span><span class="p">:</span> <span class="p">[{</span>
<span class="dl">"</span><span class="s2">name</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">demo.nscdn.org</span><span class="dl">"</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">type</span><span class="dl">"</span><span class="p">:</span> <span class="mi">16</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">TTL</span><span class="dl">"</span><span class="p">:</span> <span class="mi">284</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">data</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="se">\"</span><span class="s2">server=nscdnd</span><span class="se">\"\"</span><span class="s2">version=v0.3.1</span><span class="se">\"\"</span><span class="s2">SPDX-License-Identifier=AGPL-3.0-or-later</span><span class="se">\"\"</span><span class="s2">source=https://gitlab.com/tonyg/nscdn</span><span class="se">\"</span><span class="dl">"</span>
<span class="p">}]</span>
<span class="p">}</span></code></pre></figure>
<p>Hmm. Promising. Unfortunately, it doesn’t seem to like the TXT records having binary data in
them (bug? certainly TXT records are allowed to hold binary data, see
<a href="https://datatracker.ietf.org/doc/html/rfc1035#section-3.3.14">here</a> and then
<a href="https://datatracker.ietf.org/doc/html/rfc1035#section-3.3">here</a>), so it might not Just Work.
Perhaps a little redesign to use Base64 in the TXT record bodies is required; or perhaps
the <code class="language-plaintext highlighter-rouge">application/dns-message</code> binary-format API would work.</p>
<p><strong>Encryption of stored files.</strong> Perhaps something simple like
<a href="https://github.com/FiloSottile/age">age</a>.</p>
<p><strong>Insertion of data across the network.</strong> While <a href="https://en.wikipedia.org/wiki/Dynamic_DNS">Dynamic
DNS</a> is a thing, it’s not quite suitable for this
purpose. Perhaps some way of inserting or deleting files other than the current
ssh-to-the-server-and-run-a-command would be nice.</p>
<p><strong>Content-defined chunking of index nodes.</strong> For this quick prototype, I split non-leaf nodes
in a tree into 512-pointer chunks, but it’d be nicer to split them using some variation on the
CDC algorithm already used to split the raw binary data in the file. That way, incremental
updates would be more resilient to insertions and deletions.</p>
<p><strong>Think more about Named Data Networking.</strong> Remember <a href="https://named-data.net/">Named Data Networking
(NDN)</a>? It’s an alternative Internet architecture, initially kicked
off by Van Jacobson and colleagues. (It used to be known as <a href="https://en.wikipedia.org/wiki/Content_centric_networking">CCN, Content-Centric
Networking</a>.) This DNS hack is a
cheap-and-nasty system that has quite a bit in common with the design ideas of NDN, though
obviously it’s desperately unsophisticated by comparison.</p>
<h2 id="the-end">The End</h2>
<p>Anyway, I had a lot of fun hacking this together and relearning Go, though I was a little
embarrassed when I found myself spending a lot of time at the beginning browsing for a domain
to buy to show it off…</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:distribution-is-weird" role="doc-endnote">
<p>The distribution of chunk sizes is pretty strange though. I don’t
know in what sense the “average” actually <em>is</em> an average; the minimum and maximum cut-offs
are enforced by the algorithm though. For more detail, see <a href="https://www.usenix.org/system/files/conference/atc16/atc16-paper-xia.pdf">the FastCDC
paper</a>. <a href="#fnref:distribution-is-weird" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:verdict-on-go" role="doc-endnote">
<p>So I have, uh, <em>criticisms</em> of Go. But it’s nothing that hasn’t been said
before. Let’s just say that the experience reminded me of programming JavaScript in the bad
old days before TC39 really got going. Overall the experience was “ok, I guess”. <a href="#fnref:verdict-on-go" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
<h1>SirTunnel, a personal ngrok alternative</h1>
<p>tonyg · 2023-01-27 · <a href="http://eighty-twenty.org/2023/01/27/sirtunnel-personal-ngrok">http://eighty-twenty.org/2023/01/27/sirtunnel-personal-ngrok</a></p>
<p>Happy New Year!</p>
<p>From time to time I need to expose a development web site or web service to the world. In the
past, I’ve used <a href="https://ngrok.com/">ngrok</a> for that, and of course long ago I built
<a href="http://reversehttp.net/">ReverseHTTP</a> which is somewhere in the same ballpark, but I recently
got fed up with the state of affairs and decided to see whether there was something simple I
could run myself to do the job.</p>
<p>I found Anders Pitman’s <a href="https://github.com/anderspitman/SirTunnel">SirTunnel</a>:</p>
<blockquote>
<p>Minimal, self-hosted, 0-config alternative to ngrok. Caddy+OpenSSH+50 lines of Python.</p>
</blockquote>
<p>It really is desperately simple. A beautiful bit of engineering. At its heart, it scripts
<a href="https://caddyserver.com/">Caddy</a>’s API to add and remove tunnels on the fly. When you SSH into
your server, you invoke the script, and for the duration of the SSH connection, a subdomain of
your server’s domain forwards traffic across the SSH link.</p>
<p>I’ve <a href="https://github.com/tonyg/SirTunnel">forked</a> the code for myself. So far, I haven’t
changed much: the script cleans up stale registrations at startup, as well as at exit, in case
a previous connection was interrupted somehow; and I’ve added support for forwarding to local
TLS services, with optional “insecure-mode” for avoiding certificate identity checks.</p>
<p>To get it running on a VM in the cloud, install Caddy (there’s a
<a href="https://packages.debian.org/sid/caddy"><code class="language-plaintext highlighter-rouge">caddy</code></a> package for Debian bookworm and sid), then
<em>disable</em> the systemd <code class="language-plaintext highlighter-rouge">caddy</code> service and <em>enable</em> the <code class="language-plaintext highlighter-rouge">caddy-api</code> service:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">apt <span class="nb">install </span>caddy
systemctl disable caddy
systemctl <span class="nb">enable </span>caddy-api
systemctl stop caddy
systemctl start caddy-api</code></pre></figure>
<p>Set up a wildcard DNS record for your server, something like <code class="language-plaintext highlighter-rouge">*.demo.example.com</code>. Each tunnel
will be made available on a subdomain of <code class="language-plaintext highlighter-rouge">demo.example.com</code>.</p>
<p>Then use the API to upload a simple “global” config. Here’s mine:</p>
<figure class="highlight"><pre><code class="language-json" data-lang="json"><span class="p">{</span><span class="w">
</span><span class="nl">"apps"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"http"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"servers"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"default"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"logs"</span><span class="p">:</span><span class="w"> </span><span class="p">{},</span><span class="w">
</span><span class="nl">"listen"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">":443"</span><span class="p">],</span><span class="w">
</span><span class="nl">"routes"</span><span class="p">:</span><span class="w"> </span><span class="p">[]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span></code></pre></figure>
<p>Upload it by putting it in a file <code class="language-plaintext highlighter-rouge">caddy_global.json</code> and run</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">curl <span class="nt">-L</span> localhost:2019/load <span class="nt">-H</span> <span class="s1">'Content-Type: application/json'</span> <span class="nt">-d</span> @caddy_global.json</code></pre></figure>
<p>Then, make sure SirTunnel’s <code class="language-plaintext highlighter-rouge">sirtunnel.py</code> script is available somewhere on the server to your
SSH user account.</p>
<p>At that point, to expose a local development service running on port 8443 to the world:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">ssh <span class="nt">-t</span> <span class="nt">-R</span> 8443:localhost:8443 YOURSERVER path/to/sirtunnel.py YOURAPP.demo.example.com 8443</code></pre></figure>
<p>I wrapped that up in a tiny script so that I didn’t have to remember the details of that
incantation, but it’s simple enough that you could easily just type it in the terminal each
time.</p>
<p>Many thanks to Anders Pitman for a really nice piece of software!</p>
<h1>Lying to TCP makes it a best-effort streaming protocol</h1>
<p>tonyg · 2018-02-01 · <a href="http://eighty-twenty.org/2018/02/01/lying-to-tcp">http://eighty-twenty.org/2018/02/01/lying-to-tcp</a></p>
<p>In
<a href="http://allthingslinguistic.com/post/170321247615">this All Things Linguistic post</a>
Gretchen and Lauren introduce a model dialogue:</p>
<blockquote>
<p>G: “Would you like some coffee?”<br />
L: “Coffee would keep me awake.”</p>
</blockquote>
<p>This might mean Lauren wants coffee—or that she doesn’t want coffee!
Her reply on its own is ambiguous. It must be interpreted in a wider
context. The <a href="http://allthingslinguistic.com/post/170321247615">post</a> digs into the dialogue more
deeply.</p>
<p>In linguistics, the
<a href="https://en.wikipedia.org/wiki/Cooperative_principle">Cooperative Principle</a>
can be used to set up a frame within which utterances like these can
be interpreted.</p>
<p><strong>Network protocols are just the same.</strong> Utterances—packets sent
along the wire—must be interpreted using a similar <em>assumption of
cooperation</em>. And, just as in human conversation, an utterance can
have a deeper, more ambiguous meaning than it seems on its face.</p>
<h3 id="acknowledgements-in-tcp">Acknowledgements in TCP</h3>
<p>In TCP, the “ack” field contains a sequence number that is supposed to
be used by a receiving peer to signal the sending peer that bytes
up-to-but-not-including that sequence number have been safely received.</p>
<p>If it’s used that way, it makes TCP a reliable, in-order protocol.</p>
<h3 id="lying-to-the-sending-tcp">Lying to the sending TCP</h3>
<p>But imagine a receiver that didn’t very much care about reliability
beyond that offered by the underlying transport. For example, a VOIP
receiver, where delayed or lost voice packets are useless, and in fact
harmful if carried over TCP because they delay packets travelling
behind them, causing the whole conversation to get more and more
delayed.</p>
<p>That receiver could <em>lie</em>. It could use the “ack” sequence number field
to indicate which bytes it <em>no longer cared about</em>.</p>
<p>In effect, it would ignore lost packets, and use the “ack” field to
tell the sender not to bother retransmitting them.<sup id="fnref:fiddly-details" role="doc-noteref"><a href="#fn:fiddly-details" class="footnote" rel="footnote">1</a></sup></p>
<p>That decision would make TCP a best-effort, in-order protocol.</p>
<h3 id="is-it-really-a-lie">Is it really a lie?</h3>
<p>Seen one way, this abuse of the “ack” field isn’t a lie. In both cases,
the receiver doesn’t care about the bytes anymore: on the one hand,
because they’ve been received successfully; on the other, because
they’re irrelevant now.</p>
<p>It’s similar to the ambiguity about the coffee in the example above.</p>
<p>The surface meaning is clear. The deeper meaning depends on knowledge
about the wider context that isn’t carried in the utterance itself.</p>
<h3 id="we-can-assume-cooperation-without-assuming-motivation">We can assume cooperation without assuming motivation</h3>
<p>The statement “Don’t send me bytes numbered lower than X” can be
interpreted either as “I have received all bytes numbered lower than X”
or “I no longer care about bytes numbered lower than X”. Only context
can tell them apart.</p>
<p>.</p>
<p>.</p>
<hr />
<p>PS. My dissertation proposes <a href="http://syndicate-lang.org/">Syndicate</a>,
a design for new programming language features for concurrency.
Syndicate incorporates ideas about conversation and cooperativity like
those discussed here. (See, in particular,
<a href="http://syndicate-lang.org/tonyg-dissertation/html/#CHAP:PHILOSOPHY-AND-OVERVIEW">chapter 2</a>
of the dissertation.) It’s still early days, but I’ve put up a
resource page at <a href="http://syndicate-lang.org/tonyg-dissertation/">http://syndicate-lang.org/tonyg-dissertation/</a> that
has the dissertation itself, a video of the defense talk I gave, and a
few other resources.</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:fiddly-details" role="doc-endnote">
<p>Of course, I’m ignoring fiddly details like
segment boundaries, interactions between this strategy and flow-
and congestion-control, and so on. It’d take a lot of work to
really do this! <a href="#fnref:fiddly-details" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
<h1>gRPC.io is interestingly different from CORBA</h1>
<p>tonyg · 2015-08-28 · <a href="http://eighty-twenty.org/2015/08/28/grpc-dot-io">http://eighty-twenty.org/2015/08/28/grpc-dot-io</a></p>
<p><a href="http://www.grpc.io/">gRPC</a> looks very interesting. From a quick
browse of the site, it looks like it differs from CORBA primarily in
that</p>
<ul>
<li>It is first-order.</li>
<li>It eschews exceptions.</li>
<li>It supports streaming requests and/or responses.</li>
</ul>
<p>(That’s setting aside differences between protobufs and GIOP.)</p>
<p>It’s the first point that I think is likely to be the big win. Much of
the complexity I saw with CORBA was to do with trying to pass object
(i.e. service endpoint) references back and forth in a transparent way.
Drop that misfeature, and everything from the IDL to the protocol to the
frameworks to the error handling to the implementations of services
themselves will be much simpler.</p>
<p>The way streaming is integrated is interesting too. There’s a clear
separation between (finite) data, including lists/arrays, in the
protobuf message-definition language, and (possibly non-finite) behavior
in the gRPC service-definition language. Streams, being coinductive, fit
naturally in the service-definition part.</p>
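<p>To make that separation concrete, here is a hypothetical service definition (the message and service names are invented for illustration; only the <code class="language-plaintext highlighter-rouge">stream</code> keyword and the message/service split are gRPC’s own):</p>
<figure class="highlight"><pre><code class="language-protobuf" data-lang="protobuf">```protobuf
// Illustrative IDL only: messages (finite data, including repeated
// fields) live in protobuf; behaviour, including possibly non-finite
// streams, lives in the gRPC service definition.
syntax = "proto3";

message Reading {          // finite data: a plain record
  string sensor_id = 1;
  double value = 2;
}

message Summary {
  double mean = 1;
}

service Telemetry {
  // unary: one request, one response
  rpc Summarize(Reading) returns (Summary);
  // server streaming: a possibly non-finite stream of responses
  rpc Watch(Reading) returns (stream Reading);
}
```</code></pre></figure>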