Hot code reloading in Erlang without using an OTP release
Wed 2 Oct 2024 20:12 CEST
“Erlang supports change of code in a running system.”
However, the details are a bit fiddly. Here’s a cheat-sheet I used recently for a simple TCP service written using Erlang.
My program was a single module, running outside of any OTP application context. For a program with more than one module, the instructions here need minor emendation to either explicitly list the modules to purge and reload or to discover all modules within a single application; see the places in server-reload below mentioning the atom my_server.
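As a rough sketch of that emendation, assuming the code is packaged as an OTP application that is already loaded on the node, the module list can be read from the application's .app file with application:get_key/2. The module name reload_sketch and both function names are hypothetical, not part of the original program.

```erlang
-module(reload_sketch).
-export([reload_modules/1, reload_app/1]).

%% Purge and reload an explicit list of modules, pairing each module
%% with the result of code:load_file/1 ({module, M} or {error, Reason}).
reload_modules(Modules) ->
    [begin
         code:purge(M),
         {M, code:load_file(M)}
     end || M <- Modules].

%% Look up the modules listed for a loaded application and reload
%% each of them. An application that is not loaded has no keys.
reload_app(App) ->
    case application:get_key(App, modules) of
        {ok, Modules} -> reload_modules(Modules);
        undefined -> {error, {not_loaded, App}}
    end.
```

Either function could stand in for the purge and load_file calls on my_server in the reload script.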
I did not use the -on_load() directive, because I wanted to be able to use multiple nodes rather than controlling reloads from a single node’s shell REPL, and I couldn’t figure out how to make the two play nicely together.
The Erlang
I exported a code_change/0 function from my module, to be called after loading a new version of the module into a node. It sends a message code_change to each “global” actor in my program (in this case, there was only one).
-export([code_change/0]).

code_change() ->
    io:format("+ code_change~n"),
    %% name registered previously with `global:register_name/2`:
    global:send(name_of_my_global_actor, code_change),
    ok.
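For context, here is a minimal sketch of how such a globally registered actor might be set up and notified. The module global_actor_sketch and its functions are hypothetical; only the name name_of_my_global_actor comes from the code above. Note that global:send/2 exits if the name is not registered, so this variant looks the name up first.

```erlang
-module(global_actor_sketch).
-export([start/0, notify/1, loop/0]).

%% Spawn the actor and register it under a cluster-wide name.
%% global:register_name/2 returns yes on success and no otherwise.
start() ->
    Pid = spawn(?MODULE, loop, []),
    yes = global:register_name(name_of_my_global_actor, Pid),
    Pid.

%% Send Msg to the registered actor, reporting a missing
%% registration instead of exiting.
notify(Msg) ->
    case global:whereis_name(name_of_my_global_actor) of
        undefined -> {error, not_registered};
        Pid when is_pid(Pid) -> Pid ! Msg, ok
    end.

%% Placeholder actor loop: re-enter via a fully qualified call on
%% code_change, terminate on stop.
loop() ->
    receive
        code_change -> ?MODULE:loop();
        stop -> ok
    end.
```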
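That actor distributes the notification on to any inferior actors it is managing, and then does an “MFA” self-call, that is, a fully qualified ?MODULE:index(...) call, which always enters the newest loaded version of the module, to upgrade its own codebase.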
index(Connected) ->
    receive
        code_change ->
            [P ! code_change || {_Peer, P} <- Connected],
            ?MODULE:index(Connected);
        ...
    end.
Similarly, all other notified actors perform “MFA” self-calls.
connection(Sock, Username, IndexPid) ->
    receive
        code_change ->
            ?MODULE:connection(Sock, Username, IndexPid);
        ...
    end.
Actors need to take care to manage upgrades of their state at the same time as they do the “MFA” self-calls.
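One way to handle this, sketched under invented assumptions: suppose the old code kept a bare list of {Peer, Pid} pairs as its state and the new code wants a tagged tuple holding a map. The module state_upgrade_sketch, the state_v2 tag and upgrade_state/1 are all hypothetical. The conversion function is itself called fully qualified, so that it is the new version's converter that runs, not the old one.

```erlang
-module(state_upgrade_sketch).
-export([start/0, loop/1, upgrade_state/1]).

%% Convert any known older state shape to the current one.
%% Old shape (assumed): a list of {Key, Value} pairs.
%% Current shape (assumed): {state_v2, Map}.
upgrade_state(Pairs) when is_list(Pairs) ->
    {state_v2, maps:from_list(Pairs)};
upgrade_state({state_v2, _} = State) ->
    State.

start() ->
    spawn(?MODULE, loop, [[]]).

loop(State) ->
    receive
        code_change ->
            %% Both calls are fully qualified, so the conversion and
            %% the next loop iteration run in the newest loaded version.
            ?MODULE:loop(?MODULE:upgrade_state(State));
        {get_state, From} ->
            From ! {state, State},
            loop(State)
    end.
```

Making the converter accept the current shape as well keeps a repeated code_change message harmless.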
Starting the program
I wanted it to be run by daemontools, so I created the following shell script called run, which daemontools will pick up to start a service:
#!/bin/sh
set -e
erlc -o ebin my_server.erl
exec erl \
    -noshell \
    -pa ebin \
    -sname mainnode \
    -setcookie f98b3a1e-80ec-11ef-b752-0b638e4de31c \
    -s my_server
Pick a fresh random cookie for the -setcookie argument. I used uuid(1).
Then, I created this script, server-reload:
#!/bin/sh
set -e
erlc -o ebin my_server.erl
exec erl \
    -noshell \
    -pa ebin \
    -setcookie f98b3a1e-80ec-11ef-b752-0b638e4de31c \
    -sname undefined \
    -eval "
        ServerNode = 'mainnode@$(hostname -s)',
        io:format(\"ServerNode: ~p~n\", [ServerNode]),
        true = net_kernel:connect_node(ServerNode),
        spawn(ServerNode, fun () ->
            code:purge(my_server),
            code:load_file(my_server),
            ok = my_server:code_change()
        end),
        init:stop()"
Running server-reload causes the source code to be compiled and hot-loaded into the running server.
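To check afterwards that a node is really running the freshly compiled code, one option (a sketch, not something the scripts above do) is to ask the node for code:module_status/1, available since OTP 20. It returns loaded when the loaded code matches the .beam file on disk, modified when the two differ, and not_loaded or removed otherwise. The module reload_check_sketch and its status/2 function are hypothetical.

```erlang
-module(reload_check_sketch).
-export([status/2]).

%% Ask Node for the status of Mod. After a successful reload the
%% expected answer is loaded; modified means the .beam file on disk
%% differs from the code the node currently has loaded.
status(Node, Mod) ->
    rpc:call(Node, code, module_status, [Mod]).
```

From a connected node, this would be called with the server's node name (mainnode at the local short hostname) and my_server.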
Grace notes
Then, I used a git post-receive hook to automatically recompile and reload the code on push to live:
#!/bin/sh
set -e
unset GIT_DIR
cd "$HOME/location-of-checkout-of-server-repository"
git pull --ff-only
./server-reload
That’s it
That’s all. The end result worked well: I used it to run a hotfix to my TCP service with many tens of live, active connections, and not one of them noticed a thing.