Updated aliases by bettio · Pull Request #2326 · atomvm/AtomVM

bettio · 2026-06-02T12:10:31Z

This revamps PR #2027

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

petermm · 2026-06-08T07:27:05Z

Fresh review of everything, usual caveats, assume we need sapiens from here, but very close to ready, issues seems very minor and nitpicky:

PR Review — Process aliases (OTP ≥ 26 semantics)

Scope: last 27 commits (dceedea65..f6af02332), adding erlang:alias/0,
erlang:unalias/1, erlang:monitor/3 with {alias, Mode}, spawn_opt
{monitor, MonitorOpts}, sending to an alias reference, and a new
process reference boxed term shape.

Reviewed: full diff plus mailbox.c, context.c, term.c, term.h,
external_term.c, nifs.c, globalcontext.c, scheduler.c,
opcodesswitch.h, jit.c. Findings cross-checked against the actual tree and
against the pre-PR baseline (HEAD~27).

Overall this is a careful, well-commented implementation with good test
coverage. The reference-term representation, 32-bit padding, static asserts,
and the alias-via-signal design (validating the alias in the owner's own
context, never from the sender) are sound. The issues below are worth
addressing; only #1 and #2 affect correctness in normal use.

Severity summary

#	Severity	Area	Status
1	Major	`spawn_opt` silently ignores `{monitor, BadTerm}`	New in PR
2	Major	Alias message vs. earlier `DOWN` signal ordering	New in PR
3	Minor	ETF process-ref accepts owner pid `0` (`INVALID_PROCESS_ID`)	New in PR
4	Minor	`term_compare` word order for process refs differs from ETF/external	New in PR
5	Minor	Alias→normal conversion drops message on OOM (`reply_demonitor` non-transactional)	New in PR
6	Nit	`context_add_monitor` frees alias via embedded `struct Monitor *`	New in PR
7	Pre-existing	`monitor/3` installs caller-side monitor after unlocking target (DOWN race)	Pre-existing
8	Pre-existing	`mailbox_send_monitor_signal` OOM is unchecked	Pre-existing
—	Positive	`monitor/3` kind-mismatch now unlocks before raising	Fixed in PR

1. Major — `spawn_opt` silently ignores a malformed `{monitor, BadTerm}`

src/libAtomVM/nifs.c#L1572-L1586

if (monitor_term == TRUE_ATOM) {
    monitor_term = term_nil();
}
if (term_is_list(monitor_term)) {
    is_spawn_monitor = true;
    ...
}

monitor_term is interop_proplist_get_value_default(..., MONITOR_ATOM, term_invalid_term()).
For spawn_opt(F, [{monitor, foo}]) or [{monitor, 123}], monitor_term
becomes foo/123 — not a list and not TRUE_ATOM — so it falls through and
the process is spawned with no monitor and no error. OTP raises badarg.
A typo therefore silently disables monitoring.

Suggested fix:

-    if (monitor_term == TRUE_ATOM) {
-        monitor_term = term_nil();
-    }
-    if (term_is_list(monitor_term)) {
-        is_spawn_monitor = true;
-
+    if (!term_is_invalid_term(monitor_term)) {
+        if (monitor_term == TRUE_ATOM) {
+            monitor_term = term_nil();
+        } else if (UNLIKELY(!term_is_list(monitor_term))) {
+            monitor_destroy(new_link);
+            monitor_destroy(self_link);
+            context_destroy(new_ctx);
+            RAISE_ERROR(BADARG_ATOM);
+        }
+        is_spawn_monitor = true;
+
         if (UNLIKELY(term_is_invalid_term(parse_monitor_opts(ctx, monitor_term, &is_alias, &alias_type)))) {

(new_link/self_link are NULL unless link was requested; monitor_destroy(NULL)
is a no-op, so the cleanup is safe.) A test like
{'EXIT', {badarg, _}} = (catch spawn_opt(fun() -> ok end, [{monitor, foo}]))
should accompany it.

2. Major — alias message can be delivered after an earlier `DOWN` that should have deactivated the alias

src/libAtomVM/mailbox.c#L399-L433,
interacting with PROCESS_SIGNAL_MESSAGES in
src/libAtomVM/opcodesswitch.h#L791
and src/libAtomVM/jit.c#L901.

process_outer_list resolves all AliasMessageSignals (converting/
dropping them, and running reply_demonitor side effects) while splitting the
outer list, but other signals (MonitorDownSignal, DemonitorSignal, …) are
only collected and returned for the signal loop to apply afterwards.

So within one drain the alias state is read before older monitor-down
signals are applied. For {alias, demonitor} / {alias, reply_demonitor} the
DOWN deactivates the alias
(context.c#L443-L470).
Trigger sequence in one batch, in received order:

monitored process dies → MonitorDownSignal queued;
a third party sends to the (now-doomed) alias → AliasMessageSignal queued.

OTP applies the DOWN first, deactivates the alias, then drops the alias
message. AtomVM converts and delivers the alias message because the DOWN has
not been processed yet. The PR's own comments stress that alias sends keep
order against plain sends; that holds, but the ordering guarantee does not
extend across the alias-vs-other-signal boundary.

This is a narrow interleaving (requires a DOWN and an alias send in the same
drain, for a non-explicit_unalias alias), so it is unlikely in practice but
is a genuine semantic gap. The clean fix is to handle AliasMessageSignal in
the same received-order stream as the other signals (in
PROCESS_SIGNAL_MESSAGES) rather than resolving aliases up-front in
process_outer_list. That is a non-trivial refactor; at minimum, document the
limitation and add a regression test for the interleaving.

3. Minor — ETF process-reference parse accepts owner pid `0`

src/libAtomVM/external_term.c#L1011-L1022

uint32_t process_id = data[2];
// Reject a malformed process id (the value comes from untrusted input).
if (process_id > TERM_MAX_LOCAL_PROCESS_ID) {
    return term_invalid_term();
}
return term_make_process_reference(process_id, ticks, heap);

0 is INVALID_PROCESS_ID (term.h), the sentinel used everywhere to mean
"short ref / no owner". A deserialized process reference owned by pid 0 can
never name a real alias owner and blurs the short-ref vs process-ref
distinction. Reject it:

-                    if (process_id > TERM_MAX_LOCAL_PROCESS_ID) {
+                    if (process_id == INVALID_PROCESS_ID || process_id > TERM_MAX_LOCAL_PROCESS_ID) {
                         return term_invalid_term();
                     }

4. Minor — `term_compare` word order for process refs differs from ETF / external refs

src/libAtomVM/term.c#L707-L716
and src/libAtomVM/term.c#L793-L812

term_compare lays out a local process ref as
[process_id, ticks_hi, ticks_lo]:

} else if (len == 3) {
    data[0] = term_process_ref_to_process_id(t);
    int64_t t_ticks = term_to_ref_ticks(t);
    data[1] = t_ticks >> 32;
    data[2] = (uint32_t) t_ticks;
    ...

but serialization stores the words as [ticks_hi, ticks_lo, process_id]
(external_term.c#L547-L556),
and that same wire order is what term_get_external_reference_words() returns.

Local-vs-local comparison is internally self-consistent (the test_refs_ordering
expectations encode that order), so this is not a sorting bug for ordinary
local refs. The inconsistency only shows up when a local process ref is
compared against an external 3-word ref in the mixed branch around
term.c:793 — local_data is pid-first while the external words are
ticks-first, giving an order that disagrees with the local-only path. Given
AtomVM's limited distribution support this is low-impact, but aligning the two
layouts (use [ticks_hi, ticks_lo, process_id] in term_compare too) removes
a latent inconsistency and matches the existing len==2 / len==4 conventions
that key on ticks first.

5. Minor — alias→normal conversion drops the message on OOM (`reply_demonitor` becomes non-transactional)

src/libAtomVM/mailbox.c#L410-L433

term message = context_process_alias_message_signal(ctx, CONTAINER_OF(current, struct TermSignal, base));
if (!term_is_invalid_term(message)) {
    MailboxMessage *converted = mailbox_message_create_from_term(NormalMessage, message);
    // Best effort: on out of memory the message is dropped. ...
    if (converted != NULL) { ... }
}
mailbox_message_dispose(current, &ctx->heap);

The code already acknowledges this: for reply_demonitor the alias has been
deactivated and a DemonitorSignal already sent inside
context_process_alias_message_signal, so an allocation failure here loses the
message that those side effects were paying for. The new
mailbox_message_create_from_term allocation is avoidable: the incoming
struct TermSignal already owns a heap-copied term tree, and Message and
struct TermSignal share the { MailboxMessage base; HeapFragment; term }
prefix layout, so the signal can be re-typed as a NormalMessage in place
instead of copying. That both removes the OOM drop and saves an allocation/
free per alias send. (Larger change; flagging the trade-off rather than
prescribing a diff.)

6. Nit — `context_add_monitor` frees an alias monitor through the embedded pointer

src/libAtomVM/context.c#L1042-L1051

case CONTEXT_MONITOR_ALIAS: {
    ...
    if (UNLIKELY(... )) {
        free(new_monitor);   // <- frees via struct Monitor*
        return false;
    }

Every other branch frees the container (free(new_local_monitor), …). This
works only because struct Monitor is the first member of struct MonitorAlias.
Use the dedicated helper for consistency and robustness against future layout
changes:

-                        free(new_monitor);
+                        monitor_destroy(new_monitor);
                         return false;

monitor_destroy recovers the container with CONTAINER_OF, matching the
intent the PR introduced elsewhere.

7. Pre-existing — `monitor/3` installs the caller-side monitor after unlocking the target (DOWN race)

src/libAtomVM/nifs.c#L5171-L5176

mailbox_send_monitor_signal(target, MonitorSignal, other_monitor);
globalcontext_get_process_unlock(ctx->global, target);

context_add_monitor(ctx, self_monitor);
if (is_alias) {
    context_add_monitor(ctx, alias_monitor);
}

Under SMP the target can receive the MonitorSignal, exit, and post its
MonitorDownSignal before the caller runs context_add_monitor(ctx, self_monitor).
The caller then processes a DOWN for which it has no monitor and drops it,
returning a ref that never yields a DOWN. This exact shape exists at
HEAD~27 (caller-side context_add_monitor was already after the unlock), so
it is not introduced by this PR — but the new alias path inherits it, and
fixing it would mean keeping the target lock held until the caller-side
monitor/alias entries are installed. Out of scope for a strict review of this
PR, but worth a follow-up.

8. Pre-existing — `mailbox_send_monitor_signal` OOM is silently ignored

src/libAtomVM/mailbox.c, called at
src/libAtomVM/nifs.c#L5171

mailbox_send_monitor_signal returns void; on malloc failure it logs and
returns, leaking other_monitor and leaving the caller believing the monitor
was installed. Pre-existing (unchanged by this PR), but the alias monitor/3
path now relies on it too. A follow-up could make it return bool and let the
caller clean up + raise OUT_OF_MEMORY.

Positive notes

Lock leak fixed: monitor/3 now calls globalcontext_get_process_unlock
before RAISE_ERROR(BADARG_ATOM) on the process/port kind mismatch
(nifs.c#L5122-L5124);
the baseline leaked the target lock there.
spawn_opt atomicity: all fallible steps (option parsing, monitor/alias
allocation, result-space reservation) now run before any monitor/link is
published, with monitor_destroy on every error path — this correctly
prevents the spurious {'EXIT', Pid, normal} the test
test_spawn_opt_link_monitor_badarg_is_atomic guards against.
Reference shapes: TERM_BOXED_REFERENCE_SHORT/PROCESS/RESOURCE_SIZE are
size-distinguishable on both 32- and 64-bit, the _Static_asserts enforce it,
and the 32-bit resource padding word is initialized in term_from_resource
so GC never copies uninitialized memory.
Alias-via-signal design: validating the alias in the owner's own context
(never from the sender) is the right race-free model; context_destroy uses
the plain mailbox_process_outer_list so it never takes the process-table
lock from the alias path — keep that invariant.
scheduler.c queue reset: resetting processes_list_head after
list_remove so context_destroy's dequeue is a no-op is a clean fix for the
never-scheduled-process case.

Suggested follow-up tests

spawn_opt(F, [{monitor, foo}]) raises badarg (finding 1).
DOWN queued before an alias send in one drain → alias message dropped for
{alias, demonitor} / {alias, reply_demonitor} (finding 2).
Round-trip a process ref through term_to_binary/binary_to_term and compare
ordering against the equivalent local ref (finding 4).
Malformed ETF process ref with owner pid 0 is rejected (finding 3).

Implement Erlang process aliases together with the reference-representation rework they require, rebased onto release-0.7 and squashed into one commit. Public API (libs/estdlib/src/erlang.erl): - alias/0, unalias/1 - monitor/3 with {alias, explicit_unalias | demonitor | reply_demonitor} - spawn_opt {monitor, [monitor_option()]} Runtime: - New process-reference term carrying ref_ticks + owner process id (RefData), alongside short/local references; BEAM-compliant reference ordering and binary_to_term/term_to_binary round-tripping (term.{c,h}, external_term.c). - Monitors store RefData; alias monitors (CONTEXT_MONITOR_ALIAS) with the three alias modes and their lifecycle (context.{c,h}). - send/2, the send opcode and the JIT send path resolve an alias to its target process and drop reply_demonitor aliases after first use (nifs.c, opcodesswitch.h, jit.c). - Reference size macros: REF_SIZE -> TERM_BOXED_REFERENCE_SHORT_SIZE, plus TERM_BOXED_REFERENCE_PROCESS_SIZE and TERM_BOXED_REFERENCE_RESOURCE_SIZE. Tests: alias coverage added to test_monitor, test_refs_ordering and test_binary_to_term.

Sending to a process reference resolved and deactivated the alias from the sending process: it walked and mutated the target's monitor list (context_find_alias / context_unalias) while holding only the processes_table read lock. That lock only guards against the target being freed, not against the target mutating its own monitor list (via alias/0, unalias/1, demonitor or process teardown), so a send racing with a monitor change could corrupt the list or free a node used by another scheduler -- a data race / use-after-free on SMP. Route alias sends through the mailbox instead, the same way monitor and demonitor operations are already delivered across processes. The send paths now post an AliasMessageSignal carrying {Ref, Message} via the shared helper globalcontext_send_message_to_alias; the owner validates the alias against its own monitor list when it drains signals (context_process_alias_message_signal) and either delivers the message or drops it for an inactive alias. All monitor-list access now happens in the owning process, and the three send paths (erlang:send/2, the send opcode and the JIT send) share one helper. Signed-off-by: Davide Bettio <davide@uninstall.it>

A monitor created with {alias, reply_demonitor} previously only deactivated the alias on the first reply through it; the underlying monitor stayed active, so the owner could still receive a stale 'DOWN'. Match OTP: on the first message received through the alias, also remove the monitor and signal the monitored process to drop its side, as erlang:demonitor/1 would. Add a regression test asserting that, after a reply through a reply_demonitor alias, no 'DOWN' is delivered when the monitored process later exits. Signed-off-by: Davide Bettio <davide@uninstall.it>

Aliases require OTP 24, so the alias tests guarded their alias-specific cases behind a runtime OTP-version check (plus a make_ref-based fallback for older BEAM). AtomVM no longer supports OTP < 26, where aliases are always available, so those checks are dead code: run the alias cases unconditionally and drop the is_atomvm_or_otp_version_at_least/1 helper and the mock-alias fallback. Signed-off-by: Davide Bettio <davide@uninstall.it>

For {alias, demonitor} and {alias, reply_demonitor}, the alias is deactivated when the monitor is removed, including the automatic removal when a 'DOWN' is delivered (as erlang:demonitor/1 already does for an explicit demonitor). context_process_monitor_down_signal removed the monitor but left the alias active, so a third party could still reach the owner through the alias after the monitored process had exited. Deactivate the alias (unless it is explicit_unalias) in both the local and registered-name 'DOWN' paths, and add a regression test. Signed-off-by: Davide Bettio <davide@uninstall.it>

nif_erlang_monitor locks the target process, then raises badarg when the object type does not match the target (monitor(process, Port, ...) or vice versa). That error path returned without releasing the processes-table lock, unlike the three sibling out-of-memory error paths in the same function, leaking the lock and deadlocking the next writer on SMP builds. Unlock the target before raising. (A behavioural test would deadlock rather than fail; verified by inspection against the sibling error paths.) Signed-off-by: Davide Bettio <davide@uninstall.it>

Both nif_erlang_monitor and do_spawn published the monitor/alias state (queuing a MonitorSignal to the target and adding monitors to the contexts) and only then reserved heap for the returned reference. An out-of-memory at that last step raised, leaving the monitor half-installed: the target could later send a 'DOWN' for a reference the caller never received. Reserve the result space before publishing, freeing the monitor structs on failure. GC at the new point is safe -- the monitor structs hold only an immediate pid and a plain RefData, none reachable from the heap. Unwinding the separate spawn_opt link side effect on a later OOM is left out of scope. Signed-off-by: Davide Bettio <davide@uninstall.it>

An alias send is delivered via an AliasMessageSignal so the owner validates the alias in its own context (race-free). But mailbox_process_outer_list moved normal messages straight to the inner mailbox while signals were processed afterwards and re-posted, so `Alias ! m1, Pid ! m2` to the same receiver could deliver m2 before m1 -- violating the single-sender message ordering guarantee. Convert alias messages to normal messages in place during the outer-list reverse (new mailbox_process_outer_list_with_aliases), prepending them into the normal stream so they keep send order relative to plain messages from the same sender. context_process_alias_message_signal now returns the payload instead of re-posting it. Teardown and the port scheduler keep the plain mailbox_process_outer_list (which returns alias signals to drop): they must not run the reply_demonitor path, which takes the processes-table lock that context_destroy already holds. Add a regression test (alias send then pid send keep order). Signed-off-by: Davide Bettio <davide@uninstall.it>

term_compare sizes a process reference as 3 words (process_id + ref_ticks) in the local-vs-local branch, but the mixed local-vs-external branch still sized it as 2 words, dropping process_id. A process reference was therefore ordered differently depending on whether it was compared against a local or an external reference, which can make the term order inconsistent. (The reviewer's "two process refs with equal ref_ticks" case cannot actually occur: ref_ticks come from a global monotonic counter, so they are unique.) Size process references as 3 words in the mixed branch too and add the matching extraction, mirroring the local-vs-local logic. Exercising this path requires an external reference (distribution), so it is verified by inspection. Signed-off-by: Davide Bettio <davide@uninstall.it>

binary_to_term of a NEWER_REFERENCE_EXT with len == 3 turned the third word straight into a process reference id without bounding it. Note this was not a memory-safety issue as the reviewer suggested: the id flows into globalcontext_get_process_lock, which performs a linear search of the process table (no out-of-bounds indexing), so an out-of-range value simply matches no process. Still, validate it like a local pid (TERM_MAX_LOCAL_PROCESS_ID) to reject clearly malformed input early. Signed-off-by: Davide Bettio <davide@uninstall.it>

monitor(process, X, [{alias, ...}]) returned a plain short reference (no alias) when X was self() or an already-dead process, so the returned reference could not be used to send to the process or be passed to unalias/1. OTP returns a usable alias in both cases. Install the alias and return a process reference on both early-return paths. For self the alias stays active (self never sends a DOWN). On the noproc path the monitor is immediately removed by the noproc DOWN, so only an explicit_unalias alias stays active (demonitor / reply_demonitor are deactivated right away, as at a real DOWN). Add regression tests for both. Signed-off-by: Davide Bettio <davide@uninstall.it>

A send to a reference that is not an active local process-reference alias (a plain/short ref, a resource ref, or an external ref) is silently dropped, which matches OTP for local references. Distributed aliases (external references) are not supported, so a message addressed to a remote alias is lost rather than routed over distribution. Add comments at the three send sites (erlang:send/2, the send opcode and the JIT send); no behaviour change. Signed-off-by: Davide Bettio <davide@uninstall.it>

On 32-bit, TERM_BOXED_REFERENCE_RESOURCE_SIZE is 5 (one more than header + refc + mso cons) so reference shapes stay size-distinguishable, but term_from_resource only wrote 4 words, leaving the trailing padding word uninitialized. It is not read by comparison or serialization, but it is copied during GC, which is an uninitialized-memory read (sanitizer / valgrind). Initialize it to nil. No effect on 64-bit, where the resource size is exactly 4 words. Signed-off-by: Davide Bettio <davide@uninstall.it>

…onitor The CONTEXT_MONITOR_RESOURCE case fell through to the following monitor cases when the ref_ticks did not match. It is benign today (the next cases only break), but the alias commit added CONTEXT_MONITOR_ALIAS to the same switch, so tighten the fall-through. No behaviour change. Signed-off-by: Davide Bettio <davide@uninstall.it>

globalcontext_init_process appends every new context to the scheduler waiting queue, but context_destroy never removed it. Destroying a process that was never scheduled -- as the spawn NIFs do on option errors, or port drivers on initialization failures -- left a dangling queue entry pointing into freed memory, corrupting the queue on the next scheduler operation. Pre-existing upstream; exposed by the spawn_opt atomicity test added in the next commit, the first to exercise such a path. Dequeue in context_destroy under the processes spinlock, and reset the queue item at the scheduler's own dequeue sites so the destroy-time dequeue is a no-op there. A concurrent re-queue cannot happen: scheduler_make_ready refuses Killed and Spawning processes under the same spinlock, and one of these flags is set on every destroy path. Signed-off-by: Davide Bettio <davide@uninstall.it>

do_spawn published the link before the monitor phase, which can still fail parsing the monitor options or on out-of-memory, including the result-space reservation. The error paths destroyed the new context but left the caller's link entry installed, so the destroyed context's teardown sent the caller a spurious {'EXIT', Pid, normal} -- delivered as a message when trapping exits -- and process_info(links) reported a link to a dead pid, for a spawn_opt call that raised. OTP's spawn_opt is atomic: either it returns and the link exists, or it raises without side effects. Run every fallible step -- option parsing, link and monitor allocations, and the result space reservation -- before publishing anything; publishing itself cannot fail. This closes the unwind residual left out of scope by the result-space commit. Add a regression test checking that spawn_opt(F, [link, {monitor, [Bad]}]) raises badarg without leaving a link behind or delivering an 'EXIT' message. Signed-off-by: Davide Bettio <davide@uninstall.it>

The BEAM conformance CI run (test-erlang -b) showed that on OTP, monitor(process, self(), [{alias, _}]) is a no-op apart from returning a fresh reference: no monitor is installed and no alias either -- sends to the returned reference are dropped, and unalias/1 and demonitor(Ref, [info]) return false (verified on OTP 29). This reverts the self half of "Return a real alias from monitor/3 on the self and noproc paths", whose premise was wrong for self; the noproc half is conformant (its test passes on the BEAM). Return a plain short reference and install nothing, matching the existing monitor/2 self path, and update the test to assert the OTP behaviour. Signed-off-by: Davide Bettio <davide@uninstall.it>

On the BEAM the test process is the erl_eval process, which is linked to init, so the {links, []} assertion failed when running the tests with test-erlang -b. Snapshot the links before the spawn_opt call and assert they are unchanged after; this still catches a leaked link on AtomVM and is independent of the harness process's pre-existing links. Signed-off-by: Davide Bettio <davide@uninstall.it>

The monitor_*_new constructors return a pointer to the struct Monitor embedded in their container struct, and the nifs.c error paths free()d that pointer directly. That is only correct because the monitor happens to be the first member of every container struct; recover the container with CONTAINER_OF instead, through a new monitor_destroy that switches on the monitor type, so a layout change cannot silently corrupt the heap. No functional change. Signed-off-by: Davide Bettio <davide@uninstall.it>

The RefData arguments of the monitor constructors and term_from_ref_data are only read, so take them as const pointers per the coding style. Document monitor_alias_new, term_from_ref_data and term_process_ref_to_process_id like their neighbours, and use the until-now unused REFERENCE_PROCESS_PID_OFFSET for the process id word of a process reference, which also drops a TERM_BYTES conditional. No functional change. Signed-off-by: Davide Bettio <davide@uninstall.it>

erlang:send/2 accepts process alias references since the aliases were introduced, but send_destination() only allowed pids, ports and atoms. Add reference() like OTP does, document the alias destination (and the silent drop of non-alias references) on send/2, and mention the {monitor, MonitorOpts} option and the {Pid, MonitorRef} result in the spawn_opt docs. Signed-off-by: Davide Bettio <davide@uninstall.it>

Signed-off-by: Davide Bettio <davide@uninstall.it>

Nothing ever referenced it: reference flavors are distinguished by their boxed size (see the comment above TERM_BOXED_REFERENCE_SHORT_SIZE), so the enum was dead code. It also didn't follow the coding style, which wants enumerations typedef'd with a lower_snake_case _t name. Signed-off-by: Davide Bettio <davide@uninstall.it>

The alias feature commit accidentally dropped the space in "// TODO" while sweeping REF_SIZE through network_driver.c. Restore the original comment; the line is unrelated to process aliases and clang-format flags the new form. Signed-off-by: Davide Bettio <davide@uninstall.it>

unalias/1 validated its argument with term_is_local_reference, raising badarg for a reference from another node. OTP 29 returns false instead (an external reference can never be an alias of the calling process); only non-references raise badarg. Sends to external references are already silently dropped, like OTP when the node is not alive; the new test pins both behaviours, BEAM-validated. Signed-off-by: Davide Bettio <davide@uninstall.it>

The console driver rebuilt the io_reply reference from the request reference ticks. That erases the flavor of a process reference: an alias passed as ReplyAs came back as a short reference with the same ticks, which the requester's receive does not match, so an io request through an alias hung. It also crashed on a non-reference ReplyAs (the io protocol allows any term) by reading ref ticks from an arbitrary term. Put the incoming ReplyAs term in the reply instead: it lives in the request message storage until the handler returns and port_send_message copies it. This also drops a reference allocation from every console reply. Signed-off-by: Davide Bettio <davide@uninstall.it>

The select machinery (socket nowait handles, enif_select refs used by the posix nifs) stores only the reference ticks and rebuilds the handle as a plain reference in the select notification. An alias passed as a handle would therefore come back as a reference the alias does not match, and the caller would hang waiting for the notification. On the BEAM an alias is a valid select handle and is echoed back verbatim. Supporting that would mean carrying the owner process id through the select machinery for a corner case with no practical use, so raise badarg instead and keep the code simple; the divergence is called out in comments at both validation sites. Signed-off-by: Davide Bettio <davide@uninstall.it>

The alias tests used a single 500 ms recv_one window both for positive receives (the file convention elsewhere is 5000 ms) and for asserting that a message does NOT arrive. Under valgrind or a loaded CI runner the positive uses are the most likely false failure, and the two negative uses were detection windows a reintroduced bug could outlast. Raise recv_one to 5000 ms and assert the negatives behind deterministic fences instead: a later monitor whose DOWN fires after the buggy one (monitors fire in installation order), or a fence reply through the same echo process (sends from one process keep their order). Bound the one unbounded receive, and document that the same-batch DOWN test exercises the same-batch path reliably only on SMP builds. Signed-off-by: Davide Bettio <davide@uninstall.it>

Quit the echo workers the alias tests spawn (the file convention; harmless today only because each module runs in a fresh VM), rewrap a comment that exceeded the 100-column print width, and restore the blank line between two test functions in test_binary_to_term. Signed-off-by: Davide Bettio <davide@uninstall.it>

The helper shadowed the auto-imported spawn_monitor/2 BIF and made erlc emit seven "ambiguous call of overridden auto-imported BIF" warnings on every build of the test sources. Resolution picked the local function, so behavior was correct, but the warnings were noise this branch introduced. Signed-off-by: Davide Bettio <davide@uninstall.it>

The demonitor mode also deactivates on DOWN delivery (not only on demonitor/1), and reply_demonitor removes the monitor as well as the alias when the first message sent via the alias is delivered (verified against OTP 29). Also note that the OTP {tag, Term} option raises unsupported. Signed-off-by: Davide Bettio <davide@uninstall.it>

Two conformance cells the alias tests did not pin down yet: an alias used as a map and ets key (ordinary reference term handling, value comparison), and demonitor(Mon, [flush]) on an {alias, demonitor} monitor, where the alias died with the monitor at DOWN delivery and flush removes the queued DOWN. Both BEAM-validated. Signed-off-by: Davide Bettio <davide@uninstall.it>

context_find_alias walks the whole monitor list and runs for every DOWN delivery (twice: once in the outer-list split for same-batch ordering, once in the down handler) and for every demonitor -- also for the vast majority of processes that never create an alias. In a DOWN-heavy benchmark (50 batches of 1000 monitored process exits) this cost 25% over the pre-alias baseline. Track the number of active aliases in a 16-bit slice of the existing Context bitfield word (no struct growth, owner-written only like the surrounding bits) and short-circuit context_find_alias when it is zero. The benchmark returns to the baseline noise envelope; processes that do use aliases keep the previous behavior. Signed-off-by: Davide Bettio <davide@uninstall.it>

Pin two more behaviors verified against OTP 29: when {alias, _} appears twice in the monitor options the last occurrence wins, and monitoring the caller through its own registered name installs nothing, exactly like monitoring self() directly. Signed-off-by: Davide Bettio <davide@uninstall.it>

alias/0 becomes alias([]) with explicit_unalias as the default mode. The reply option needs no new machinery: it maps onto the reply_demonitor alias type, whose conversion path deactivates the alias when the first message sent via it is delivered -- with no monitor to remove that is exactly the OTP reply behavior (verified against OTP 29, including that a second same-batch message is dropped). Bad options and a non-list argument raise badarg before any side effect. Signed-off-by: Davide Bettio <davide@uninstall.it>

The JIT send sets x0 to the message in every delivering branch, but the fall-through that silently drops a send to a non-alias reference left x0 holding the recipient, so Ref ! Msg returned the reference instead of Msg on the JIT flavor (the interpreter assigns x0 after the dispatch and was correct). Caught by the new non-local-refs send test, which only ran on the interpreter flavor until now. Signed-off-by: Davide Bettio <davide@uninstall.it>

Reference kinds are distinguished by their boxed size, and on 32-bit the natural process reference size (header + 2 ticks words + pid = 4) collides with the resource reference. Resolving the collision by growing the resource reference put the extra word on the wrong side: resource references are used all over the embedded targets (sockets, files, NIF resources), while process references only exist when aliases are used. Move the padding word to the process reference, so 32-bit resource references return to 4 words and only alias users pay the extra word. The padding word is initialized to nil so GC copies a defined value. The size static asserts now check pairwise distinctness, since the sizes are no longer monotonically ordered (32-bit: short 3, resource 4, process 5; 64-bit: short 2, process 3, resource 4). The wire format is unchanged (the padding word is not serialized), and all allocation sites size through TERM_BOXED_REFERENCE_PROCESS_SIZE. Signed-off-by: Davide Bettio <davide@uninstall.it>

…of RefData Storing a RefData (ref ticks + owner pid) in every local monitor grew each MonitorLocalMonitor, MonitorLocalRegisteredNameMonitor and MonitorAlias by 8 bytes (uint64_t alignment), paid by every plain monitor/2 -- i.e. every gen_server:call -- on the embedded targets. The pid was redundant everywhere: on the monitored side it duplicates monitor_obj (the monitoring process, which is exactly the alias owner), on the monitoring, registered-name and alias entries it was never read. Keep a plain uint64_t ref_ticks in the monitor structs (their baseline layout) and carry the only new bit of information -- "the 'DOWN' reference must be rebuilt alias-shaped" -- as a monitor type, CONTEXT_MONITOR_MONITORED_LOCAL_ALIAS. The 'DOWN' path derives the process reference pid from monitor_obj. monitor_new now takes the monitor type directly, replacing the is_monitoring bool as well. RefData stays as the NIF-local value type used to build result references (term_from_ref_data) in monitor/3, spawn_opt and alias/1. Behavior is unchanged and pinned by the existing tests (test_monitor_down_alias asserts the 'DOWN' carries the very alias reference; process_info(monitored_by) keeps listing alias-shaped monitors). Signed-off-by: Davide Bettio <davide@uninstall.it>

The 16-bit active_alias_count was incremented and decremented unchecked: a process accumulating 65536 simultaneously active aliases would wrap the counter to 0, after which context_find_alias short-circuits and every alias of that process silently stops matching -- messages dropped, unalias/1 returning false, reply_demonitor monitors no longer removed. Silent message loss is the worst failure mode for aliases. Make the counter saturate stickily: at 0xFFFF a new bit, alias_count_saturated, is set and the count is pinned (decrements are skipped). Since the pinned count is never 0, the context_find_alias fast path needs no change -- a saturated process simply always walks its monitor list, degrading the optimization instead of correctness. The bit packs into the existing bitfield word, so sizeof(Context) is unchanged. Not covered by a test: creating 65536 live aliases is infeasible in CI (context_add_monitor's duplicate scan makes it quadratic); the three touched sites (add, unalias, terminate) were reviewed for symmetry instead. Signed-off-by: Davide Bettio <davide@uninstall.it>

Aliases are serialized as 3-word NEWER_REFERENCE_EXT, so a BEAM peer can legitimately hold an AtomVM alias and send to it, emitting DOP_ALIAS_SEND (33) or DOP_ALIAS_SEND_TT (34). Those operations hit the default arm of nif_erlang_dist_ctrl_put_data, which raises badarg in the distribution connection process: a single remote alias send could take the whole connection down instead of, at worst, losing one message. Handle both operations by routing through the existing alias delivery path: decoding our own alias yields a process reference (node and creation match), so the owner pid is in the reference itself and the message is posted with globalcontext_send_message_to_alias, where the owner validates the alias as for a local send. A reference that is not a local process reference (e.g. minted by a previous incarnation of this node) drops the message, exactly like a local send to a non-alias reference. Unlike a pid, the reference is a boxed term, so the control message is kept rooted across the payload decode. The trace token of DOP_ALIAS_SEND_TT is ignored (sequential tracing is not supported). The outbound direction is unchanged: sending to a remote alias from AtomVM is not routed over distribution. test_net_kernel gains test_alias_send_from_beam, which has a real OTP node send through an AtomVM alias (delivery and reference round-trip), then send again after unalias/1 (message dropped, connection survives). The test also passes unchanged on BEAM, pinning the OTP semantics it asserts. Signed-off-by: Davide Bettio <davide@uninstall.it>

Behavior differences go to the differences-with-beam page: sending to a remote alias is not routed over distribution (the inbound direction works), and aliases are rejected as select handles where BEAM accepts them. The unsupported erlang:monitor/3 {tag, Term} option is already stated in the function documentation, and the builtin port-driver request/reply protocols are AtomVM-specific, so neither belongs on the differences page. The alias/1 documentation now states that the OTP 28 priority option is not supported and raises badarg, and the CHANGELOG records the REF_SIZE C macro deprecation with its replacements. Signed-off-by: Davide Bettio <davide@uninstall.it>

The received-order split added for process aliases reverses the LIFO outer list and then walks it a second time, so that alias side effects run in received order (deactivating the alias, and resolving several sends to the same alias in one batch oldest-first). That second traversal only matters when the owner actually holds an active alias, yet it ran for every signal drain of every process -- including the alias-free processes that are the common case on the embedded targets, and the ctx == NULL scheduler and termination callers that can never own an alias. Gate it on active_alias_count: when it is zero (or there is no owner context), split the outer list into the normal and signal sublists in a single prepend pass, exactly as before aliases existed. active_alias_count is written only by the owner and this runs in the owner's own context, so it is stable for the whole batch and the choice is race-free. An AliasMessageSignal reached on this path necessarily targets an inactive alias: drop it when we own a context (it must never reach the signal loop, which treats it as unreachable), or leave it for the ctx == NULL caller to ignore. The full received-order two-pass is kept unchanged for a process that holds an active alias. No behavior change: the fast path produces the same received-order sublists the two-pass produced for an alias-free batch. Signed-off-by: Davide Bettio <davide@uninstall.it>

The alias support reduced the console handler's heap reservation from a conservative 12 terms to TUPLE_SIZE(3), matching the io_reply 3-tuple once the verbatim ReplyAs echo removed the separate ref allocation. But the gen_call paths build their error reply as an error 2-tuple nested inside the reply 2-tuple (port_create_error_tuple + port_send_reply), i.e. TUPLE_SIZE(2) + TUPLE_SIZE(2) = 6 terms, allocated on ctx's heap by helpers that do not run their own memory_ensure_free. Reserving only 4 terms under-allocates by 2: the writes are masked by heap slack today, but violate the ensure_free contract and corrupt the heap whenever it is tight (e.g. after a MEMORY_CAN_SHRINK shrink to the requested size). Reserve 2 * TUPLE_SIZE(2), the largest reply the handler builds (more than the io_reply 3-tuple and the close 2-tuple). Flagged by CodeQL's allocation-exceeding-ensure_free query. Signed-off-by: Davide Bettio <davide@uninstall.it>

pguyot · 2026-06-12T16:29:18Z


+term context_process_alias_message_signal(Context *ctx, struct TermSignal *signal)
+{
+    // signal->signal_term is a 2-tuple {Ref, Message}. The alias lookup runs here, in the owner's own


Do we want comments that verbose? The drawback is that eventually they end up being unsync with code, and LLM or humans tend to read comments instead of code.

pguyot · 2026-06-12T16:30:30Z

                // Target cannot be NULL as we processed Demonitor signals
                assert(target != NULL);
-                int required_terms = REF_SIZE + TUPLE_SIZE(5);
+                int required_terms = TERM_BOXED_REFERENCE_PROCESS_SIZE + TUPLE_SIZE(5);


This size could be conditional of monitor->monitor_type == CONTEXT_MONITOR_MONITORED_LOCAL_ALIAS, couldn't it?

pguyot · 2026-06-12T16:35:20Z

+    // lookups are no longer skipped for this process, trading the optimization for
+    // correctness instead of wrapping to 0 and silently deactivating every alias.
+    unsigned int active_alias_count : 16;
+    unsigned int alias_count_saturated : 1;


Not sure this optimization is worth the 17 bits it costs. We could have only one bit and check if there is any alias left when removing an alias. Alternatively, we could reduce it to 8 bits and drop the optimization when the number of aliases reaches (forever) 255.

pguyot · 2026-06-12T16:39:54Z

+{
+    Context *p = globalcontext_get_process_lock(glb, process_id);
+    if (p) {
+        // Post the alias message as a signal carrying {Ref, Message}. The owner validates the alias


I don't think this comment adds any value

pguyot · 2026-06-12T16:40:42Z

+ *
+ * @details Posts an AliasMessageSignal carrying {Ref, Message} to the owner process. The owner
+ * validates the alias against its own monitors when it drains signals and either delivers Message
+ * as a normal message or drops it. This avoids touching the owner's monitor list from the sender.


Last sentence sounds obvious here.

pguyot · 2026-06-12T16:41:29Z

+        set_error(ctx, jit_state, 0, BADARG_ATOM);
+        return false;
+    } else {
+        // A reference that is not a local process reference (short/resource/external ref): the


A TODO would be welcome for distributed aliases.
Also I would put the term_is_reference first.

pguyot · 2026-06-12T16:43:53Z

        }
-
        int local_process_id;
        if (term_is_local_pid_or_port(recipient_term)) {


Does this test still make sense?

The alias work renamed REF_SIZE to TERM_BOXED_REFERENCE_SHORT_SIZE and left REF_SIZE as a deprecation shim that emits a GCC diagnostic on every use. The libAtomVM tree was fully migrated, but the three USB CDC drivers were not: ten uses remained, so every esp32, stm32 and rp2 build gained deprecation warnings, turning into hard errors when AVM_WARNINGS_ARE_ERRORS is enabled (the drivers inherit libAtomVM's compile options). The generic_unix builds never compile these drivers, which is how the leftovers survived every native test pass. All ten uses size reply allocations whose reference is built with term_from_ref_ticks, i.e. a plain short reference, so TERM_BOXED_REFERENCE_SHORT_SIZE is the exact replacement. The shim expands to that same macro, so the generated code is unchanged and the deprecation is left with zero in-tree users. Flagged by the pre-merge review, which confirmed with a standalone probe that the pragma fires in exactly these expression contexts and becomes a hard error under -Werror. Signed-off-by: Davide Bettio <davide@uninstall.it>

The differences-with-beam page documents core divergences from the BEAM, and the process alias section was not one: its remaining substance is a distribution limitation. Move it to a new Known Issues & Limitations section of the distributed Erlang page, which now carries the only user-facing note: an outbound send to a remote alias is dropped, while inbound delivery from a BEAM peer works. The page-closing invitation to file issues or pull requests moves below the new section, so it keeps closing the page and now reads on both the partial feature list and the known issues. Dropped in the move: the select-handle rejection (an exotic corner that is not worth user-facing documentation -- it remains commented at both validation sites and covered by tests) and the OTP >= 26 semantics claim. The differences-with-beam page is back to its release-0.7 content. Signed-off-by: Davide Bettio <davide@uninstall.it>

Five tests closing the review's coverage gaps (AR-10): - test_alias_send_after_owner_died: a send through the alias of a dead process is dropped, still returns the message, and surfaces nowhere; the spawn churn probes would report a misdelivery. AtomVM assigns process ids monotonically so the dead owner's id is not reused, and globally-unique ref ticks are the invariant that keeps a stale alias unmatchable even by an id reuser. - test_alias_multi_sender_unalias: four senders hammer one alias while the owner unaliases mid-stream. All synchronization is per-sender send order and explicit acks -- no sleeps and no timing windows -- so the test cannot flake on slow or loaded hosts. Phase 1 pins the exact delivered count behind per-sender fences while the alias is active; phase 2 races unalias/1 against the senders and pins no crash, a bounded count and an observably dead alias afterwards. - test_alias_duplicate_options: alias/1 duplicate options are last-wins in both orders, like OTP 29 (the monitor/3 duplicate was already covered). - test_unalias_non_reference_badarg: unalias(42) raises badarg, like OTP. - test_binary_to_term_invalid_process_ref: a wire-format alias with its owner pid word patched to 0 or above the 28-bit maximum must fail to decode -- the untrusted-input validation this branch added to the decoder, previously untested. The BEAM does not read reference words as a pid and decodes plain references there; the test asserts that instead. All five pass on both flavors and on the BEAM (OTP 29) via -b; the module was looped 30x on each flavor (0 failures) and runs under valgrind with 0 errors. Signed-off-by: Davide Bettio <davide@uninstall.it>

Process ids are int32_t throughout (Context.process_id, RefData.process_id, and every assignment of this accessor's result), but term_process_ref_to_process_id returned uint32_t. Return int32_t, making the six int32_t call sites exact; the remaining sites store the always-positive, 28-bit-bounded value into uint32_t wire words, where the conversion is well-defined and unchanged. Also from the review's style pass: drop the stray blank line in the REFERENCE_PROCESS_PID_OFFSET conditional (the neighboring blocks are compact), fix "an unique" in the term_make_process_reference doc, and state the load-bearing layout invariant there: the ticks occupy the same words as in a short reference, so term_to_ref_ticks works on both shapes. The remaining nits from the same review pass were deliberately not taken: explicit casts in term_make_process_reference would decorate conversions that are well-defined and always in range; initializing the parse_monitor_opts alias_type out-param adds a line for a latent-only risk; the context_dump %lu ticks truncation matches the surrounding pre-existing lines and is best fixed together with them. Signed-off-by: Davide Bettio <davide@uninstall.it>

A message sent to an inactive alias is rejected at the owner's mailbox drain, but both drop sites disposed it with mailbox_message_dispose, which appends the signal storage to the receiver's heap as a fragment: the dead {Ref, Message} tree -- including any refc binaries it pins -- stayed on the victim's heap until its next GC. OTP drops such messages with no receiver heap impact, so a flood of sends to a deactivated alias produced AtomVM-only GC pressure, with the receiver paying for messages it never accepted. The deferred reclamation is only needed for delivered messages, whose terms may still be referenced after a receive. A dropped alias message was never delivered, so nothing can reference its term: free it immediately with mailbox_message_dispose_unsent (which sweeps the mso_list, releasing refc binaries), exactly the treatment unsent messages already get. TermSignal and Message share the layout already asserted for the in-place conversion, and both drop sites run in the owner's own context (from_task = false). The reply-mode and duplicate-option tests now also push a >64-byte binary through each drop path, so the refc sweep is exercised under valgrind and ASAN. Flagged by the pre-merge review (round 2, issue 4). Signed-off-by: Davide Bettio <davide@uninstall.it>

The reply_demonitor branch of context_process_alias_message_signal takes the processes_table read lock from inside the mailbox drain, which is why callers of the with-aliases drain must not hold that lock and why context_destroy stays on the alias-blind variant. The constraint was documented at the mailbox.h API; nothing at the acquisition site itself said so, and the pre-merge review (round 2, issue 5) asked for a guard against a future caller draining with the lock held. A guard already exists, stronger than the suggested debug assertion: on generic_unix smp_rwlock_rdlock aborts on any pthread error, and glibc returns EDEADLK when the calling thread already owns the write lock (probe-verified), so the misuse dies fast at the exact lock call in every build type instead of deadlocking silently. A literal current-thread ownership assertion would need thread-local tracking maintained at all 22 processes_table lock sites across the four platform smp implementations -- out of proportion for a fragility note. State the invariant and its enforcement in a comment at the acquisition site, so the coupling is visible where a future reader would break it. Signed-off-by: Davide Bettio <davide@uninstall.it>

The owner in test_monitor_alias_down_before_send_same_batch busy-waits on whereis by design: receiving would drain its mailbox and defeat the same-batch construction. Under valgrind all threads are serialized with an unfair scheduler, so the spinning owner can starve the relay; the 5M-iteration bound expired once in a full-suite valgrind run (the test passes standalone, on the other flavor, and on rerun, with zero memory errors every time). Raise the bound to 50M -- ten times the scheduling opportunities for the relay -- and say why it is large. The fast path is unchanged: the loop exits as soon as the relay registers. Signed-off-by: Davide Bettio <davide@uninstall.it>

petermm · 2026-06-12T18:13:43Z

AMP:

PR Review — Process aliases for AtomVM

Range reviewed: f90e0ee9a..HEAD (62 commits), branched from
9402220d7 (Merge pull request #2331 … valgrind-spinlock-sched).

Diffstat: ~40 files, +1975 / −173. The series adds OTP process aliases:
erlang:alias/0,1, erlang:unalias/1, erlang:monitor/3 with the
{alias, Mode} option, spawn_opt {monitor, MonitorOpts}, sending to an
alias reference (local + inbound distribution), a new boxed process
reference term, and per-process saturating alias counters.

Verdict

Strong, ship-quality work. The design is sound: aliases are validated in
the owner's own context against the owner's own monitor list (the same
single-owner discipline as alias/0/unalias/1/demonitor), so there is no
cross-process race on alias state. Error paths in do_spawn/monitor are
carefully "allocate-everything-then-publish", the 32-bit reference padding and
size-distinctness are guarded by _Static_assert, and the commit history shows
the tricky cases (same-batch ordering, saturation, wire ordering in
term_compare, REF_SIZE deprecation) were each addressed deliberately. Test
coverage is substantial (tests/erlang_tests/test_monitor.erl +672).

The findings below are mostly pre-existing OOM-robustness gaps that this
feature widens slightly, plus one portability note. None block merge; I'd fix
#1 in this series since the alias code adds to the half-published surface.

Findings

1. (Medium) OOM in `mailbox_send_monitor_signal` leaves a half-published monitor/alias

nifs.c:5187,
helper at mailbox.c:338.

mailbox_send_monitor_signal() returns void and silently drops the signal on
malloc failure (the FIXME at mailbox.c:340 acknowledges this). In
nif_erlang_monitor the call is the publish step for the target's half of the
monitor. If it fails:

other_monitor is leaked,
self_monitor and (new in this PR) alias_monitor are still added right
after at nifs.c:5190-5192,
the caller still receives a reference,
the target never gets the monitor.

So the monitoring process believes it monitors / has an alias to a target that
knows nothing about it. The alias addition makes the inconsistency worse (a
live alias with no monitored half).

Suggested fix — make the helper report failure and clean up at the call site:

--- a/src/libAtomVM/mailbox.h
+++ b/src/libAtomVM/mailbox.h
-void mailbox_send_monitor_signal(Context *c, enum MessageType type, struct Monitor *monitor);
+bool mailbox_send_monitor_signal(Context *c, enum MessageType type, struct Monitor *monitor);

--- a/src/libAtomVM/mailbox.c
+++ b/src/libAtomVM/mailbox.c
-void mailbox_send_monitor_signal(Context *c, enum MessageType type, struct Monitor *monitor)
+bool mailbox_send_monitor_signal(Context *c, enum MessageType type, struct Monitor *monitor)
 {
     struct MonitorPointerSignal *monitor_signal = malloc(sizeof(struct MonitorPointerSignal));
     if (IS_NULL_PTR(monitor_signal)) {
-        // FIXME (pre-existing): this function returns void, so an out-of-memory here is silently
-        // dropped -- the monitor is leaked and the caller believes it was installed. It should return
-        // bool so the caller can free the monitor and raise out_of_memory.
         fprintf(stderr, "Failed to allocate memory: %s:%i.\n", __FILE__, __LINE__);
-        return;
+        return false;
     }
     ...
     mailbox_post_message(c, &monitor_signal->base);
+    return true;
 }

--- a/src/libAtomVM/nifs.c
+++ b/src/libAtomVM/nifs.c
-    mailbox_send_monitor_signal(target, MonitorSignal, other_monitor);
-    globalcontext_get_process_unlock(ctx->global, target);
+    if (UNLIKELY(!mailbox_send_monitor_signal(target, MonitorSignal, other_monitor))) {
+        monitor_destroy(alias_monitor);
+        monitor_destroy(self_monitor);
+        monitor_destroy(other_monitor);
+        globalcontext_get_process_unlock(ctx->global, target);
+        RAISE_ERROR(OUT_OF_MEMORY_ATOM);
+    }
+    globalcontext_get_process_unlock(ctx->global, target);

The other callers (nifs.c:5275 link, dist_nifs.c, resources.c) then need
to either check the return or explicitly ignore it with cleanup. If a full
audit is out of scope for this PR, at minimum guard the new alias-bearing path
above so the alias is not left dangling.

2. (Low) Same OOM class on `mailbox_send` / `mailbox_send_ref_signal`

noproc monitor branch: nifs.c:5133 installs the
explicit_unalias alias before mailbox_send(ctx, down_message_tuple); if
that send OOMs the DOWN is lost but the alias remains (tolerable, since an
explicit-unalias alias lives until unalias/1).
reply_demonitor: context.c:527
mailbox_send_ref_signal(target, DemonitorSignal, …) can drop the remote
demonitor on OOM, leaving the monitored side installed until target exit.

Same root cause as #1 (void mailbox-send helpers). Worth a tracking note rather
than a blocker.

3. (Low, pre-existing) Registered-name `DOWN` writes a temp-heap tuple into the retained signal

context.c:470-476:

BEGIN_WITH_STACK_HEAP(TUPLE_SIZE(2), temp_heap)
term name_tuple = term_alloc_tuple(2, &temp_heap);
...
term_put_tuple_element(signal->signal_term, 3, name_tuple);  // temp-heap term into the signal
mailbox_send(ctx, signal->signal_term);
END_WITH_STACK_HEAP(temp_heap, ctx->global);

mailbox_send copies the term out before the temp heap is torn down, so the
delivered message is fine. But signal->signal_term now holds a pointer into
freed stack-heap storage, and the original signal fragment is later appended to
ctx->heap by mailbox_message_dispose, where a future GC could scan the
dangling slot.

This code is unchanged by this PR (it predates f90e0ee9a; the series only
renamed ref_data.ref_ticks → ref_ticks here), so it is out of scope — noted
for a follow-up. Fix is to build a fresh 5-tuple on the temp heap and send that
instead of mutating signal->signal_term.

4. (Low / portability) In-place re-type of `TermSignal` → `Message`

mailbox.c:458-459
(and the CONTAINER_OF(current, Message, base) disposal casts at lines 403,
472) re-type an AliasMessageSignal in place. The _Static_asserts above the
function prove identical layout, so this works on every current toolchain, but
identical layout does not by itself satisfy C effective-type / strict-aliasing
rules. Low risk; the clean long-term fix is to unify Message and TermSignal
into one heap-backed mailbox term with a union { term message; term signal_term; }. Not needed for merge.

Things verified as correct (not bugs)

Unlocked active_alias_count read in process_outer_list
(mailbox.c:388):
owner-written only, runs in owner context, stable for the batch. Fast path
(no alias) preserves the original single-pass reversal; alias path reverses
to received order first. Sound.
Same-batch ordering: pre-deactivating an alias on a MonitorDownSignal
seen earlier in the batch (mailbox.c:483)
so a later same-batch alias send is dropped; idempotent with the later
context_process_monitor_down_signal. Matches OTP.
Saturating counter (context.h): pins at 0xFFFF and stops decrementing,
so context_find_alias never wrongly short-circuits — correctness over the
optimization. Sound.
Process-reference sizing (term.h): 64-bit = SHORT+1, 32-bit = SHORT+2
with the trailing pad word initialized to nil; all reference kinds stay
size-distinguishable, guarded by _Static_assert. Sound.
reply_demonitor lock discipline (context.c): monitor_pid captured as
an immediate before context_demonitor frees the alias; target taken under
the process-table lock; context_destroy deliberately uses the alias-blind
drain to avoid taking that lock while holding the write lock. No UAF/deadlock.
Inbound distributed alias send (dist_nifs.c, external_term.c): control
tuple kept rooted across payload-decode GC; decode rejects process_id == 0
(the INVALID_PROCESS_ID short-ref sentinel) and > TERM_MAX_LOCAL_PROCESS_ID;
delivery only for node+creation match.
term_compare wire ordering: local process refs laid out as
[ticks_hi, ticks_lo, process_id] to agree with the serialized form and the
external-ref path; the mixed local/external len==3 branch never has both
operands local (guarded by the outer is_external check), so the shared
local_data scratch buffer is not a hazard.
do_spawn: all parsing/allocation/result reservation before any publish,
monitor_destroy on every error path, monitor_destroy(NULL) no-op. No leak
or half-published state found.
REF_SIZE deprecation: no remaining REF_SIZE uses in the tree, so the
_Pragma("GCC warning …") will not fire (safe even under -Werror).
CHANGELOG documents both the feature and the deprecation.

Suggested follow-up tests

Same-batch: alias-send-before-DOWN (delivered) vs DOWN-before-alias-send
(dropped); multiple sends to one reply/reply_demonitor alias in a batch
(first delivered, rest dropped).
OOM fault injection on mailbox_send_monitor_signal at nifs.c:5187 (if a
malloc hook is available) to lock in the bifs_hash.h not found #1 fix.

bettio changed the base branch from main to release-0.7 June 2, 2026 12:10

This comment was marked as outdated.

Sign in to view

github-advanced-security AI found potential problems Jun 5, 2026

View reviewed changes

Comment thread src/libAtomVM/nifs.c Fixed

Comment thread tests/test-enif.c Fixed

This comment was marked as outdated.

Sign in to view

bettio force-pushed the updated-aliases branch from ce79b10 to f6af023 Compare June 7, 2026 19:53

mat-hek and others added 23 commits June 10, 2026 10:33

Add process aliases to the changelog

37d6572

Signed-off-by: Davide Bettio <davide@uninstall.it>

bettio added 19 commits June 10, 2026 10:35

bettio force-pushed the updated-aliases branch from 64e7fb6 to 73bb408 Compare June 11, 2026 13:43

pguyot reviewed Jun 12, 2026

View reviewed changes

bettio added 7 commits June 12, 2026 19:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Updated aliases#2326

Updated aliases#2326
bettio wants to merge 62 commits into
atomvm:release-0.7from
bettio:updated-aliases

bettio commented Jun 2, 2026 •

edited

Loading

Uh oh!

This comment was marked as outdated.

This comment was marked as outdated.

Uh oh!

Uh oh!

This comment was marked as outdated.

petermm commented Jun 8, 2026

Uh oh!

pguyot Jun 12, 2026

Uh oh!

pguyot Jun 12, 2026

Uh oh!

pguyot Jun 12, 2026

Uh oh!

pguyot Jun 12, 2026

Uh oh!

pguyot Jun 12, 2026

Uh oh!

pguyot Jun 12, 2026

Uh oh!

pguyot Jun 12, 2026

Uh oh!

petermm commented Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

bettio commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment was marked as outdated.

This comment was marked as outdated.

Uh oh!

Uh oh!

This comment was marked as outdated.

petermm commented Jun 8, 2026

PR Review — Process aliases (OTP ≥ 26 semantics)

Severity summary

1. Major — spawn_opt silently ignores a malformed {monitor, BadTerm}

2. Major — alias message can be delivered after an earlier DOWN that should have deactivated the alias

3. Minor — ETF process-reference parse accepts owner pid 0

4. Minor — term_compare word order for process refs differs from ETF / external refs

5. Minor — alias→normal conversion drops the message on OOM (reply_demonitor becomes non-transactional)

6. Nit — context_add_monitor frees an alias monitor through the embedded pointer

7. Pre-existing — monitor/3 installs the caller-side monitor after unlocking the target (DOWN race)

8. Pre-existing — mailbox_send_monitor_signal OOM is silently ignored

Positive notes

Suggested follow-up tests

Uh oh!

pguyot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

pguyot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

pguyot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

pguyot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

pguyot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

pguyot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

pguyot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

petermm commented Jun 12, 2026

PR Review — Process aliases for AtomVM

Verdict

Findings

1. (Medium) OOM in mailbox_send_monitor_signal leaves a half-published monitor/alias

2. (Low) Same OOM class on mailbox_send / mailbox_send_ref_signal

3. (Low, pre-existing) Registered-name DOWN writes a temp-heap tuple into the retained signal

4. (Low / portability) In-place re-type of TermSignal → Message

Things verified as correct (not bugs)

Suggested follow-up tests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

bettio commented Jun 2, 2026 •

edited

Loading

1. Major — `spawn_opt` silently ignores a malformed `{monitor, BadTerm}`

2. Major — alias message can be delivered after an earlier `DOWN` that should have deactivated the alias

3. Minor — ETF process-reference parse accepts owner pid `0`

4. Minor — `term_compare` word order for process refs differs from ETF / external refs

5. Minor — alias→normal conversion drops the message on OOM (`reply_demonitor` becomes non-transactional)

6. Nit — `context_add_monitor` frees an alias monitor through the embedded pointer

7. Pre-existing — `monitor/3` installs the caller-side monitor after unlocking the target (DOWN race)

8. Pre-existing — `mailbox_send_monitor_signal` OOM is silently ignored

1. (Medium) OOM in `mailbox_send_monitor_signal` leaves a half-published monitor/alias

2. (Low) Same OOM class on `mailbox_send` / `mailbox_send_ref_signal`

3. (Low, pre-existing) Registered-name `DOWN` writes a temp-heap tuple into the retained signal

4. (Low / portability) In-place re-type of `TermSignal` → `Message`