Releases: benoitc/erlang-python
3.0.0 (2026-05-03)
Breaking Changes
- Simplified execution model - Only two public execution modes: `worker` and `owngil`
  - `worker`: Dedicated pthread per context with stable thread affinity (default)
  - `owngil`: Dedicated pthread + subinterpreter with its own GIL (Python 3.14+)
  - Removed `multi_executor` and `free_threaded` from the public API - internal capability detection still tracks Python features
- Removed `py:num_executors/0` - Contexts now use per-context worker threads instead of a shared executor pool, so this function is no longer needed.
- `py:execution_mode/0` returns `worker | owngil` - Based on the `context_mode` application configuration. Previously returned internal capabilities like `free_threaded`, `subinterp`, or `multi_executor`.
- Removed `py:async_stream/3,4` - Streaming async generators was never implemented behind the API and always returned `{error, stream_not_implemented}`. Use `py:stream_start/3,4` for sync generators; async-generator support may return in a later release.
- Removed `num_executors`/`num_async_workers` configuration - Both keys were no-ops after the v3.0 worker rework. Configure context count via `num_contexts` and the rate-limit ceiling via `max_concurrent`.
- Strict context-mode validation at the NIF boundary - `py_nif:context_create/1` now returns `{error, {invalid_mode, Atom}}` for anything other than `worker | owngil`. Previously, callers that bypassed `py_context` (notably `py_reactor_context`) silently mapped any unknown atom, including legacy `auto` and `subinterp`, to worker mode. Code that relied on that loophole must pass `worker` (or `owngil`) explicitly.
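The new boundary behavior can be illustrated with a small stand-in. This Python sketch only mirrors the documented return shapes; the real check lives inside the NIF:

```python
# Illustrative stand-in for the py_nif:context_create/1 mode check.
# Unknown atoms now fail loudly instead of silently mapping to worker.
VALID_MODES = {"worker", "owngil"}

def context_create(mode: str):
    """Return ok/error tuples matching the shapes in the release notes."""
    if mode not in VALID_MODES:
        return ("error", ("invalid_mode", mode))
    return ("ok", {"mode": mode})
```

Legacy atoms such as `auto` and `subinterp` now surface as errors rather than degrading to worker mode.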
Fixed
- `py:async_call/3,4` + `py:async_await/1,2` round-trip - Previously the await receive matched `{py_response, _, _}` while the event loop sent `{async_result, _, _}`, causing every async call to silently time out. Async calls now go directly through `py_event_loop:create_task` and `py_event_loop:await`.
- `py:async_gather/1,2` actually executes - Reimplemented as concurrent `async_call` submission with sequential `async_await`. Returns `{ok, [Result1, ...]}` on success or `{error, {gather_failed, [{Idx, Reason}, ...]}}` if any call fails. The previous implementation returned `gather_not_implemented`.
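The submit-all-then-await-sequentially shape described above can be sketched with a stand-in executor; the function name and tuple shapes are illustrative, not the library's internals:

```python
from concurrent.futures import ThreadPoolExecutor

def gather(executor, fns):
    """Submit every call up front, then await results in order.

    Failures are collected with their index instead of aborting on the
    first error, mirroring the documented {gather_failed, [{Idx, Reason}]}.
    """
    futures = [executor.submit(fn) for fn in fns]   # concurrent submission
    results, failures = [], []
    for idx, fut in enumerate(futures):             # sequential await
        try:
            results.append(fut.result())
        except Exception as exc:
            failures.append((idx, str(exc)))
    if failures:
        return ("error", ("gather_failed", failures))
    return ("ok", results)
```

Because all submissions happen before the first await, total latency tracks the slowest call rather than the sum of all calls.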
- Thread-callback flakes (issue #63) - Six layered defects in the `erlang.call`/`erlang.async_call` plumbing could deliver wrong values to the wrong caller under load. Reads now loop on partial/EINTR with a monotonic deadline; sync writes use a single length-prefixed frame on a dirty I/O scheduler with deadlined non-blocking writes; the sync wire carries the originating callback id and the receiver discards mismatched frames; the async pipe has one writer process per fd with an atomics-bounded mailbox (`?ASYNC_WRITER_MAX_QUEUE = 10000`) and a resumable non-blocking parser on the read end; workers that fail to resync are unlinked from the pool, freed, and bounded by `MAX_POISONED_WORKERS = 64`.
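The length-prefixed, id-tagged framing can be sketched as follows. The layout here (4-byte length, 8-byte callback id) is illustrative; the actual wire format is internal to the NIF:

```python
import struct

def encode_frame(callback_id: int, payload: bytes) -> bytes:
    """One self-delimiting frame: 4-byte big-endian length prefix,
    then an 8-byte callback id, then the payload."""
    body = struct.pack(">Q", callback_id) + payload
    return struct.pack(">I", len(body)) + body

def read_matching(buf: bytes, want_id: int):
    """Walk complete frames in buf, discarding any whose callback id
    does not match - the mismatch-discard behavior from the fix."""
    off = 0
    while off + 4 <= len(buf):
        (length,) = struct.unpack_from(">I", buf, off)
        body = buf[off + 4 : off + 4 + length]
        off += 4 + length
        (cid,) = struct.unpack_from(">Q", body)
        if cid == want_id:
            return body[8:]
    return None
```

Tagging every frame with its originating callback id is what lets the receiver skip a stale frame left over from a timed-out call instead of delivering it to the wrong caller.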
Documentation
- Audited every fenced code block in `README.md` and `docs/*.md` for current-API references. Fixed `Py_GIL_OWN` to `PyInterpreterConfig_OWN_GIL` in `docs/scalability.md`, corrected the `multi_executor` fallback claim in `docs/migration.md`, and repaired a broken `SharedDict` example in `docs/shared-dict.md`.
- New `test/coverage_audit.md` maps every public `py:*` and `erlang.*` API to its test suite. Added cases for `py:cast/4`, `py:async_gather/2`, and `py:dup_fd/1` so each documented API has a regression test.
- New `scripts/lint_doc_snippets.escript` (driven by `make lint-docs` and CI) statically validates every Erlang `py:Fn(/N)` call and parses every Python block in the docs. Snippets that intentionally show removed APIs or REPL output opt out via `<!-- skip-lint -->`.
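The kind of check the snippet linter performs can be sketched in a few lines. The export table below is hypothetical, standing in for the module's real export list:

```python
import re

# Hypothetical export table; the real linter reads the module's exports.
KNOWN_EXPORTS = {("call", 3), ("call", 4), ("stream_start", 3)}

CALL_RE = re.compile(r"py:([a-z_][a-zA-Z0-9_]*)/(\d+)")

def lint_snippet(snippet: str):
    """Return every py:Fun/Arity reference not found in the export table."""
    bad = []
    for fun, arity in CALL_RE.findall(snippet):
        if (fun, int(arity)) not in KNOWN_EXPORTS:
            bad.append(f"py:{fun}/{arity}")
    return bad
```

A static pass like this catches stale API references (for example, calls removed in this release) before they reach published docs.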
Changed
- Per-context worker threads - Each context now gets its own dedicated pthread that handles all Python operations. This provides stable thread affinity for numpy/torch/tensorflow compatibility without needing a shared executor pool.
- Async NIF dispatch - Context operations use async NIFs with message passing instead of blocking dirty schedulers. This improves concurrency under load.
- Request queue per context - Replaced the single-slot request pattern with proper request queues that support multiple concurrent callers.
- No global asyncio policy install on Python 3.14+ - `asyncio.set_event_loop_policy` was deprecated in 3.14 and is removed in 3.16. The Erlang integration's run path already uses `loop_factory=` (`erlang.run/1`, `asyncio.Runner`), so the global policy was only a convenience for bare `asyncio.run()` inside `py:exec`. We now skip the install on 3.14+ to avoid the deprecation warning. On 3.14+ use `erlang.run(main)` or `asyncio.Runner(loop_factory=erlang.new_event_loop)` explicitly. Behavior on Python 3.9-3.13 is unchanged. `erlang.install()` raises `RuntimeError` on 3.14+ (it still emits a `DeprecationWarning` and works on 3.12-3.13).
Removed
- Multi-executor pool (`g_executors[]`, `multi_executor_start/stop`)
- `context_dispatch_call/eval/exec` functions (dead code)
- References to `PY_MODE_MULTI_EXECUTOR` in context operations
- `py_async_pool` legacy gen_server (unused after the async API rewire)
- `priv/_erlang_impl/_ssl.py` (`SSLTransport`, `create_ssl_transport`) - had no importer and was never wired into the asyncio event loop. Removed.
- Internal `py_util` exports `send_response/3`, `normalize_timeout/1`, and `normalize_timeout/2` had no callers anywhere. Removed. The module is marked `@private`; no external API changes.
- Explicit `py:subinterp_*` handle API removed - `py:subinterp_create/0`, `subinterp_destroy/1`, `subinterp_call/4,5`, `subinterp_eval/2,3`, `subinterp_exec/2`, `subinterp_cast/4`, `subinterp_async_call/4`, `subinterp_await/1,2`, and `subinterp_pool_*` are all gone. Use `py_context:new(#{mode => owngil})` instead; it gives the same parallelism with OTP supervision and automatic cleanup. `py:subinterp_supported/0` (capability probe) and `py:parallel/1` (which routes through the context API) stay.
- Internal `py_execution_mode_t` collapsed from 3 values to 2 (`free_threaded`/`gil`); `py_nif:execution_mode/0` returns `free_threaded | gil` instead of the old `free_threaded | subinterp | multi_executor`.
- `examples/reactor_owngil_example.erl` deleted (it called nonexistent `py:subinterp_reactor_*` functions; pre-existing breakage).
v2.3.1
2.3.0
Removed
- ASGI/WSGI Support - The `py_asgi` and `py_wsgi` modules have been removed
  - `py_asgi:run/4,5` - ASGI application runner
  - `py_wsgi:run/3,4` - WSGI application runner
  - For web framework integration, use `py:call` with event loop contexts or the Channel API
  - See the Migration Guide for alternatives
Added
- SharedDict - Process-scoped shared dictionaries for cross-process state
  - `py:shared_dict_new/0` - Create a new SharedDict
  - `py:shared_dict_get/2,3` - Get a value with an optional default
  - `py:shared_dict_set/3` - Set a key-value pair
  - `py:shared_dict_del/2` - Delete a key
  - `py:shared_dict_keys/1` - List all keys
  - `py:shared_dict_destroy/1` - Explicit cleanup
  - Python access via `erlang.SharedDict` with a dict-like interface
  - Mutex-protected for concurrent access (~300k ops/sec)
  - Pickle serialization for complex types
  - See the SharedDict documentation for details
v2.2.0
Added
- OWN_GIL Mode - True parallel Python execution with Python 3.14+ subinterpreters. Each subinterpreter runs with its own GIL in a dedicated thread, enabling true parallelism for CPU-bound workloads.
- Process-Bound Python Environments - Per-Erlang-process Python namespaces with isolated globals/locals that persist across calls.
- Event Loop Pool - `py_event_loop_pool` distributes async tasks with scheduler-affinity routing.
- ByteChannel API - Raw byte streaming without term serialization. Ideal for HTTP bodies, file streaming, and binary protocols.
- PyBuffer API - Zero-copy buffer for WSGI input streams with a file-like interface.
- True streaming API - `py:stream_start/3,4` and `py:stream_cancel/1` for event-driven streaming from Python generators.
- `erlang.whereis(name)` - Look up registered Erlang PIDs from Python.
- `erlang.schedule_inline(callback)` - Inline continuation scheduling.
- `py:spawn_call/3,4,5` - Fire-and-forget with result delivery.
- Explicit bytes conversion - `{bytes, Binary}` tuple for round-trip safety.
- Import caching API - `py:import/1,2`, `py:add_import/1,2`, `py:add_path/1`.
- Per-interpreter preload code - Execute code in new interpreters with inherited globals.
Fixed
- Channel notification for create_task
- Channel waiter race condition
- Event loop isolation and resource safety
- Python 3.14 venv activation
- OWN_GIL safety fixes (mutex leak, deadlock prevention, env validation)
Changed
- `py:cast` is now fire-and-forget (use `py:spawn_call` for results)
- OWN_GIL requires Python 3.14+
- Removed auto-started io pool
- Removed py_event_router
- Config-based initialization for imports/paths
Performance
- Direct NIF channel operations (up to 1760x speedup)
- nif_process_ready_tasks optimization (~15% improvement)
See CHANGELOG.md for full details.
v2.1.0 - Async Task API
Added
- Async Task API - uvloop-inspired task submission from Erlang
  - `py_event_loop:run/3,4` - Blocking run of async Python functions
  - `py_event_loop:create_task/3,4` - Non-blocking task submission with a reference
  - `py_event_loop:await/1,2` - Wait for a task result with timeout
  - `py_event_loop:spawn_task/3,4` - Fire-and-forget task execution
  - Thread-safe submission via `enif_send` (works from dirty schedulers)
  - See the Async Task API docs
- `erlang.spawn_task(coro)` - Spawn async tasks from sync and async contexts
  - Works where `asyncio.get_running_loop()` fails
  - Returns an `asyncio.Task` for optional await/cancel
- Explicit Scheduling API - Control dirty scheduler release from Python
  - `erlang.schedule(callback, *args)` - Release the scheduler, continue via an Erlang callback
  - `erlang.schedule_py(module, func, args, kwargs)` - Release the scheduler, continue in Python
  - `erlang.consume_time_slice(percent)` - Check if the NIF time slice is exhausted
  - `ScheduleMarker` type for cooperative long-running tasks
- Distributed Python Execution - Run Python across Erlang nodes
  - Documentation and a Docker-based demo
  - See the Distributed Execution docs
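The cooperative-scheduling idea behind `consume_time_slice` and `ScheduleMarker` can be sketched generically. Every name below is a stand-in, not the library API; it only shows the work-until-budget-spent-then-yield shape:

```python
import time

def process_cooperatively(items, slice_ms=1.0):
    """Do work until the time-slice budget is spent, then hand back
    the results so far plus the remaining work (a continuation),
    instead of hogging the scheduler for the whole batch."""
    deadline = time.monotonic() + slice_ms / 1000.0
    done = []
    for idx, item in enumerate(items):
        done.append(item * item)            # stand-in unit of work
        if time.monotonic() >= deadline:    # budget exhausted: yield
            return done, items[idx + 1:]
    return done, []                          # finished within the slice
```

The caller reschedules the returned remainder, which is the same contract a `ScheduleMarker`-style continuation expresses.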
Changed
- Event Loop Performance
  - Growable pending queue (256 to 16384 entries)
  - Snapshot-detach pattern to reduce mutex contention
  - Callable cache (64 slots) avoids PyImport/GetAttr per task
  - Task wakeup coalescing
Fixed
- `ensure_venv` always installs deps, even if the venv exists
- `erlang.sleep()` timing in sync context
- `time()` returns a fresh value when the loop is not running
- Handle pooling bugs in ErlangEventLoop
- Task wakeup race causing batch task stalls
v2.0.0
Highlights
- Dual Pool Support - Separate pools for CPU-bound and I/O-bound operations with registration-based routing
- Channel API - Bidirectional message passing between Erlang and Python (8x faster than Reactor for small messages)
- OWN_GIL Subinterpreter Thread Pool - True parallelism with Python 3.12+ subinterpreters
- Reactor API - FD-based protocol handling for building custom servers
- Virtual Environment Management - Automatic venv creation with `py:ensure_venv/2,3`
Added
- `py:ensure_venv/2,3` - Automatic venv creation and activation
- `py:dup_fd/1` - Safe socket handoff from Erlang to Python
- Dual pool support (`default` and `io` pools) with registration-based routing
- Channel API (`py_channel`) for bidirectional message passing
- OWN_GIL subinterpreter thread pool for true parallelism
- `erlang.reactor` module for FD-based protocol handling
- ETF encoding for PIDs and References
- `erlang.send(pid, term)` for fire-and-forget message passing
- Audit hook sandbox blocking fork/exec operations
- Process-per-context architecture
Changed
- `py:call_async` renamed to `py:cast`
- Unified `erlang` Python module (removed the separate `erlang_asyncio`)
- Async worker backend replaced with an event loop model
- `SuspensionRequired` now inherits from `BaseException`
Deprecated
- `py_asgi` module - use the Channel API or Reactor API instead
- `py_wsgi` module - use the Channel API or Reactor API instead
Removed
- Context affinity functions (`py:bind`, `py:unbind`, `py:is_bound`, `py:with_context`, `py:ctx_*`)
- Signal handling support in ErlangEventLoop
- Subprocess support in ErlangEventLoop
Fixed
- Reactor context extending erlang module in subinterpreters
- FD stealing and UDP connected socket issues
- Timer scheduling for standalone ErlangEventLoop
- Subinterpreter cleanup and thread worker re-registration
- ProcessError exception class identity in subinterpreters
See CHANGELOG.md for full details.
v1.8.1
Fixed
- ASGI scope caching bug - HTTP method was not treated as a dynamic field in the scope template cache. This caused incorrect method values when the same path was accessed with different HTTP methods (e.g., GET /path followed by POST /path would return method="GET" for both requests).
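The bug class is worth illustrating: when a scope template is cached per path, every per-request field must be overwritten on each cache hit rather than baked into the template. A minimal sketch (all names hypothetical):

```python
# Hypothetical scope-template cache illustrating the fix: the template
# is keyed by path, so the HTTP method (a per-request field) must be
# set on every hit, never stored in the cached skeleton.
_scope_cache = {}

def build_scope(path: str, method: str) -> dict:
    template = _scope_cache.get(path)
    if template is None:
        template = {"type": "http", "path": path}  # static fields only
        _scope_cache[path] = template
    scope = dict(template)      # copy the cached skeleton
    scope["method"] = method    # dynamic field filled per request
    return scope
```

The pre-fix behavior corresponded to storing `method` inside the cached template, so a `POST` to a previously `GET`-accessed path inherited `method="GET"`.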
v1.8.0
Added
- ASGI NIF Optimizations - Six optimizations for high-performance ASGI request handling
  - Direct Response Tuple Extraction - Extract `(status, headers, body)` directly without generic conversion
  - Pre-Interned Header Names - 16 common HTTP headers cached as PyBytes objects
  - Cached Status Code Integers - 14 common HTTP status codes cached as PyLong objects
  - Zero-Copy Request Body - Large bodies (≥1KB) use the buffer protocol for zero-copy access
  - Scope Template Caching - Thread-local cache of 64 scope templates keyed by path hash
  - Lazy Header Conversion - Headers converted on demand for requests with ≥4 headers
- erlang_asyncio Module - Asyncio-compatible primitives using Erlang's native scheduler
  - `erlang_asyncio.sleep(delay, result=None)` - Sleep using Erlang's `erlang:send_after/3`
  - `erlang_asyncio.run(coro)` - Run a coroutine with ErlangEventLoop
  - `erlang_asyncio.gather(*coros)` - Run coroutines concurrently
  - `erlang_asyncio.wait_for(coro, timeout)` - Wait with a timeout
  - `erlang_asyncio.wait(fs, timeout, return_when)` - Wait for multiple futures
  - `erlang_asyncio.create_task(coro)` - Create a background task
  - Event loop functions: `get_event_loop()`, `new_event_loop()`, `set_event_loop()`, `get_running_loop()`
- Erlang Sleep NIF - Synchronous sleep primitive for Python
  - `py_event_loop._erlang_sleep(delay_ms)` - Sleep using an Erlang timer
  - Releases the GIL during sleep; no Python event loop overhead
- Scalable I/O Model - Worker-per-context architecture
  - `py_event_worker` - Dedicated worker process per Python context
  - Combined FD event dispatch and reselect via the `handle_fd_event_and_reselect` NIF
- New Test Suite - `test/py_erlang_sleep_SUITE.erl` with 8 tests
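Since `erlang_asyncio` mirrors the standard asyncio surface, stock asyncio shows the intended call shapes. Stock `asyncio` is used here purely as a stand-in for `erlang_asyncio`:

```python
import asyncio

async def fetch(i):
    # Stand-in workload; erlang_asyncio.sleep would back this with
    # erlang:send_after/3 instead of the asyncio timer wheel.
    await asyncio.sleep(0)
    return i * 2

async def main():
    # create_task / gather / wait_for carry the same shapes in
    # the erlang_asyncio surface listed above.
    task = asyncio.create_task(fetch(3))
    pair = await asyncio.gather(fetch(1), fetch(2))
    third = await asyncio.wait_for(task, timeout=1)
    return pair + [third]

results = asyncio.run(main())
```

The drop-in surface is the point: code written against these asyncio names runs on the Erlang scheduler by swapping the module.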
Performance
- ASGI marshalling optimizations - 40-60% improvement for typical ASGI workloads
- Eliminates event loop overhead for sleep operations (~0.5-1ms saved per call)
- Sub-millisecond timer precision via BEAM scheduler (vs 10ms asyncio polling)
- Zero CPU when idle - event-driven, no polling
See CHANGELOG.md for full details.
v1.7.1
v1.7.0
Added
- Shared Router Architecture for Event Loops
  - A single `py_event_router` process handles all event loops (both shared and isolated)
  - Timer and FD messages include loop identity for correct dispatch
  - Eliminates the need for per-loop router processes
  - Handle-based Python C API using PyCapsule for loop references
- Isolated Event Loops - Create isolated event loops with `ErlangEventLoop(isolated=True)`
  - Default (`isolated=False`): uses the shared global loop managed by Erlang
  - Isolated (`isolated=True`): creates a dedicated loop with its own pending queue
  - Full asyncio support (timers, FD operations) in both modes
  - Useful for multi-threaded Python applications where each thread needs its own loop
  - See `docs/asyncio.md` for usage and architecture details