feat: add Mistral Voxtral TTS extension by TiagoAgora · Pull Request #2194 · TEN-framework/ten-framework

TiagoAgora · 2026-06-24T16:32:29Z

Summary

Adds a new TTS extension, mistral_tts_python, integrating Mistral's Voxtral text-to-speech via the OpenAI-compatible /v1/audio/speech endpoint.

Built on AsyncTTS2HttpExtension (HTTP TTS mode), following the extension development guide.
Requests a self-describing WAV stream and converts it to PCM16 mono on the fly (handles int16/int24/int32 and IEEE-float payloads); audio output at 24 kHz.
Forwards vendor params through unchanged (model, voice_id, ref_audio, …); API-key auth via the Authorization header.
Handles cancellation/flush, content-moderation/auth errors, and TTFB metrics.
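
As a rough illustration of the request shape described above, the sketch below assembles the URL, headers, and JSON body for one synthesis call without sending it. The helper name `build_speech_request`, the `input` field name (borrowed from the OpenAI-style convention), and the `Bearer` scheme are assumptions rather than details confirmed by this PR; `model` and `voice_id` are the vendor params named in the summary, and the values used here are placeholders.

```python
def build_speech_request(api_key, text, params, base_url="https://api.mistral.ai"):
    """Assemble (url, headers, payload) for a /v1/audio/speech call.

    Hypothetical helper: the "input" key and the Bearer scheme are assumptions
    based on the OpenAI-compatible convention, not taken from the extension.
    """
    url = base_url.rstrip("/") + "/v1/audio/speech"
    headers = {
        "Authorization": f"Bearer {api_key}",  # API-key auth via the Authorization header
        "Content-Type": "application/json",
    }
    payload = dict(params)   # vendor params are forwarded unchanged
    payload["input"] = text  # assumed name of the text field
    return url, headers, payload


url, headers, payload = build_speech_request(
    "sk-placeholder",
    "Hello from TEN.",
    {"model": "<model-name>", "voice_id": "<voice-id>"},
)
```

Copying `params` before adding the text keeps the caller's configuration dict untouched between requests.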

Testing

Standalone unit tests pass in the dev container (tman install --standalone + tests/bin/start): 13/13 passing — covering config defaults/validation, URL/base_url resolution, headers, audio dump, flush/cancel, invalid-key handling, reconnect robustness, metrics, and the request state machine.

Live API validation against api.mistral.ai has not been run yet.

YiminW · 2026-06-29T13:42:09Z

Hi Tiago, please run tts-guarder for your new TTS extension and post a snapshot here showing all cases passing.

TiagoAgora · 2026-06-29T16:01:58Z

TTS guarder results — `mistral_tts_python`

Ran task tts-guarder-test EXTENSION=mistral_tts_python CONFIG_DIR=tests/configs against the live Mistral / Voxtral API.

14 passed · 1 skipped · 1 failed

⏭️ test_subtitle_alignment — skipped; Voxtral exposes no word-level timing.

❌ test_interleaved_requests — traced to a pre-existing bug in ten_ai_base (tts2.py), not this extension. When finish_request() releases buffered interleaved requests it pre-sets _processing_request_id, so _process_input_queue() skips the QUEUED → PROCESSING transition. The later QUEUED → FINALIZING is then rejected as an invalid transition, so tts_audio_end is never emitted and the request queue stalls until timeout (log: Invalid state transition … queued -> finalizing). With a one-line fix — transition to PROCESSING when the dequeued item is still QUEUED — the test passes in ~30s. This affects every HTTP-TTS extension, so I've kept it out of this PR. Happy to open a separate ten_ai_base issue/PR.
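
The failure mode can be reproduced with a toy state machine. This is a simplified model written for illustration, not the actual ten_ai_base code: the class, the `dequeue` helper, and the `apply_fix` flag are all hypothetical, and only the state names and the skipped QUEUED → PROCESSING transition come from the description above.

```python
from enum import Enum


class RequestState(Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    FINALIZING = "finalizing"


# Only these transitions are legal in this toy model.
ALLOWED = {
    (RequestState.QUEUED, RequestState.PROCESSING),
    (RequestState.PROCESSING, RequestState.FINALIZING),
}


class Request:
    def __init__(self, request_id):
        self.request_id = request_id
        self.state = RequestState.QUEUED

    def transition(self, new_state):
        """Apply the transition if legal; return whether it was accepted."""
        if (self.state, new_state) not in ALLOWED:
            return False
        self.state = new_state
        return True


def dequeue(request, processing_request_id, apply_fix):
    """Hypothetical stand-in for the dequeue step of the input queue."""
    is_new_request = request.request_id != processing_request_id
    # Without the fix, a request whose id was pre-set as "processing"
    # is treated as already started and never leaves QUEUED.
    if is_new_request or (apply_fix and request.state is RequestState.QUEUED):
        request.transition(RequestState.PROCESSING)


def run(apply_fix):
    req = Request("r2")
    # The previous request finished and pre-set the processing id to "r2".
    dequeue(req, processing_request_id="r2", apply_fix=apply_fix)
    return req.transition(RequestState.FINALIZING)


print(run(apply_fix=False))  # False: queued -> finalizing is rejected
print(run(apply_fix=True))   # True: the request passes through PROCESSING first
```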

Three issues the guarder surfaced in this extension are fixed in this branch:

httpx[http2] dependency — the client uses http2=True; without the h2 package it failed to initialize on every request.
dump_path added to the basic_audio_setting{1,2} test configs (the guarder reads config["dump_path"]).
mistral_tts_python added to the fixed-sample-rate allowlist in tests/bin/start (Voxtral emits a fixed 24 kHz, like openai_tts2 / humeai).
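
For context on the first item: the PR fixes the problem by declaring the dependency. A defensive alternative, shown here only as a sketch and not something this branch does, is to probe for the `h2` package and enable HTTP/2 only when it is importable. The helper names are hypothetical.

```python
import importlib.util


def http2_supported() -> bool:
    """Return True when the optional h2 package can be imported."""
    return importlib.util.find_spec("h2") is not None


def client_kwargs() -> dict:
    """Keyword arguments for an HTTP client, falling back to HTTP/1.1 without h2."""
    return {"http2": http2_supported()}


print(client_kwargs())
```

Declaring `httpx[http2]` in the requirements, as the PR does, is the simpler choice when HTTP/2 is always wanted; the probe is only useful if a silent fallback is acceptable.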

Add the mistral_tts_python TTS extension (AsyncTTS2HttpExtension) for Mistral's OpenAI-compatible /v1/audio/speech endpoint. Streams the WAV response and converts it to PCM16 mono at 24 kHz. Includes config, client, addon, manifest/property, README, and unit tests (13 passing). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Switch the requested response_format from wav to pcm. Voxtral's pcm is a headerless float32 LE stream at 24 kHz mono, so the client now rescales each float32 sample to int16 (Float32ToPcm16) instead of parsing a WAV container. This lowers time-to-first-audio (no header to buffer) and drops the WAV chunk-parsing path. Non-finite samples map to silence so a corrupt stream can't crash conversion. Tests updated to stream headerless float32 pcm mocks.
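
A minimal sketch of that conversion, assuming samples are nominally in [-1.0, 1.0] and scaled by 32767 (the exact scaling and clamping used by the extension are not stated here). The class name is hypothetical and is not the extension's `Float32ToPcm16`. Because HTTP chunks can end in the middle of a 4-byte sample, the converter carries leftover bytes over to the next chunk.

```python
import math
import struct


class StreamingFloat32ToPcm16:
    """Convert a headerless float32 LE stream to int16 LE, chunk by chunk."""

    def __init__(self):
        self._rest = b""  # bytes of a partial sample carried between chunks

    def feed(self, chunk: bytes) -> bytes:
        buf = self._rest + chunk
        n = len(buf) // 4              # number of complete float32 samples
        self._rest = buf[n * 4:]
        samples = struct.unpack(f"<{n}f", buf[: n * 4])
        out = []
        for s in samples:
            if not math.isfinite(s):   # NaN/inf map to silence
                s = 0.0
            s = max(-1.0, min(1.0, s))  # clamp before scaling (assumed range)
            out.append(int(round(s * 32767)))
        return struct.pack(f"<{n}h", *out)


conv = StreamingFloat32ToPcm16()
raw = struct.pack("<4f", 0.0, 1.0, -1.0, float("nan"))
pcm = conv.feed(raw[:5]) + conv.feed(raw[5:])  # chunk boundary splits a sample
print(struct.unpack("<4h", pcm))
```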

- requirements: depend on httpx[http2] so the h2 package is installed (client uses http2=True; without h2 it failed to initialize on every request)
- tests/configs: add dump/dump_path to basic_audio_setting1/2 (guarder reads config["dump_path"])
- tts_guarder: add mistral_tts_python to the fixed-sample-rate allowlist in tests/bin/start (Voxtral emits a fixed 24 kHz, like openai/humeai)

…rement
- .env.example: add MISTRAL_API_KEY and MISTRAL_TTS_VOICE with a note that the available voices vary by account (and how to list them)
- README: the live cloud API requires a voice (or ref_audio); fix the example voice (casual_male does not exist) and document the ${env:MISTRAL_TTS_VOICE} used by the test configs

TiagoAgora requested review from halajohn and plutoless as code owners June 24, 2026 16:32

TiagoAgora mentioned this pull request Jun 24, 2026

feat: add Mistral Voxtral TTS extension TiagoAgora/ten-framework#1

Closed

YiminW reviewed Jun 25, 2026


Comment thread ai_agents/agents/ten_packages/extension/mistral_tts_python/config.py

TiagoAgora force-pushed the feat/mistral-voxtral-tts branch 2 times, most recently from b313a08 to 1064738 Compare June 29, 2026 13:07

Tiago Peres de Sousa and others added 6 commits June 30, 2026 15:57

docs: document HTTP TTS client contract and audio format gotcha

c9f8785

chore(mistral_tts): add pyproject.toml and include in publish list

0538a6b

TiagoAgora force-pushed the feat/mistral-voxtral-tts branch from 7114e00 to 0538a6b Compare June 30, 2026 14:57

wangyoucao577 approved these changes Jul 1, 2026


wangyoucao577 merged commit dc488ff into TEN-framework:main Jul 1, 2026
27 of 41 checks passed

wangyoucao577 merged 6 commits into TEN-framework:main from TiagoAgora:feat/mistral-voxtral-tts
