feat: Add extension support for EZ-AI TW TTS by samx81 · Pull Request #2100 · TEN-framework/ten-framework

samx81 · 2026-03-11T02:30:45Z

Summary

Add extension for EZ-AI TW TTS.
Implemented to satisfy specific vendor requirements for the Taiwanese user base.

Type of Change

Testing

Tests added/updated
All tests pass
Manual testing completed

Documentation

Documentation updated
Examples provided if needed

Breaking Changes

No Breaking Changes

github-actions · 2026-06-30T11:38:14Z

Code Review — EZ-AI TW TTS extension (ezai_tw_tts_python)

Thanks for the contribution. The extension follows the AsyncTTS2BaseExtension pattern well and the request lifecycle (start → audio data → end → usage metrics) is handled cleanly. I found a few issues worth addressing before merge.

Blocking bugs

1. to_str() signature mismatch — crashes on init. config.py defines def to_str(self) -> str: (no args), but extension.py calls self.config.to_str(sensitive_handling=True). This raises TypeError on every init, which is swallowed and reported as a fatal init error. Match the sibling extensions (e.g. gradium_tts_python) and accept sensitive_handling: bool = True.
2. speed, denoise, and zh_model are read from the wrong place. update_params() promotes only url, voice, sample_rate, channels, and sample_width out of self.params, but request_tts builds the payload with getattr(self.config, 'speed', 0.8), getattr(self.config, 'denoise', False), getattr(self.config, 'zh_model', ''). As a result, speed is not a declared field and is never promoted, so it always falls back to 0.8 and params.speed in property.json is ignored; denoise and zh_model are declared fields, but update_params() never copies the params values into them, so config-file values are ignored and the field defaults win. Net effect: three documented tuning params don't take effect. Pick one source of truth and promote all three in update_params().
3. denoise default contradicts itself. The config.py default is True, property.json says false, and the payload fallback is False. Reconcile.
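
A minimal sketch of the config plumbing suggested above, not the extension's actual code. EzaiTTSConfig, PROMOTED, and the api_key entry are hypothetical stand-ins (the real class lives in config.py and may be a pydantic model with more fields); only to_str(sensitive_handling=...), update_params(), and the speed / denoise / zh_model names come from the PR. The sketch uses a single default for denoise and logs rejected values instead of dropping them silently.

```python
import json
import logging
from dataclasses import asdict, dataclass, field
from typing import Any

logger = logging.getLogger(__name__)

# Keys promoted from `params` onto the config, with the accepted types.
# Hypothetical subset: the real extension also promotes sample_rate etc.
PROMOTED = {
    "url": str,
    "voice": str,
    "speed": (int, float),
    "denoise": bool,
    "zh_model": str,
}


@dataclass
class EzaiTTSConfig:
    """Hypothetical stand-in for the config class in config.py."""

    url: str = ""
    voice: str = ""
    speed: float = 0.8
    denoise: bool = False  # one default, shared with property.json
    zh_model: str = ""
    params: dict[str, Any] = field(default_factory=dict)

    def update_params(self) -> None:
        """Copy every promoted key from params onto the matching field."""
        for key, expected in PROMOTED.items():
            if key not in self.params:
                continue
            value = self.params.pop(key)
            # bool is a subclass of int, so reject it for non-bool fields.
            wrong_type = not isinstance(value, expected) or (
                isinstance(value, bool) and expected is not bool
            )
            if wrong_type:
                logger.warning("ignoring params.%s=%r", key, value)
                continue
            setattr(self, key, value)

    def to_str(self, sensitive_handling: bool = True) -> str:
        """Serialize the config, masking secrets when requested."""
        data = asdict(self)  # deep copy, so masking never touches self
        if sensitive_handling and data["params"].get("api_key"):
            data["params"]["api_key"] = "***"  # api_key is hypothetical
        return json.dumps(data)
```

With this shape, request_tts could read self.config.speed directly instead of going through getattr with a fallback, so a missing promotion fails loudly in a params test rather than silently using a default.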

Correctness / robustness

cancel_tts can double-send audio_end. cancel_tts sends audio_end(INTERRUPTED) + usage metrics; the in-flight request_tts loop then breaks on the cancel event and its finally block sends a second audio_end(REQUEST_END) + metrics for the same request_id. Guard the finally path with current_request_finished / the cancel event.
Per-segment blocking HTTP hurts latency. Each sentence is a separate synchronous requests.post in asyncio.to_thread, fully buffered before yielding, with a hardcoded 60s timeout. For multi-sentence input this serializes round-trips and inflates TTFB. Consider an async HTTP client (other extensions use aiohttp/websockets) and make the timeout configurable.
Module-level heavy init. opencc.OpenCC(...), ZhNormalizer(), SentenceSegmenter(...) run at import time. Move into on_init (consider asyncio.to_thread) so load failures surface through the normal init error path and don't block the event loop.
Fire-and-forget dump writes. asyncio.create_task(...write(frames)) is never awaited or tracked — tasks can be GC'd, write out of order, or race the flush(). Await or collect-then-await before flush.
Silent drops on bad params. The except (TypeError, ValueError) branches del the key without setting the field or logging. Add a warning so bad property files are debuggable.
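
For the double audio_end issue above, one way to guard the finally path is to record which requests have already been finished. This is a sketch under assumptions: RequestTracker, fake_send, and the string reasons are hypothetical names for illustration, and the real extension would call its own audio_end and metrics senders.

```python
import asyncio


class RequestTracker:
    """Hypothetical helper: sends the end-of-audio event at most once per request."""

    def __init__(self, send_audio_end):
        self._send_audio_end = send_audio_end
        self._finished: set[str] = set()

    async def finish(self, request_id: str, reason: str) -> bool:
        """Send audio_end unless this request was already finished."""
        if request_id in self._finished:
            return False
        # Mark before awaiting so a concurrent caller sees the request as done.
        self._finished.add(request_id)
        await self._send_audio_end(request_id, reason)
        return True


async def demo() -> list[tuple[str, str]]:
    sent: list[tuple[str, str]] = []

    async def fake_send(request_id: str, reason: str) -> None:
        sent.append((request_id, reason))

    tracker = RequestTracker(fake_send)
    cancel = asyncio.Event()

    async def request_tts(request_id: str) -> None:
        try:
            await cancel.wait()  # stands in for the loop breaking on cancel
        finally:
            await tracker.finish(request_id, "REQUEST_END")

    task = asyncio.create_task(request_tts("req-1"))
    await asyncio.sleep(0)  # let the request start
    await tracker.finish("req-1", "INTERRUPTED")  # the cancel_tts path
    cancel.set()
    await task
    return sent
```

Running demo() records a single INTERRUPTED event for req-1; the later call from the finally block is a no-op. A real implementation would also need to drop finished ids at some point so the set does not grow without bound.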

Style / packaging

Unpinned + git dependency. requirements.txt pins nothing and pulls git+https://github.com/samx81/text_utils.git from an unpinned branch. Per repo conventions, pin versions (and pin the git dep to a tag/SHA). The personal-repo git dependency is a supply-chain / reproducibility concern for a vendored extension — worth deciding whether it should be vendored or moved under the org.
README copy-pasted from vibevoice. Title/body still say vibevoice_tts_websocket_python; docs describe a websocket endpoint but the code does HTTP POST over https; the example JSON has a trailing comma after zh_model (invalid JSON); params.speed is documented but not wired up (bug 2). Please regenerate for this extension.
Leftover artifacts. Dump prefix is vibevoice_dump_{id}.pcm (should be ezai-specific); commented-out zh_model line; manifest includes **.tent and BUILD.gn that don't exist here; manifest.json and property.json are missing trailing newlines.
No tests. No tests/ dir; checklist has tests unchecked. Sibling HTTP TTS extensions (e.g. cartesia_tts) ship test_basic / test_params / test_error_msg / test_robustness. Bugs 1 and 2 would have been caught by a basic init/params test — please add at least smoke + params round-trip coverage.
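
The trailing-comma problem in the README example is cheap to catch mechanically. The check below is a sketch: the two JSON strings are placeholder values written for illustration, not the README's actual example, and is_valid_json is a hypothetical helper.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as strict JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Placeholder snippets: the first has a trailing comma after zh_model.
with_trailing_comma = '{"params": {"speed": 0.8, "zh_model": "",}}'
fixed = '{"params": {"speed": 0.8, "zh_model": ""}}'

print(is_valid_json(with_trailing_comma), is_valid_json(fixed))
```

A similar check over property.json and any JSON snippets in the README could sit alongside the smoke and params tests.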

Security

Default url points at matcha.ezai-k8s.freeddns.org — confirm this is the intended public default, not an internal host.
No auth/token handling. If the endpoint needs a key later, wire it through params with sensitive masking in to_str like the other extensions.

Overall the structure is solid and close, but bugs 1 and 2 mean the extension won't initialize and won't honor most of its configuration as written. A focused pass on config plumbing plus a small test would get this in good shape.

Sam and others added 2 commits March 10, 2026 18:34

feat: add ezai taiwanese tts extension

90ab880

Merge branch 'TEN-framework:main' into feat/twtts_ext

3f993ed

samx81 requested review from halajohn and plutoless as code owners March 11, 2026 02:30

Merge branch 'main' into feat/twtts_ext

466d105

samx81 wants to merge 3 commits into TEN-framework:main from samx81:feat/twtts_ext
