🚀 RoboML v0.3.1 — More Serving Options and More Models

We’re excited to announce the release of RoboML v0.3.1, a major step forward in making open-source ML models easier and faster to deploy for robotics applications. This update focuses on real-time performance, multimodal interaction, and expanding RoboML's capabilities across speech, vision, and planning.

✨ What's New

🔌 Real-Time Interaction with WebSockets

RoboML’s HTTP server now supports WebSocket endpoints, enabling bi-directional communication and real-time streaming for responsive robotic applications.

🧠 Streaming LLM/MLLM Support

Language and multimodal models (LLMs and MLLMs, such as those loaded through 🤗 Transformers) can now stream their outputs, which suits interactive use cases such as instruction following and human-robot dialogue.
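
As a rough illustration of how a client might consume a streamed response over a WebSocket, here is a minimal sketch. The endpoint URL, the JSON request shape, and the convention that the server closes the connection at the end of the stream are all assumptions made for this example, not RoboML's documented protocol. The third-party `websockets` package is imported lazily so the offline helper can be tried without a running server.

```python
import asyncio
import json
from typing import AsyncIterator, Callable

# Hypothetical endpoint: check the RoboML docs for the real host, port, and path.
WS_URL = "ws://localhost:8000/ws"


async def collect_stream(
    chunks: AsyncIterator[str],
    on_chunk: Callable[[str], None] = lambda chunk: None,
) -> str:
    """Consume text chunks as they arrive and return the full response."""
    parts = []
    async for chunk in chunks:
        on_chunk(chunk)  # e.g. forward partial text to a UI or a TTS queue
        parts.append(chunk)
    return "".join(parts)


async def stream_over_websocket(url: str, request: dict) -> str:
    """Send one request and gather the streamed reply (assumed message format)."""
    import websockets  # third-party: pip install websockets

    async with websockets.connect(url) as ws:
        await ws.send(json.dumps(request))

        async def incoming() -> AsyncIterator[str]:
            # Assumption: the server closes the socket when the stream is done.
            async for message in ws:
                yield message if isinstance(message, str) else message.decode()

        return await collect_stream(
            incoming(), on_chunk=lambda c: print(c, end="", flush=True)
        )


async def fake_stream() -> AsyncIterator[str]:
    """Stand-in for a server so the helper can be exercised offline."""
    for token in ["Move ", "to ", "the ", "table."]:
        await asyncio.sleep(0)
        yield token


if __name__ == "__main__":
    print(asyncio.run(collect_stream(fake_stream())))
```

Handling each chunk as it arrives, rather than waiting for the full reply, is what lets a robot start speaking or acting before generation finishes.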

🧭 New Planning Model: RoboBrain2.0 by BAAI

We're thrilled to integrate RoboBrain2.0 — a state-of-the-art spatial-temporal reasoning model that enables complex planning with closed-loop feedback and real-time scene understanding.

🔊 New Voices, New Choices: TTS Upgrades

Two powerful TTS models have been added:

Bark by SunoAI: Natural, expressive voice generation.
MeloTTS by MyShell: High-quality multilingual synthesis (EN, ZH, JP, and more).

Whisper STT has also been upgraded to the FasterWhisper (faster-whisper) backend, delivering faster and more accurate transcriptions.
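
For readers unfamiliar with the backend, the sketch below calls the faster-whisper library directly. It does not show RoboML's own wrapper, whose interface is not described here. The model size, CPU/int8 settings, and the `format_segments` helper are illustrative choices for this example.

```python
from typing import Iterable, NamedTuple


class Segment(NamedTuple):
    """Minimal stand-in exposing the fields used below (start, end, text)."""
    start: float
    end: float
    text: str


def format_segments(segments: Iterable) -> str:
    """Render timestamped segments, one per line (hypothetical helper)."""
    return "\n".join(
        f"[{s.start:.2f}s -> {s.end:.2f}s] {s.text.strip()}" for s in segments
    )


def transcribe_file(path: str, model_size: str = "small") -> str:
    """Transcribe an audio file with faster-whisper on CPU."""
    from faster_whisper import WhisperModel  # third-party: pip install faster-whisper

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(path, beam_size=5)
    # `segments` is a lazy generator: decoding happens as it is consumed.
    return format_segments(segments)
```

Calling `transcribe_file("command.wav")` (a placeholder path) would download the model on first use and return one timestamped line per recognized segment.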

🛠️ Improvements & Fixes

Numerous performance enhancements, bug fixes, and stability improvements have been made across the board.

⚠️ Breaking Changes

Support for vector databases and encoding models has been removed to simplify the stack and focus on deployable, robotics-ready models.

📖 Full Changelog

See the full list of changes here:
🔗 0.2.3 → 0.3.1

RoboML is growing as a unified platform for deploying multimodal ML in real-world robotics. With real-time capabilities, speech support, and planning models, v0.3.1 brings us one step closer to true embodied intelligence.

Stay tuned, and happy deploying! 🤖💬📦
