Use virtual speaker and microphone hardware with models such as speech-to-text and text-to-speech to speak, listen, and respond to audio in an active Workstation.
📄️ Prompt
Send a system-level prompt to the AI agent in the Workstation to perform a task.
📄️ Browser Prompt
Send a browser-related prompt to the AI agent in the Workstation to perform a task.
📄️ Voice Speak
Play voice audio into the virtual microphone via a text-to-speech model. You must provide the exact text for the agent to speak. The audio is byte-streamed and played in real time to reduce latency.
📄️ Voice Listen
Start a session that listens for voice audio from the virtual speakers.
📄️ Voice Question
A helper that speaks text-to-speech audio into the virtual microphone and then listens for a user response. It is expected that you have already called the `VoiceListen` endpoint to listen for and consume the user response.
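The ordering requirement for Voice Question (open a listen session first, then ask) can be sketched as follows. This is only an illustration: `FakeWorkstationVoice`, `voice_listen`, `voice_question`, and `ask` are hypothetical names for an in-memory stand-in, not the real client or endpoint signatures, and raising an error when no listen session exists is an assumption rather than documented behavior.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeWorkstationVoice:
    """Hypothetical in-memory stand-in for the voice endpoints (not the real API)."""

    listening: bool = False
    calls: List[str] = field(default_factory=list)

    def voice_listen(self) -> None:
        # Stands in for the VoiceListen endpoint: opens a listen session.
        self.listening = True
        self.calls.append("VoiceListen")

    def voice_question(self, text: str) -> None:
        # Stands in for the Voice Question helper: speaks `text` and relies on
        # an already-open listen session to capture the reply.
        # Assumption: this sketch rejects the call if no session is open.
        if not self.listening:
            raise RuntimeError("call voice_listen() before voice_question()")
        self.calls.append(f"VoiceQuestion: {text}")


def ask(client: FakeWorkstationVoice, text: str) -> List[str]:
    """Open the listen session first, then ask the question."""
    client.voice_listen()
    client.voice_question(text)
    return client.calls
```

Calling `ask(FakeWorkstationVoice(), "What is your order number?")` records the listen call before the question, which is the order the Voice Question description expects.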