Pipecat x Smallest AI Voice

Voice-enable your pipelines with Pulse speech-to-text and Lightning text-to-speech

OVERVIEW

Real-time speech, native to Pipecat.

Pipecat is the open-source Python framework for building real-time voice and multimodal AI agents. Smallest AI plugs in as first-class STT and TTS services, so your agent hears and speaks through models built for low-latency production conversation, with no custom wrappers and no glue code.

Pulse handles what the caller says: real-time speech-to-text over a WebSocket integration with the Waves API. It streams audio continuously and returns interim and final results with low latency.

Lightning handles what the agent says back: real-time synthesis over WebSocket streaming, with configurable voice parameters and multiple languages. It reconnects cleanly to handle interruptions.
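To make the interim/final distinction concrete, here is a minimal sketch of how an agent might act only on final transcripts while interim ones keep refining. It is an illustration, not the Pipecat or Smallest AI API: the `Transcript` type, the `final_utterances` helper, and the simulated event list are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Transcript:
    """Hypothetical transcript event; not a Pipecat or Smallest AI type."""
    text: str
    is_final: bool


def final_utterances(events: Iterable[Transcript]) -> Iterator[str]:
    """Yield the text of final transcripts only; interim ones are skipped."""
    for event in events:
        if event.is_final:
            yield event.text


# Simulated stream (assumed shape): interim results refine the hypothesis
# until a final result closes the utterance.
events = [
    Transcript("book a", is_final=False),
    Transcript("book a table", is_final=False),
    Transcript("Book a table for two.", is_final=True),
    Transcript("at seven", is_final=False),
    Transcript("At seven tonight.", is_final=True),
]

utterances = list(final_utterances(events))
```

In a real agent, interim results would typically drive UI feedback or early interruption detection, while only the final text is passed on to the language model.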

HOW IT WORKS

Up and running in five steps.
