Every Word, On the Record: How Pocket Transcribes 12M Minutes a Month

9.63% WER

in the real world against Deepgram's ~28%

10 languages

across all 16 markets

~12M min/month

of audio transcribed

The Problem

With 86,700+ customers on the platform, Pocket records first and transcribes after. That choice puts all the weight on one thing: the quality of the final transcript.

Pocket records in the real world. Far-field rooms, phone calls, several people talking at once. Generic speech-to-text falls apart exactly there: wrong names, missing words, no sense of who said what.

And the stakes are high. Financial calls, board meetings, and personal calls carry sensitive data that has to be redacted, not just transcribed. Users span 10 languages across 16 markets, so one model has to hold accuracy everywhere.

A transcript users can't trust breaks everything downstream: the summary, the action items, the follow-up.

The Solution

Pocket runs Smallest AI's Pulse STT as the transcription layer in its batch pipeline. Because Pocket processes complete files rather than live streams, it uses Pulse in pre-recorded mode, where accuracy is highest.

One pass returns everything: transcript, speaker labels, PII and PCI redaction, punctuation, timestamps, and noise handling. Automatic language detection covers all 10 market languages. Nothing to stitch together as volume grows.
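To make the single-pass output concrete, here is a minimal sketch of how a downstream step might consume it. The `Segment` type and its field names are hypothetical, chosen for illustration only; they are not Pulse's actual response schema, and the redaction marker is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical shape of one transcript segment (not Pulse's real schema)."""
    speaker: str    # speaker label, e.g. "Speaker 1"
    start_s: float  # segment start time in seconds
    end_s: float    # segment end time in seconds
    text: str       # assumed to arrive already punctuated and redacted


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as MM:SS."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def render(segments: list[Segment]) -> str:
    """Turn segments into a readable, speaker-labelled transcript."""
    return "\n".join(
        f"[{format_timestamp(s.start_s)}] {s.speaker}: {s.text}"
        for s in segments
    )


if __name__ == "__main__":
    demo = [
        Segment("Speaker 1", 0.0, 4.2, "The card ends in [REDACTED]."),
        Segment("Speaker 2", 65.5, 70.0, "Thanks, got it."),
    ]
    print(render(demo))
```

Because labels, timestamps, and redaction are assumed to arrive together in each segment, the rendering step needs no alignment or merging logic of its own.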

The Results

The results:

9.63% WER in the real world vs Deepgram's ~28%
Every transcript ships with speaker labels, redaction, punctuation, and timestamps in a single pass
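For readers unfamiliar with the metric: word error rate (WER) is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of words in the reference. The sketch below is the standard textbook computation via word-level edit distance, not the evaluation harness behind the figures above; lowercasing and whitespace tokenization are simplifying assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "please send the signed contract to maria by friday morning"
    hypothesis = "please send the signed contact to maria by friday morning"
    print(word_error_rate(reference, hypothesis))  # one wrong word out of ten
```

In this made-up example a single substituted word in a ten-word reference gives a WER of 0.1, or 10%.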

Pulse holds accuracy across all of Pocket's market languages:

And it stays ahead as conditions degrade. WER by noise band:

Batch, full-file processing scales with volume and has no per-stream limits, so quality stays consistent across every market and use case.
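As a rough sense of scale, the monthly volume can be converted into an average processing rate. This is back-of-the-envelope arithmetic only: it assumes a 30-day month and perfectly even load, neither of which the case study states, and it takes the rounded ~12M figure at face value.

```python
MINUTES_PER_MONTH = 12_000_000  # the "~12M" figure, rounded
DAYS_PER_MONTH = 30             # assumption: 30-day month


def average_audio_rate(audio_minutes: float, days: int) -> float:
    """Minutes of audio per wall-clock minute, assuming even load."""
    wall_clock_minutes = days * 24 * 60
    return audio_minutes / wall_clock_minutes


if __name__ == "__main__":
    rate = average_audio_rate(MINUTES_PER_MONTH, DAYS_PER_MONTH)
    print(f"{rate:.1f} minutes of audio per wall-clock minute")
```

Under those assumptions the pipeline averages roughly 278 minutes of audio for every minute of wall-clock time. Real traffic is presumably burstier than that average.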

Building something that depends on accurate transcription? See the Pulse model card or talk to our team.

Company name

Pocket

Industry

AI Hardware / Productivity Tech

Company size

SMB

Products used

Speech to text (Pulse)

About the company

Pocket, built by Open Vision Engineering, is a wearable capture device. It records conversations, calls, and meetings, then processes the full audio file in batch into a clean transcript and summary. Therapists, realtors, sales reps, and founders use it for hands-free capture across 16 Western markets.

Build the future of voice agent orchestration

Contact sales

311 California Street, Suite 320
San Francisco, CA 94104

Models

Text to Speech

Speech to Text

Speech to Speech

Voice cloning

Agents

Overview

On Prem

Industries

Debt Collection

Healthcare

Real Estate

Small business

E-commerce

Documentation

For Agents

For Models

Resources

Pricing

Blogs

Research

Careers

Voice AI apps

Integrations

Dictionary

Press kit

Initiatives

Startup Grants

Legals

Privacy notice

Terms and conditions

Data processing

User Policy

TCPA compliance

Twitter

Instagram

Youtube

Discord

Substack

Medium

System status operational

We are SOC 2, GDPR, and HIPAA compliant
