Smallest AI vs Play HT

Smallest.ai and Play.ht are two notable players in the Text-to-Speech (TTS) market, each offering unique strengths tailored to different user needs. This article compares these platforms across critical metrics like voice quality, pricing, latency, and usability. Smallest.ai stands out with its hyper-realistic voice synthesis, ultra-low latency, and cost-effective pricing, making it ideal for both large-scale applications and real-time use cases. Play.ht, while offering a user-friendly interface and customizable voice options, faces limitations in pricing and scalability, positioning Smallest.ai as the more robust and versatile choice for advanced TTS solutions.

Smallest.ai vs Play.ht, a quick overview

Feature	Smallest.ai	Play.ht
Languages Supported	50+	10+
Total Number of Voices	100+	600+
Voice Quality	Hyper-realistic and tone-matching	High-quality natural voices.
Character Limits	2500 characters on Creator Studio.	12500 characters on the free tier.
Latency	sub-100ms for 10 sec of audio	~300ms for 10 sec of audio
Price	Starts from $0.03 per minute for TTS and $0.045 for voice cloning	Higher pricing, averaging about $0.13 per minute
Voice Cloning	Professional Voice Cloning, with minimal latency of 200ms.	Instant Voice Cloning, but with a latency of up to 5 seconds.
API	API access for all tier users.	API access for all tier users.

Comparing Text to Speech

Text-to-speech quality can be measured on parameters such as voice fidelity, emotional control, language matching, customization, and fine-grained voice control. Below are voice samples from both platforms for the same test text.

The test text contains code-mixing, symbols, and punctuation marks to show how they influence speech patterns in real-life use:

This is a test, right? Let's see how it sounds.

Market में 50% की growth हुई, and profits reached $1 million last quarter.

Let's see how both TTS engines perform.

smallest.ai

Play.ht

Supported languages

Smallest.ai currently supports 50+ languages, with more planned, whereas Play.ht supports 10+ languages.

Size of voice library

Smallest.ai offers 100+ voices across a range of languages and dialects, while Play.ht offers 600+ voices across different languages and dialects.

Latency

Smallest.ai is powered by its Lightning model, which delivers sub-100ms latency for 10 seconds of audio. In comparison, Play.ht takes roughly three times as long, with a latency of approximately 300ms for the same duration.
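Latency figures like these are easiest to trust when you can reproduce them against your own text and network. The sketch below is one way to do that, not the method used for the numbers above: it times the full wall-clock duration of a synthesis call, including any network round trip. `measure_latency_ms` and `fake_synthesize` are hypothetical names introduced here for illustration; swap `fake_synthesize` for a function that wraps the real SDK call you want to measure.

```python
import statistics
import time
from typing import Callable, List


def measure_latency_ms(synthesize: Callable[[str], bytes], text: str, runs: int = 5) -> List[float]:
    """Call synthesize(text) `runs` times and return each duration in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        synthesize(text)
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def fake_synthesize(text: str) -> bytes:
    """Hypothetical stand-in for a real TTS call: sleeps briefly, returns dummy bytes."""
    time.sleep(0.01)
    return b"\x00" * len(text)


if __name__ == "__main__":
    samples = measure_latency_ms(fake_synthesize, "This is a test, right?", runs=5)
    print(f"median: {statistics.median(samples):.1f} ms, max: {max(samples):.1f} ms")
```

Reporting the median over several runs, rather than a single call, reduces the influence of one-off network spikes on the comparison.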

Comparing Voice Cloning

Both platforms provide Instant Voice Cloning and include one free voice clone on their free tier. The most direct test is to clone the same voice on each platform and run TTS with it to compare the results.

Here is the audio that was used as a reference.

The text spoken is:

Wow! That is such an incredible achievement!

Let's listen to the voice clone each platform generated.

smallest.ai

play.ht

Smallest.ai successfully cloned the voice using just 5 seconds of audio, while Play.ht required around 30 seconds. Smallest.ai generated the voice clone with virtually no latency, whereas Play.ht took approximately 5 seconds to produce the clone.

Smallest.ai delivers a more nuanced voice that closely mirrors the speaker's tone, flow, and volume. In contrast, Play.ht generates a natural but coarse voice that tends to capture some background noise.

API Support

Both platforms provide production-grade APIs for businesses to integrate TTS and voice cloning services into their products.

Here is an example of each API in Python. Both smallest.ai and Play.ht provide a Python SDK.

pip install smallestai

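from smallest import Smallest

# Replace the placeholder string with your own API key
client = Smallest(api_key="SMALLEST_API_KEY")
# Synthesize the text and save the audio to a WAV file
client.synthesize("Hello, this is a test for sync synthesis function.", save_as="sync_synthesize.wav")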

Make sure to install the Play.ht library first.

pip install pyht

import os

from dotenv import load_dotenv
from pyht import Client
from pyht.client import TTSOptions

# Load PLAY_HT_USER_ID and PLAY_HT_API_KEY from a local .env file
load_dotenv()

client = Client(
    user_id=os.getenv("PLAY_HT_USER_ID"),
    api_key=os.getenv("PLAY_HT_API_KEY"),
)
options = TTSOptions(voice="s3://voice-cloning-zero-shot/775ae416-49bb-4fb6-bd45-740f205d20a1/jennifersaad/manifest.json")
# Open a file to save the audio
with open("output_jenn.wav", "wb") as audio_file:
    for chunk in client.tts("Hi, I'm Jennifer from Play. How can I help you today?", options, voice_engine="PlayDialog-http"):
        # Write the audio chunk to the file
        audio_file.write(chunk)

print("Audio saved as output_jenn.wav")

Pricing

Smallest.ai and Play.ht both offer premium plans with different scaling and usage options. Play.ht has five tiers plus a custom enterprise plan, along with a free tier that includes one voice clone and a 12,500 character limit. Smallest.ai provides a more affordable alternative: its free tier offers one voice clone and 30 minutes of audio generation, individual plans go up to $50, and business plans vary based on scale and usage. For the most up-to-date information, please refer to their official pricing pages: smallest.ai, play.ht
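To make the price gap concrete, the per-minute figures from the overview table ($0.03 for Smallest.ai TTS, an average of $0.13 for Play.ht) can be turned into a rough monthly estimate. The sketch below assumes a flat per-minute rate and ignores plan tiers, free allowances, and volume discounts, so treat it as a back-of-the-envelope comparison only. The `monthly_cost` helper and the 1,000-minute volume are hypothetical, chosen for illustration.

```python
# Per-minute rates quoted in the overview table (USD).
SMALLEST_TTS_PER_MIN = 0.03
PLAYHT_AVG_PER_MIN = 0.13


def monthly_cost(minutes: float, rate_per_min: float) -> float:
    """Return the cost in USD of `minutes` of audio at a flat per-minute rate."""
    return round(minutes * rate_per_min, 2)


if __name__ == "__main__":
    minutes = 1000  # hypothetical monthly volume
    smallest = monthly_cost(minutes, SMALLEST_TTS_PER_MIN)
    playht = monthly_cost(minutes, PLAYHT_AVG_PER_MIN)
    print(f"Smallest.ai: ${smallest:.2f}, Play.ht: ${playht:.2f}")
```

At 1,000 minutes a month, these rates work out to $30 for Smallest.ai and $130 for Play.ht.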

Conclusion

Smallest.ai emerges as the clear leader with its hyper-realistic voice synthesis, unmatched processing speed, and affordable pricing structure. Its ability to handle real-time applications with low latency and emotional nuance makes it the go-to choice for users seeking top-tier TTS solutions.

In contrast, Play.ht, while user-friendly and offering decent voice customization, struggles with scalability and cost-effectiveness for larger or more complex projects. These limitations, combined with its higher latency compared to Smallest.ai, make Smallest.ai the superior choice for those looking for an innovative, efficient, and cost-effective TTS platform.

Wed Dec 18 2024 • 13 min Read


Kaushal Choudhary