🐬 DolphinGemma: Google’s AI Tries to Understand the Language of Dolphins
Google’s DolphinGemma is helping scientists decode dolphin communication using LLMs and acoustic modeling.

Akshat Mandloi
Updated on December 18, 2025 at 12:50 PM
Introduction: Language Beyond Humanity
We’ve taught machines to speak like humans. But can they help us understand non-human languages? That’s the ambition behind DolphinGemma, a cutting-edge collaboration between Google DeepMind, marine biologists, and acoustic engineers to decode the communication system of dolphins using large language models.
For decades, researchers have known that dolphins—especially Atlantic spotted dolphins (Stenella frontalis)—use complex sequences of whistles, clicks, and burst pulses to interact. What’s remained elusive is whether this amounts to a true “language,” and if it can be translated. DolphinGemma may offer the first concrete tools to make that possible.
🧪 The Science Behind DolphinGemma
Developed by Google and released as an open model in the Gemma family, DolphinGemma adapts foundational principles of LLMs—sequence prediction, contextual embedding, and self-supervised learning—to the acoustic domain of dolphin vocalizations.
Key Techniques:
Waveform-level encoding: Converts dolphin click patterns into numerical representations.
Sequence modeling: Detects recurring acoustic patterns in "whistle chains" and click trains.
Contextual prediction: Learns probable follow-up sounds based on behavioral labels from researchers.
This is not a simple transcription model. DolphinGemma works more like an unsupervised language learner, modeling meaning from sound relationships over time—akin to how humans acquire language.
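The three techniques above can be illustrated at toy scale. The sketch below is purely hypothetical and assumes nothing about DolphinGemma's real architecture: the frame slicing, k-means codebook, and bigram successor table are illustrative stand-ins for waveform-level encoding, a discrete "sound alphabet," and sequence prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

def frame_signal(wave, frame_len=64, hop=32):
    """Slice a 1-D waveform into overlapping frames (waveform-level encoding)."""
    n = 1 + (len(wave) - frame_len) // hop
    return np.stack([wave[i * hop : i * hop + frame_len] for i in range(n)])

def build_codebook(frames, k=8, iters=20):
    """Toy vector quantizer: k-means over frames yields a discrete 'sound alphabet'."""
    centers = frames[rng.choice(len(frames), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((frames[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = frames[labels == j].mean(0)
    return centers

def tokenize(frames, centers):
    """Map each frame to its nearest codebook entry (a discrete token)."""
    return np.argmin(((frames[:, None] - centers[None]) ** 2).sum(-1), axis=1)

# Synthetic stand-in for a dolphin recording: a gated tone ("chirps") plus noise.
t = np.linspace(0, 1, 8000)
wave = np.sin(2 * np.pi * 440 * t) * (np.sin(2 * np.pi * 4 * t) > 0) \
       + 0.05 * rng.standard_normal(t.shape)

frames = frame_signal(wave)
centers = build_codebook(frames)
tokens = tokenize(frames, centers)

# Bigram counts give the most probable follow-up sound for each token --
# the same sequence-prediction objective the article describes, at toy scale.
k = len(centers)
bigrams = np.zeros((k, k))
for a, b in zip(tokens[:-1], tokens[1:]):
    bigrams[a, b] += 1
next_token = bigrams.argmax(axis=1)
print("most likely successor per token:", next_token)
```

A real system would replace the k-means codebook with a learned neural audio tokenizer and the bigram table with a transformer, but the pipeline shape (encode, discretize, predict) is the same.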
🎧 The Dataset: 9+ Terabytes of Dolphin Dialogues
The foundation of this model is one of the largest datasets of wild dolphin sounds ever collected. Thanks to the Wild Dolphin Project (WDP)—which has tracked a single dolphin community in the Bahamas since 1985—researchers had access to:
Over 1,000 hours of underwater recordings
Paired behavioral context logs (e.g., mating, feeding, or aggression)
Recordings captured on high-fidelity stereo hydrophone arrays
This made it possible not only to localize sound sources but also to correlate sounds with behaviors, which is crucial for building a semantic framework for non-human language.
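To make the localization step concrete, here is a minimal sketch of how a stereo hydrophone pair can estimate a source's bearing from time-difference-of-arrival (TDOA) via cross-correlation. The sample rate, hydrophone spacing, and synthetic click below are assumptions for illustration, not details of the WDP setup.

```python
import numpy as np

FS = 48_000           # assumed sample rate (Hz)
SOUND_SPEED = 1500.0  # approximate speed of sound in seawater (m/s)
MIC_SPACING = 1.0     # assumed hydrophone spacing (m)

def tdoa_bearing(left, right):
    """Estimate a far-field source bearing (degrees off broadside)
    from a stereo pair via cross-correlation."""
    corr = np.correlate(left, right, mode="full")
    lag = corr.argmax() - (len(right) - 1)  # lag in samples (negative when left leads)
    delay = lag / FS                        # seconds
    # Far-field geometry: delay = spacing * sin(theta) / c
    sin_theta = np.clip(delay * SOUND_SPEED / MIC_SPACING, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))

# Synthetic click arriving 10 samples earlier at the left hydrophone.
rng = np.random.default_rng(1)
click = rng.standard_normal(200)
left = np.concatenate([click, np.zeros(10)])
right = np.concatenate([np.zeros(10), click])
print(f"estimated bearing: {tdoa_bearing(left, right):.1f} degrees")
```

With two channels this only recovers a bearing; a full array (more than two hydrophones) would be needed to pin down a 3-D position.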
🧠 Model Capabilities: What DolphinGemma Can Actually Do
As of its latest evaluation (March 2025), DolphinGemma has demonstrated:
High precision in classifying click types: It distinguishes over a dozen click variants previously indistinguishable to the human ear.
Sequence prediction: Predicts the next vocalization in a sequence with over 74% accuracy on held-out test data.
Proto-semantic clustering: Suggests clustering of acoustic forms based on interaction context (e.g., “approach call” vs “aggression warning”).
While it’s far from a Rosetta Stone, these are essential first steps toward mapping intent to sound.
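The proto-semantic clustering idea can be illustrated with a small sketch: cluster acoustic embeddings without labels, then check how well the clusters line up with behavioral logs. The embeddings and the "approach"/"aggression" labels below are synthetic stand-ins, not WDP data or DolphinGemma outputs.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-ins for learned acoustic embeddings: two behavioral contexts
# with distinct sound statistics.
approach = rng.normal(loc=0.0, scale=0.5, size=(50, 8))
aggression = rng.normal(loc=2.0, scale=0.5, size=(50, 8))
embeddings = np.vstack([approach, aggression])
behavior = np.array(["approach"] * 50 + ["aggression"] * 50)

def kmeans(x, k=2, iters=25):
    """Minimal k-means returning a cluster label per row of x."""
    centers = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((x[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.stack([
            x[labels == j].mean(0) if (labels == j).any() else centers[j]
            for j in range(k)
        ])
    return labels

clusters = kmeans(embeddings)

# Cluster purity: how well do unsupervised clusters line up with behavior logs?
purity = sum(
    max((behavior[clusters == j] == b).sum() for b in np.unique(behavior))
    for j in np.unique(clusters)
) / len(behavior)
print(f"cluster/behavior purity: {purity:.2f}")
```

High purity on real recordings would be evidence that acoustic form tracks behavioral context, which is the "proto-semantic" signal the article describes.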
🌍 Real-World Implications: From Conservation to AI Generalization
Beyond the curiosity of interspecies communication, DolphinGemma has serious implications:
1. Conservation Biology
By better understanding dolphin signals, we can monitor stress, mating behaviors, and migrations—all without intrusive tagging or sonar disruption.
2. AI Generalization
If LLMs can model non-human communication, they can be repurposed for:
Ancient script translation (like Linear A)
Non-verbal patient communication in healthcare
Signal intelligence in defense
3. Understanding Language Origins
Studying cetacean communication systems gives insight into how language might emerge naturally—informing theories in linguistics, cognitive science, and even AI alignment.
🔍 What Experts Are Saying
“DolphinGemma is one of the first serious efforts to use generative AI for non-human signal interpretation,” says Dr. Denise Herzing, founder of the Wild Dolphin Project.
“It’s the frontier of bioacoustic linguistics,” notes Dr. Florian Metze, an AI speech researcher at Carnegie Mellon.
🛠️ Open Source + Next Steps
DolphinGemma is part of Google’s broader push to open-source frontier AI tools for academic research. The DolphinGemma model weights, training data references, and preprocessing pipelines are expected to be published under a research use license on ai.google.dev and GitHub later this year.
📚 References and Citations
Wild Dolphin Project
✍️ Final Thoughts: Why This Matters
As someone who works in AI and language modeling, watching LLMs leap from token sequences to bioacoustic modeling is nothing short of astonishing. We’re no longer confined to translating between human languages—we’re pushing into an entirely new frontier: understanding minds that evolved entirely apart from us.
DolphinGemma is proof that machine learning isn’t just about automation. It’s about connection.
📌 Want More Like This?
🧬 Sign up for our weekly digest on AI, cognition, and the intersection of tech + nature.