Speech-to-text API

Deepgram Nova-2 and Nova-3 hosted in the EU.

Production-grade transcription with the accuracy, the model modes and the language coverage developers expect from Deepgram, served entirely from European infrastructure on renewable energy. Real-time and batch, one API.

The models

GreenPT speech-to-text runs Deepgram Nova-2 and Nova-3 on European infrastructure powered by renewable energy. You get Deepgram-grade accuracy, every model mode and broad language coverage, with your audio processed inside the EU and never retained or used for training.

  • Flagship

    Nova-3

    Deepgram's most accurate model, with real-time multilingual transcription.

    Nova-3 leads on accuracy and handles code-switching across ten core languages in a single stream. Deepgram reports up to a 54.2% lower word error rate on streaming and 47.4% on batch versus competing models. Best for live captioning, agents and meetings.

    • Real-time multilingual code-switching, ten core languages
    • Keyterm prompting and self-serve vocabulary, no retraining
    • Variants: nova-3 (general) and nova-3-medical
  • Broad coverage

    Nova-2

    The widest language and mode coverage, with eleven domain-tuned variants.

    Nova-2 supports 30+ languages and ships purpose-tuned modes for phone calls, meetings, finance, voicemail, video and more. Reach for it when you need a language Nova-3 does not cover yet, or when you need filler-word identification.

    • 30+ languages, Spanish and English code-switching
    • Eleven modes, from phonecall to medical to atc
    • Filler-word identification for clean transcripts
  • 40+ languages supported across Nova-2 and Nova-3
  • Real-time streaming and batch: one API, your choice per request
  • 100% EU-hosted on renewable energy: no US fallback, verifiable residency

Model modes

The same modes Deepgram ships, hosted in the EU.

Pick a domain-tuned model to lift accuracy on your audio. Pass the mode as a parameter on the same endpoint, no separate integration. Every mode runs on European infrastructure.

  • General

    Nova-3

    Everyday transcription across domains.

    general
  • Medical

    Nova-3

    Clinical vocabulary and dictation.

    medical
  • Phone call

    Nova-2

    Narrowband telephony audio.

    phonecall
  • Meeting

    Nova-2

    Multi-speaker rooms and calls.

    meeting
  • Finance

    Nova-2

    Trading-floor and finance terms.

    finance
  • Conversational AI

    Nova-2

    Low-latency voice agents.

    conversationalai
  • Voicemail

    Nova-2

    Short, single-speaker messages.

    voicemail
  • Video

    Nova-2

    Media and broadcast soundtracks.

    video
  • Drive-thru

    Nova-2

    Noisy quick-service ordering.

    drivethru
  • Automotive

    Nova-2

    In-cabin voice commands.

    automotive
  • Air traffic

    Nova-2

    Aviation radio communications.

    atc
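
To make the mode parameter concrete, here is a minimal sketch of turning a model family and a mode from the list above into a single model identifier. It assumes Deepgram-style `<family>-<mode>` names (the page itself names `nova-3` and `nova-3-medical`); the Nova-2 identifiers and the helper `model_id` are illustrative, so check the docs for the exact strings the endpoint accepts.

```python
# Domain modes as tagged in the list above, keyed by model family.
# "general" is treated as the bare family name (for example "nova-3").
DOMAIN_MODES = {
    "nova-3": {"medical"},
    "nova-2": {
        "phonecall", "meeting", "finance", "conversationalai", "voicemail",
        "video", "drivethru", "automotive", "atc",
    },
}


def model_id(family: str, mode: str = "general") -> str:
    """Return an identifier such as 'nova-2-phonecall' (assumed naming pattern)."""
    if family not in DOMAIN_MODES:
        raise ValueError(f"unknown model family: {family!r}")
    if mode == "general":
        return family
    if mode not in DOMAIN_MODES[family]:
        raise ValueError(f"{family} is not listed with a {mode!r} mode")
    return f"{family}-{mode}"
```

For example, `model_id("nova-2", "phonecall")` yields `nova-2-phonecall`, which you would then pass as the model parameter on the request.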

API capabilities

Everything the transcription endpoint does.

  • Real-time streaming

    Stream audio over WebSocket and get partial and final transcripts back with low latency. Built for live captioning and voice agents.

  • Batch transcription

    Send pre-recorded files to the same API for high-throughput, asynchronous processing. One integration covers both streaming and batch.

  • Speaker diarization

    Label who said what across a conversation, built in. No separate model and no extra request.

  • Smart formatting

    Readable output by default: punctuation, capitalisation, numerals, dates and currency formatted as people write them.

  • Keyterm prompting

    Boost recognition of names, products and jargon by passing key terms at request time. No retraining, instant effect.

  • PII redaction

    Detect and redact personal information in the transcript, so sensitive data never lands in your store.

  • Multichannel

    Transcribe each audio channel separately, ideal for two-leg phone calls and stereo recordings.

  • Entity detection

    Surface structured entities like people, places and amounts from the audio, ready for downstream logic.

  • Word-level timestamps

    Every word carries a start and end time and a confidence score, so you can align, search and edit precisely.
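
The capabilities above are typically switched on per request and read back from the response. The sketch below assumes the endpoint accepts Deepgram's public query parameter names (`diarize`, `multichannel`, `detect_entities`, `redact`, `keyterm`) and returns words with `word`, `start`, `end`, `confidence` and `speaker` fields; that pass-through is an assumption, and the sample words are made up for illustration.

```python
from urllib.parse import urlencode


def transcription_query(model, *, diarize=False, multichannel=False,
                        detect_entities=False, redact_pii=False, keyterms=()):
    """Encode the selected capabilities as a URL query string."""
    params = [("model", model)]
    flags = [
        ("diarize", diarize),
        ("multichannel", multichannel),
        ("detect_entities", detect_entities),
    ]
    params += [(name, "true") for name, enabled in flags if enabled]
    if redact_pii:
        params.append(("redact", "pii"))
    # One keyterm parameter per term (assumed to be repeatable).
    params += [("keyterm", term) for term in keyterms]
    return urlencode(params)


def speaker_turns(words):
    """Merge consecutive words from the same speaker into timed turns."""
    turns = []
    for w in words:
        if turns and turns[-1]["speaker"] == w["speaker"]:
            turns[-1]["end"] = w["end"]
            turns[-1]["text"] += " " + w["word"]
        else:
            turns.append({"speaker": w["speaker"], "start": w["start"],
                          "end": w["end"], "text": w["word"]})
    return turns


# Made-up word list in the assumed response shape.
sample_words = [
    {"word": "hello", "start": 0.0, "end": 0.4, "confidence": 0.99, "speaker": 0},
    {"word": "there", "start": 0.4, "end": 0.8, "confidence": 0.98, "speaker": 0},
    {"word": "hi", "start": 1.1, "end": 1.3, "confidence": 0.97, "speaker": 1},
]
```

`transcription_query("nova-3", diarize=True, keyterms=["GreenPT"])` produces a query string to append to the endpoint URL, and `speaker_turns(sample_words)` collapses the three sample words into two turns, one per speaker.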

Language coverage

Transcribe in 40+ languages, Dutch and German included.

Nova-3 transcribes ten languages in real time and switches between them mid-sentence. Nova-2 supports 30+ languages, bringing combined coverage to more than 40. The full, model-by-model list lives in the docs.

Real-time multilingual (Nova-3, code-switching)

  • English
  • Dutch
  • German
  • French
  • Spanish
  • Italian
  • Portuguese
  • Hindi
  • Japanese
  • Russian

Wider coverage (across Nova-2 and Nova-3)

  • Arabic
  • Bulgarian
  • Catalan
  • Chinese (Mandarin)
  • Chinese (Cantonese)
  • Czech
  • Danish
  • Estonian
  • Finnish
  • Flemish
  • Greek
  • Hungarian
  • Indonesian
  • Korean
  • Latvian
  • Lithuanian
  • Malay
  • Norwegian
  • Polish
  • Romanian
  • Slovak
  • Swedish
  • Tamil
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese

Coverage spans 40+ languages and regional variants and keeps growing. Check the docs for the current list and the model that supports each language.
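
One way to act on the split above is to pick the model from the language code. This is a rough sketch: the ISO 639-1 codes correspond to the ten real-time multilingual languages listed for Nova-3, and falling back to Nova-2 for everything else is a conservative assumption, since the wider list is shared across both models and the docs hold the authoritative per-language mapping. `pick_model` is a hypothetical helper.

```python
# ISO 639-1 codes for the ten Nova-3 real-time multilingual languages above.
NOVA3_CORE = {"en", "nl", "de", "fr", "es", "it", "pt", "hi", "ja", "ru"}


def pick_model(language: str) -> str:
    """Prefer Nova-3 for its ten core languages, otherwise fall back to Nova-2."""
    code = language.lower().split("-")[0]  # "de-CH" -> "de"
    return "nova-3" if code in NOVA3_CORE else "nova-2"
```

With this rule, Dutch (`nl`) and Swiss German (`de-CH`) route to Nova-3, while Swedish (`sv`) routes to Nova-2 until the docs confirm otherwise.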

Speech-to-text, in short

Which speech-to-text models does GreenPT use?

GreenPT runs Deepgram Nova-2 and Nova-3. Nova-3 leads on accuracy and real-time multilingual transcription; Nova-2 adds the widest language and mode coverage. You choose the model and mode per request on a single endpoint.

Where is my audio processed?

Entirely on European infrastructure powered by renewable energy. There is no US fallback and no global pipeline. Audio is processed, not retained, and never used to train models.

Do you support real-time and batch?

Yes. Stream audio over WebSocket for live transcription, or post pre-recorded files for batch processing. Both run on the same API, so you integrate once.
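
As an illustration of integrating once, the sketch below derives both a WebSocket URL for streaming and an HTTPS URL for batch from one base address. The host and path in `BASE_URL` are placeholders, not the real endpoint, and the assumption that streaming and batch share the same path and query parameters should be checked against the docs.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Hypothetical placeholder; substitute the endpoint given in the docs.
BASE_URL = "https://api.example.invalid/v1/listen"


def endpoint(base_url: str, *, streaming: bool, **params: str) -> str:
    """Build a wss:// URL for streaming or an https:// URL for batch."""
    parts = urlsplit(base_url)
    scheme = "wss" if streaming else "https"
    return urlunsplit((scheme, parts.netloc, parts.path, urlencode(params), ""))


live_url = endpoint(BASE_URL, streaming=True, model="nova-3")
batch_url = endpoint(BASE_URL, streaming=False, model="nova-2-meeting")
```

Only the scheme differs between the two URLs here; the model and any other options travel as the same query parameters in both cases.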

Which languages can it transcribe?

More than 40 languages across Nova-2 and Nova-3, including Dutch, German, French, Spanish, Italian and English. Nova-3 transcribes ten core languages in real time and switches between them mid-sentence.

How do I get access?

Create an account to get an API key, then point your audio at the endpoint. The docs cover streaming, batch, diarization, the model modes and the per-language model support.
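
A first batch call could look roughly like the sketch below, which prepares the request without sending it. The URL is a placeholder, and the `Bearer` authorization scheme and raw-audio request body are assumptions rather than confirmed details of the API; the docs define the actual header and payload format.

```python
import urllib.request


def batch_request(url: str, api_key: str, audio: bytes,
                  content_type: str = "audio/wav") -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying raw audio bytes."""
    return urllib.request.Request(
        url,
        data=audio,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": content_type,
        },
    )


# Placeholder URL and key; replace both with values from your account and the docs.
request = batch_request(
    "https://api.example.invalid/v1/listen?model=nova-3",
    "YOUR_API_KEY",
    b"\x00\x01",
)
```

Passing `request` to `urllib.request.urlopen` would send it; keep the API key in an environment variable or secret store rather than in source code.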

Ready when you are

Ship voice features without leaving Europe.

Get an API key and start transcribing on Deepgram Nova-2 and Nova-3, hosted in the EU on renewable energy. Real-time and batch, 40+ languages, privacy-first.

  • 100% Renewable
  • EU Hosted
  • GDPR-aligned