// use cases · voice

Voice AI that
stays in Europe.

Speech-to-text, voicebots and text-to-speech from one provider on EU infrastructure, 99+ languages, sub-second synthesis.

// how it works

The full voice loop, one provider.

Transcription, an LLM and speech synthesis from a single OpenAI-compatible endpoint — so the audio takes one short trip, and only inside the EU.

step 01

Transcribe

whisper-large-v3

Turn calls and audio into text — 99+ languages, 3.2% WER on Spanish, up to 25MB per file. Recordings are processed only on EU infrastructure.

step 02

Understand & respond

deepseek-v4-flash

Summarize, route, answer or drive a voicebot with an LLM over the transcript — tool calling included, so the conversation actually does something.

step 03

Speak

kokoro

Synthesize natural speech in under a second — 67 voices, Spanish included — for real-time voicebots, IVR and accessibility.

// drop-in

Change one line. Keep your stack.

The OpenAI audio endpoints — transcriptions and speech — work as-is. Change the base URL and key and your existing voice code runs on private EU models.

read_the_docs
voice.py
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",
    base_url="https://api.helmcode.com/v1",  # one line changes
)

# 1 · transcribe a call — 99+ languages, stays in the EU
text = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("call.mp3", "rb"),
)

# 2 · synthesize the reply — sub-second, 67 voices
speech = client.audio.speech.create(
    model="kokoro",
    voice="alba",
    input=reply,
)

// why helmcode

Voice without the privacy headache.

Recordings are the most sensitive data you hold — full of PII, and a regulator's favourite. Voice on Helmcode keeps all of it in the EU.

01

Recordings never leave.

Calls, transcripts and synthesized audio are never stored, and never train a model. The PII inside a recording stays your problem to no one.

02

Processed in the EU.

Speech-to-text, the LLM and text-to-speech all run on EU infrastructure — not on US hyperscalers subject to the Cloud Act. GDPR and AI Act native.

03

STT + LLM + TTS, one API.

The full voice stack — transcription, reasoning and synthesis — behind a single OpenAI-compatible endpoint. One vendor, one bill, one network hop.

04

Built for real time.

Sub-second synthesis and fast transcription on dedicated GPUs — low enough latency for live voicebots and IVR, not just batch jobs.

05

No caps on minutes.

Every minute of audio in and out is included. Limits are RPM and concurrency per key — never total tokens, so high call volume isn't a surprise bill.

In production across
  • Contact center / BPO
  • Telco
  • Media & agencies
  • Healthcare
  • HR & recruiting
  • Education
  • AI-native products
In production at

// voice faq

Voice, answered.

What CX, operations and engineering teams ask before moving voice in-house.

Which speech models do you offer?

whisper-large-v3 for transcription (99+ languages, 3.2% WER on Spanish, up to 25MB / ~2 min per file) and kokoro for text-to-speech (82M parameters, sub-second latency, 67 voices including Spanish).

Do you store call recordings or transcripts?

No. Zero logs — audio, transcripts and synthesized speech are never persisted and never train a model. Transcribing recordings stops being a privacy problem.

Is it fast enough for live voicebots?

Yes. On dedicated GPUs, kokoro synthesizes in under a second and transcription runs with low latency — enough for real-time voicebots and IVR, not just batch transcription.

Can I build a full voicebot — STT + LLM + TTS?

Yes, from one provider. Transcribe with whisper-large-v3, reason and respond with an LLM (deepseek-v4-flash, including tool calling), then speak with kokoro — all behind one OpenAI-compatible API.

Does it use the OpenAI audio API?

Yes. The audio.transcriptions and audio.speech endpoints are OpenAI-compatible — change the base URL and key and your existing code works.

What about strict compliance for recordings?

Run on a dedicated GPU or fully on-premise inside your own datacenter — the same API and code, with audio that never leaves your network. Built for contact centers, healthcare and the public sector.

// get started

START BURNING TOKENS

Skip the AI infra work. Deploy your first private inference endpoint today.

Flat rate. EU data. OpenAI API compatible.