Speech to Text (STT)#

Debian packages several speech-to-text (STT) engines from the pre-AI era:
- CMU Sphinx / PocketSphinx. Accuracy is vintage.
- Julius. Difficult to configure on modern desktops.

The AI-era solutions:

Speech-to-text (STT) on a Debian desktop in 2026 has two paths:
- local processing (privacy-focused, uses your hardware), or
- cloud-based processing (requires internet, usually faster and more accurate).

Engines:#

Tool	Processing	Privacy	Accuracy	Best For...
Whisper	Local	High	Excellent	Transcribing recordings/meetings.
VOSK	Local	High	Good	Real-time typing/dictation.
DeepSpeech	Local	High	Moderate	Older systems or specific use cases.

Real-Time Dictation: "Nerd Dictation"

If you want to talk and see the text appear in your text editor (such as LibreOffice or Gedit), Nerd Dictation is the best lightweight tool for Linux. It uses the VOSK engine.

Why it's great: It doesn't need a GPU and is very snappy.

Installation: It's usually a Python script that you clone from GitHub. It depends on python3-vosk.

Workflow: You assign a keyboard shortcut to start/stop the "listening" mode.
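The start/stop workflow can be sketched as a small toggle helper. This is a minimal sketch and not part of Nerd Dictation itself: it assumes the script was cloned to `~/nerd-dictation`, that a VOSK model sits in `~/.config/nerd-dictation/model`, and it uses a hypothetical marker file to remember whether dictation is running. Only the `begin`, `end`, and `--vosk-model-dir` arguments come from Nerd Dictation's own command line.

```python
import subprocess
from pathlib import Path

# Hypothetical locations: adjust to wherever you cloned the script and
# unpacked a VOSK model. Neither path comes from the original text.
NERD_DICTATION = Path.home() / "nerd-dictation" / "nerd-dictation"
VOSK_MODEL_DIR = Path.home() / ".config" / "nerd-dictation" / "model"
# Marker file used by this sketch only, to remember whether we are listening.
STATE_FILE = Path.home() / ".cache" / "dictation-active"


def dictation_command(listening: bool) -> list[str]:
    """Return the command that flips dictation to the opposite state."""
    if listening:
        return [str(NERD_DICTATION), "end"]
    return [str(NERD_DICTATION), "begin", f"--vosk-model-dir={VOSK_MODEL_DIR}"]


def toggle() -> None:
    """Start dictation if it is idle, stop it if it is running."""
    listening = STATE_FILE.exists()
    command = dictation_command(listening)
    if listening:
        subprocess.run(command, check=True)
        STATE_FILE.unlink()
    else:
        STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
        STATE_FILE.touch()
        # "begin" keeps running while it listens, so do not wait for it.
        subprocess.Popen(command)
```

Saving this as a script that ends with a call to `toggle()` and binding it to one keyboard shortcut gives a single key that starts and stops listening.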

GNOME Integrated Solution: "Dictation" (Extension/App)

If you prefer a GUI that integrates with your desktop:

Dictation (by ElioQoshi): Check the GNOME Software center or Flatpak.
It provides a simple "Record" button that sends text directly to your clipboard or focused window.

Amberol / NewsFlash / Decibels: Some of these newer GTK4 apps are beginning to integrate transcription features using Whisper in the background.

Models:#

Model Family	License Type	Commercial Use?	Key Restriction
Mistral	Apache 2.0	Yes	None (Very permissive).
Falcon (TII)	Apache 2.0	Yes	None (Very permissive).
Llama (Meta)	Custom (Open Weights)	Yes (limited)	Services with more than 700M monthly active users need a license from Meta.
Gemma (Google)	Custom	Yes	Usage restrictions apply.
GPT-4/Gemini	Closed Source	No (API only)	You don't own the model.

Noise Cancellation#

STT engines struggle with background hum or fan noise. For better accuracy, install NoiseTorch or the PipeWire noise suppression plugin and use the filtered microphone as the input device.

The Best All-Rounder: OpenAI Whisper (Local)#

Whisper is currently the gold standard for open-source STT. It runs entirely on your machine.

How to get it: The easiest way to run it on Debian is to install the Python package, or to use a standalone C/C++ port such as whisper.cpp.

Requirements: A decent CPU or, ideally, an NVIDIA GPU.

Installation:

sudo apt install ffmpeg pipx                 # ffmpeg decodes the audio; pipx isolates the install
pipx install openai-whisper                  # plain "pip install" is blocked system-wide on current Debian (PEP 668)

Usage: You provide an audio file, and it outputs the text.

whisper recording.mp3 --model medium
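The single-file call extends naturally to a folder of recordings. The following is a minimal sketch that wraps the `whisper` command line, assuming `whisper` is on your `PATH`. The function names and the list of audio suffixes are illustrative choices, not part of Whisper. `--output_format` and `--output_dir` are options of the openai-whisper CLI.

```python
import subprocess
from pathlib import Path

# Illustrative list; ffmpeg can decode many more formats.
AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}


def whisper_command(audio: Path, out_dir: Path, model: str = "medium") -> list[str]:
    """Build the CLI call for one file, writing a plain-text transcript."""
    return [
        "whisper", str(audio),
        "--model", model,
        "--output_format", "txt",
        "--output_dir", str(out_dir),
    ]


def transcribe_folder(folder: Path, out_dir: Path, model: str = "medium") -> list[Path]:
    """Run whisper on every audio file in `folder`; return the files processed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    done = []
    for audio in sorted(folder.iterdir()):
        if audio.suffix.lower() not in AUDIO_SUFFIXES:
            continue
        # check=True stops the batch on the first file whisper fails on.
        subprocess.run(whisper_command(audio, out_dir, model), check=True)
        done.append(audio)
    return done
```

For example, `transcribe_folder(Path("meetings"), Path("transcripts"))` would write one `.txt` file per recording into `transcripts/`. Each call reloads the model, so for large batches it may be faster to pass several files to a single `whisper` invocation.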
