
RadioTranscriber: Real-Time Public Safety Radio Transcription with Whisper AI

Over in our new forums, user Nite has shared a new open-source project that he's created called RadioTranscriber, a real-time speech-to-text tool for public safety radio feeds using OpenAI’s Whisper large-v3 model. The idea is to take live scanner audio, such as authenticated streams from Broadcastify, and continuously turn it into readable text with minimal babysitting. The project grew out of earlier experiments with Radio Transcriptor, which we posted about back in June, but quickly evolved into a more robust, long-running setup with better audio conditioning and fewer of Whisper’s common hallucinations.

Under the hood, RadioTranscriber is a Python script that pulls in a live stream, cleans it up with filtering, normalization, and WebRTC VAD, then runs Whisper large-v3 with beam search for transcription. A set of custom “hallucination guards” strips out common junk text and replaces alert tones with simple markers, while daily log rotation and basic memory management let it run unattended for long periods, even on a modest CPU-only machine. Although it’s tuned to the author’s local dispatch style, the config and prompt are easy to adapt, and the full code is available on GitHub for anyone who wants to experiment or build on it.
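
To give a rough idea of what a text-level "hallucination guard" might look like, here is a minimal sketch in Python. It is not taken from the RadioTranscriber code. The junk phrase list, the function names `guard` and `collapse_repeats`, and the repeat limit are all assumptions made for illustration, based on the subtitle-style filler that Whisper is known to produce on silence or noise.

```python
import re

# Hypothetical junk phrases (an assumption, not the project's actual list).
JUNK_PATTERNS = [
    r"thanks? (you )?for watching[.!]*",
    r"please subscribe[.!]*",
]
_JUNK_RE = re.compile("|".join(f"(?:{p})" for p in JUNK_PATTERNS), re.IGNORECASE)


def collapse_repeats(text, max_repeats=2):
    """Limit runs of the same word repeated back to back."""
    out = []
    run = 0
    for word in text.split():
        if out and word.lower() == out[-1].lower():
            run += 1
        else:
            run = 1
        if run <= max_repeats:
            out.append(word)
    return " ".join(out)


def guard(text):
    """Strip junk phrases and stuck repeats; return None if nothing is left."""
    cleaned = _JUNK_RE.sub("", text)
    cleaned = collapse_repeats(cleaned)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned or None


if __name__ == "__main__":
    print(guard("Engine 5 respond to 12 Main Street. Thanks for watching!"))
    print(guard("copy copy copy copy that"))
```

A transcriber loop could pass each Whisper segment through `guard` and skip logging when it returns `None`. A real deployment would tune the phrase list to whatever junk shows up in its own logs.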

How OpenAI's Whisper Works

Real Time Speech to Text from Radio Speech via DragonOS, SDR4Space, Mosquitto and WhisperCPP

Real time, high quality speech to text is now possible with WhisperCPP, a high-performance, open source implementation of OpenAI's Whisper automatic speech recognition model.

In his latest video on YouTube, Aaron demonstrates how to use his latest DragonOS image to transcribe audio from a radio voice channel that is received with an RTL-SDR. He makes use of SDR4Space as the command line receiver, WhisperCPP as the AI transcriber and Mosquitto for monitoring WhisperCPP outputs and displaying the text to the terminal.

Here's a short video showing exactly how to set up and run SDR4space in such a way that real time IQ captures are demodulated and fed to WhisperCPP (High-performance inference of OpenAI's Whisper automatic speech recognition (ASR) model) for transcribing.

The latest DragonOS FocalX R28 comes w/ everything needed to do exactly what I show in this video, to include a sample tiny model.

You'll notice in the video that jobs are placed in a queue for continued captures and results are also sent over to Mosquitto MQTT where a client can see messages as they are created.

I chose to use an RTLSDR v3 dongle for the capture, but it's possible to configure SDR4space to use a variety of soapy supported SDRs.
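
The queue-then-publish flow described above can be sketched roughly as follows. This is an illustration only, not the code shipped with DragonOS or SDR4space. The topic name `sdr/transcripts`, the JSON message layout, and the `transcribe` and `publish` callables are all hypothetical stand-ins, so the sketch runs with only the Python standard library.

```python
import json
import queue
import threading

TOPIC = "sdr/transcripts"  # hypothetical topic name


def worker(jobs, transcribe, publish):
    """Process capture jobs in order and publish each transcript."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: stop the worker
            jobs.task_done()
            break
        text = transcribe(job)
        publish(TOPIC, json.dumps({"capture": job, "text": text}))
        jobs.task_done()


def run_pipeline(captures, transcribe, publish):
    """Queue every capture, then wait for the worker to drain the queue."""
    jobs = queue.Queue()
    thread = threading.Thread(target=worker, args=(jobs, transcribe, publish))
    thread.start()
    for capture in captures:
        jobs.put(capture)
    jobs.put(None)
    thread.join()


if __name__ == "__main__":
    # Stand-ins: a fake transcriber and a publisher that prints to the terminal.
    run_pipeline(
        ["capture_001.wav", "capture_002.wav"],
        transcribe=lambda path: f"(transcript of {path})",
        publish=lambda topic, payload: print(topic, payload),
    )
```

In a real setup, `transcribe` would wrap a call to WhisperCPP and `publish` could be the `publish` method of an MQTT client (for example paho-mqtt) connected to the Mosquitto broker. Because only the worker thread transcribes, new captures can keep arriving while an earlier one is still being processed.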

In his first video Aaron shows how to get set up with the system on DragonOS. Shortly after uploading that tutorial, he noticed that recompiling WhisperCPP on the local system significantly reduced the AI's processing time, making transcription near real time. In the second video Aaron briefly demonstrates the real time transcription.

DragonOS FocalX Capture and Transcribe IQ w/ SDR4space/WhisperCPP/Mosquitto (RTLSDR, OpenAI)

DragonOS FocalX Captured IQ to Text Faster w/ SDR4space/WhisperCPP/Mosquitto (RTLSDR)

In the past we posted a similar project that was based on the Amazon Transcribe cloud service. However, WhisperCPP runs on a local machine, is open source, and seems to be at least as good as Amazon Transcribe. So this appears to be a significant leap in transcribing ability, and we could see it being used to automatically create text logs and alerts based on various radio channels.