Access
Evaluate in the playground. Contact RinggAI for production access.
This Space provides product information for Ringg Parrot STT V1. The model weights, training code, and internal implementation are not open sourced.
- Playground access is available at ringg.ai.
- Model weights are not available for download from this Space.
- Production and commercial access requires RinggAI approval.
SDK and Integration
Integrate with voice-agent and real-time audio pipelines.
The Ringg SDK helps developers connect Ringg STT to application workflows. Ringg Parrot STT V1 works with the Pipecat toolkit using built-in VAD (voice activity detection) events.
- Python SDK is available through the ringglabs package on PyPI.
- Built for low-latency streaming speech recognition.
- Supports modern voice-agent orchestration patterns.
Benchmarks
WER comparison across ASR benchmark datasets.
WER stands for Word Error Rate. Lower values indicate better transcription accuracy. The lowest WER in each row is highlighted.
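For readers unfamiliar with the metric, WER is conventionally computed as (S + D + I) / N, where S, D, and I are the word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, and N is the number of reference words. The sketch below is a minimal, generic implementation using word-level edit distance. It is not RinggAI's evaluation code, and the function name `wer` and the whitespace tokenization are assumptions made for illustration.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate as a fraction: word-level edit distance / reference length.

    Assumption: words are separated by whitespace. The benchmark on this page
    may tokenize differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
score = wer("the cat sat on the mat", "the cat sit on mat")
print(f"{score * 100:.2f}%")
```

Multiplying the fraction by 100 gives a percentage, which appears to be the scale used in the tables on this page.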
Original WER
Lower is better

| Dataset | Ringg | ElevenLabs | Deepgram | Sarvam |
|---|---|---|---|---|
| indictts | **11.58** | 16.06 | 13.65 | 15.37 |
| commonvoice | **14.30** | 16.59 | 20.04 | 18.21 |
| fleurs | 15.20 | **11.99** | 17.14 | 16.00 |
| kathbath | **11.78** | 13.24 | 15.93 | 17.53 |
| kathbath_noisy | **13.09** | 13.14 | 17.44 | 16.19 |
| mucs | 14.55 | **11.69** | 21.97 | 16.72 |
| Overall WER | 13.79 | **13.00** | 19.23 | 16.72 |
Normalized WER
Lower is better

| Dataset | Ringg | ElevenLabs | Deepgram | Sarvam |
|---|---|---|---|---|
| indictts | **3.94** | 8.52 | 6.93 | 7.84 |
| commonvoice | **6.37** | 13.02 | 14.88 | 13.06 |
| fleurs | 9.73 | **7.67** | 11.35 | 9.54 |
| kathbath | **7.15** | 10.15 | 11.38 | 10.41 |
| kathbath_noisy | **8.37** | 10.01 | 12.98 | 11.78 |
| mucs | **6.28** | 6.75 | 12.07 | 7.58 |
| Overall WER | **7.27** | 8.94 | 12.36 | 9.76 |
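This page does not describe how the normalized scores were produced. In general, a normalized WER is computed after passing both the reference and the hypothesis through the same text normalizer, so that differences in casing and punctuation are not counted as word errors. The sketch below shows one simple normalizer of that kind. The function name `normalize_text` and the specific rules (Unicode NFKC, lowercasing, dropping punctuation, collapsing whitespace) are assumptions for illustration and may not match the rules used for the table above.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Illustrative normalizer applied before scoring (assumed rules, not RinggAI's)."""
    # Canonicalize Unicode forms and fold case (a no-op for Devanagari).
    text = unicodedata.normalize("NFKC", text).lower()
    # Drop every character in a Unicode punctuation category. This removes
    # ASCII punctuation and the Devanagari danda, but keeps vowel signs and
    # other combining marks, which are needed to spell Hindi words.
    kept = "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )
    # Collapse runs of whitespace into single spaces.
    return " ".join(kept.split())


print(normalize_text("Hello, World!"))
print(normalize_text("  नमस्ते।  How are YOU? "))
```

Filtering by Unicode category rather than with a `\w`-based regular expression is a deliberate choice here: Python's `re` module does not treat Devanagari combining marks as word characters, so a regex filter would split Hindi words.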
Features
- Hindi-English code-mixed speech recognition.
- Real-time streaming transcription.
- File-based transcription for common audio formats.
- Low-latency inference for voice products.
Supported Inputs
- Hindi, English, and code-mixed speech.
- Clear audio with minimal background noise.
- 16 kHz or higher sample rate recommended.
- WAV, MP3, FLAC, M4A, OGG, and OPUS.
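Before submitting a file, it can help to confirm that it meets the sample-rate recommendation listed above. The sketch below inspects a WAV file with Python's standard-library `wave` module. The function name `inspect_wav` and the returned dictionary keys are hypothetical, and the check covers uncompressed PCM WAV only; the other listed formats would need a third-party decoder.

```python
import wave

# The 16 kHz figure comes from the recommendation in the list above.
RECOMMENDED_MIN_RATE_HZ = 16_000


def inspect_wav(path: str) -> dict:
    """Report basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
            "meets_recommendation": rate >= RECOMMENDED_MIN_RATE_HZ,
        }
```

A file that reports `meets_recommendation` as `False` is not necessarily rejected by the service. The list above states a recommendation, not a hard limit.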
Use Cases
- Voice assistants and AI agents.
- Contact center transcription.
- Meeting and conversation intelligence.
- Voice search, subtitling, and accessibility workflows.
Limitations
- Accuracy may vary with noisy or low-quality audio.
- Overlapping speakers and dialect variation can affect quality.
- Very long files or unsupported encodings may require preprocessing.
- The hosted demo may differ from production deployment settings.
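As one example of the preprocessing mentioned above for very long files, a recording can be split into shorter segments before transcription. The sketch below does this for PCM WAV input using only the standard library. The function name `split_wav`, the output naming scheme, and the 30-second default are assumptions; this page does not state a maximum file length.

```python
import wave
from pathlib import Path


def split_wav(path: str, out_dir: str, chunk_seconds: float = 30.0) -> list[str]:
    """Split a PCM WAV file into fixed-length chunks (hypothetical helper).

    Returns the paths of the chunk files in order. The last chunk may be
    shorter than chunk_seconds.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: list[str] = []

    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        if frames_per_chunk <= 0:
            raise ValueError("chunk_seconds is too small for this sample rate")

        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            chunk_path = out / f"{Path(path).stem}_{index:04d}.wav"
            with wave.open(str(chunk_path), "wb") as dst:
                # Copy channels, sample width, and rate from the source; the
                # frame count in the header is corrected when the file closes.
                dst.setparams(params)
                dst.writeframes(frames)
            written.append(str(chunk_path))
            index += 1

    return written
```

Fixed-length cuts can fall in the middle of a word, which may hurt accuracy at chunk boundaries. Cutting at silences instead would avoid that at the cost of a more involved implementation.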
Benchmark Dataset
Released benchmark data and ASR transcriptions.
RinggAI has released the ASR Benchmarking Open-Source Dataset, which includes benchmark audio/data and transcriptions generated by Ringg, ElevenLabs, Deepgram, and Sarvam.
Privacy and Data Notice
Review deployment terms before using sensitive data.
Audio handling may depend on the selected deployment, integration, and commercial terms. Review RinggAI privacy terms and deployment documentation before using the service with sensitive, regulated, or personally identifiable data.