The Fort Worth Press - Modulate Launches Velma Transcribe: High-Performance Transcription For Real World Conversations at 90% Lower Cost


Modulate's ELM model architecture unlocks transcription for the masses, cutting costs by 10x while achieving industry-leading accuracy.

BOSTON, MA / ACCESS Newswire / March 18, 2026 / Modulate, the frontier conversational voice intelligence company, today announced Velma Transcribe, a speech-to-text API delivering high-accuracy, low-latency transcription at 90% lower cost per hour than other leading transcription providers. This significantly lower price point represents a fundamental shift in the economics of transcription. For a fraction of the cost, Modulate unlocks affordable speech-to-text transcription for every audio conversation in the world, empowering real-time voice agents, call center platforms, social apps, and more with industry-leading transcription tools at a global scale.

Built using Modulate's industry-leading Ensemble Listening Model (ELM) research, Velma Transcribe orchestrates an ensemble of specialized transcription models to improve accuracy, latency, and cost efficiency compared to any single model. In addition to the outstanding unit economics, Velma Transcribe achieves industry-leading results on widely used datasets, including Earnings-22 and the AMI Meeting Corpus. The result is a new standard for conversational audio transcription, combining strong accuracy on complex multi-speaker audio with dramatically improved unit economics for processing voice data at scale.

"Modulate is the world leader in using voice understanding AI, and our goal is to make the tools to understand audio available to anyone, at any scale," said Carter Huffman, CTO and Cofounder of Modulate. "Our full ensemble for conversation understanding, Velma, already outperforms LLMs in recognizing key behaviors, and now Velma Transcribe makes one of our core underlying capabilities available directly to developers who simply need accurate transcripts, not behavioral insights."

In addition, Velma Transcribe offers features built for enterprise use cases:

  • Emotion detection (20+ emotions)

  • Accent detection (20+ accents)

  • Multilingual (70+ languages)

  • PII redaction, diarization, streaming support, and more

Lower Transcription Costs by Up to 10x

Velma Transcribe reduces transcription costs to approximately $0.03 per hour of audio, more than 90% lower than leading providers. These economics make it far more cost-effective for enterprise organizations to analyze and monetize their voice data.

  • $0.03 - Modulate Velma Transcribe

  • $0.40 - ElevenLabs Scribe v2

  • $0.31 - Deepgram Nova-3

  • $0.26 - Deepgram Nova-2

  • $0.21 - AssemblyAI Universal-3 Pro

*Based on publicly listed pricing as of March 18, 2026
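
As a rough check on the figures above, the following Python sketch recomputes the savings from the listed prices. It assumes every price in the list is in US dollars per hour of audio, as the text states for Velma Transcribe; the names `PRICES_PER_HOUR`, `savings_vs`, and `compare` are illustrative and are not part of any Modulate API.

```python
# Prices quoted in the release (assumed: USD per hour of audio for every entry).
PRICES_PER_HOUR = {
    "Modulate Velma Transcribe": 0.03,
    "ElevenLabs Scribe v2": 0.40,
    "Deepgram Nova-3": 0.31,
    "Deepgram Nova-2": 0.26,
    "AssemblyAI Universal-3 Pro": 0.21,
}

BASELINE = "Modulate Velma Transcribe"


def savings_vs(baseline_price: float, other_price: float) -> tuple[float, float]:
    """Return (percent by which baseline is cheaper, price ratio other/baseline)."""
    percent_lower = (other_price - baseline_price) / other_price * 100
    ratio = other_price / baseline_price
    return percent_lower, ratio


def compare(prices: dict[str, float], baseline: str) -> dict[str, tuple[float, float]]:
    """Compare the baseline price against every other listed provider."""
    base = prices[baseline]
    return {
        name: savings_vs(base, price)
        for name, price in prices.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, (pct, ratio) in compare(PRICES_PER_HOUR, BASELINE).items():
        print(f"{name}: {pct:.1f}% lower, {ratio:.1f}x")
```

On the listed prices this works out to about 92.5% lower (13.3x) against ElevenLabs Scribe v2, 90.3% (10.3x) against Deepgram Nova-3, 88.5% (8.7x) against Deepgram Nova-2, and 85.7% (7.0x) against AssemblyAI Universal-3 Pro.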

Compare the leading speech-to-text transcription companies on cost and accuracy at Speechtxt.com.

Top Marks for Conversational Audio Accuracy at Scale

Velma Transcribe is engineered for real-world conversations that challenge traditional systems, including overlapping speakers, interruptions, accents, and background noise. On the AMI Meeting Corpus, a widely used benchmark for complex multi-speaker conversational audio, Velma avoids over 40% of the errors made by ElevenLabs and over 70% of the errors made by OpenAI GPT-4o-transcribe.

Huffman explains the top marks: "We've tuned Velma for conversational audio, including emotion and accent detection, leading to materially lower error rates on meeting and call data while delivering dramatic cost savings versus incumbent providers. That combination makes high-quality transcription practical at scale."

Built for Secure Enterprise Voice Production

Velma Transcribe includes all the capabilities developers expect and enterprise operations need, including:

  • Batch and streaming transcription endpoints with structured output and segment timestamps

  • Zero data stored, ensuring privacy-safe workflows

  • Sub-second streaming latency with partial transcripts for live applications and agent pipelines

  • Robust formatting optimized for conversational speech and long recordings

  • Broad language coverage across 70 of the world's most commonly spoken languages

  • Personally Identifiable Information (PII) detection and redaction

  • Advanced transcription enrichments, including speaker diarization, emotion detection, and accent identification

Backed by Modulate's security practices and ISO 27001 certification, these capabilities allow developers to build secure, voice-enabled applications and help organizations extract insights from large volumes of conversational data.

Models that Listen and Understand

Velma Transcribe is part of Modulate's growing family of Velma 2.0 voice analytics models built to deliver a new, context-rich listening layer for AI systems. It represents the first step in Modulate's expanding developer API strategy, with additional capabilities planned across synthetic voice detection, emotion analysis, and deeper conversational intelligence. Together, these capabilities allow developers and enterprises to move beyond transcription to understand how conversations unfold, enabling applications such as fraud detection, customer sentiment analysis, compliance monitoring, and real-time decision support.

"The industry has spent years teaching AI how to generate and respond. The next frontier is teaching it how to listen," said Mike Pappas, CEO and Cofounder of Modulate. "Most systems today rely on transcription, reducing rich conversations to flat text and losing the signals humans naturally understand. Velma is the listening layer for AI, giving developers and enterprises the 'ears' needed to build voice-native applications that can capture the nuance and intent within spoken dialogue."

Availability and Pricing

Velma Transcribe is available today with batch and sub-second streaming transcription. Modulate pricing is usage-based and optimized for high-volume workloads: https://www.modulate.ai/pricing

About Modulate

Modulate is a voice intelligence company building AI models and APIs designed to understand real-world conversational audio at scale. Its technology combines speech recognition, acoustic analysis, and conversational context to deliver reliable, explainable, and cost-effective voice intelligence for developers and enterprises.

For more information or to get started, visit modulate.ai.

Media Contact

Megan Fasy
Grithaus Agency
(e) [email protected]
(m) +1 (617) 480-3674

###

SOURCE: Modulate



View the original press release on ACCESS Newswire

L.Rodriguez--TFWP