The Fort Worth Press - Modulate Launches Velma Transcribe: High-Performance Transcription For Real World Conversations at 90% Lower Cost

Modulate Launches Velma Transcribe: High-Performance Transcription For Real World Conversations at 90% Lower Cost

Modulate's ELM model architecture unlocks transcription for the masses, cutting costs by 10x while achieving industry-leading accuracy.

BOSTON, MA / ACCESS Newswire / March 18, 2026 / Modulate, the frontier conversational voice intelligence company, today announced Velma Transcribe, a speech-to-text API delivering high-accuracy, low-latency transcription at 90% lower cost per hour than other leading transcription providers. This significantly lower price point represents a fundamental shift in the economics of transcription. For a fraction of the cost, Modulate unlocks affordable speech-to-text transcription for every audio conversation in the world, empowering real-time voice agents, call center platforms, social apps, and more with industry-leading transcription tools at a global scale.

Built using Modulate's industry-leading Ensemble Listening Model (ELM) research, Velma Transcribe orchestrates an ensemble of specialized transcription models to improve accuracy, latency, and cost efficiency compared to any single model. In addition to the outstanding unit economics, Velma Transcribe achieves industry-leading results on widely used datasets, including Earnings-22 and the AMI Meeting Corpus. The result is a new standard for conversational audio transcription, combining strong accuracy on complex multi-speaker audio with dramatically improved unit economics for processing voice data at scale.

"Modulate is the world leader in using voice understanding AI, and our goal is to make the tools to understand audio available to anyone, at any scale," said Carter Huffman, CTO and Cofounder of Modulate. "Our full ensemble for conversation understanding, Velma, already outperforms LLMs in recognizing key behaviors, and now Velma Transcribe makes one of our core underlying capabilities available directly to developers who simply need accurate transcripts, not behavioral insights."

In addition, Velma Transcribe offers features built for enterprise use cases:

  • Emotion detection (20+ emotions)

  • Accent detection (20+ accents)

  • Multilingual (70+ languages)

  • PII redaction, diarization, streaming support, and more

Lower Transcription Costs by up to 10x

Velma Transcribe reduces transcription costs to approximately $0.03 per hour of audio, more than 90% lower than leading providers. These economics make it far more cost-effective for enterprise organizations to analyze and monetize their voice data.

  • $0.03 - Modulate Velma Transcribe

  • $0.40 - ElevenLabs Scribe v2

  • $0.31 - Deepgram Nova-3

  • $0.26 - Deepgram Nova-2

  • $0.21 - AssemblyAI Universal-3 Pro

*Prices are per hour of audio, based on publicly listed pricing as of March 18, 2026
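
As a rough check on the figures above, the listed per-hour prices can be compared directly. The sketch below uses only the prices quoted in this release; the function and variable names are illustrative and are not part of any Modulate API.

```python
# Per-hour list prices quoted in the release (USD per hour of audio).
PRICES_PER_HOUR = {
    "Modulate Velma Transcribe": 0.03,
    "ElevenLabs Scribe v2": 0.40,
    "Deepgram Nova-3": 0.31,
    "Deepgram Nova-2": 0.26,
    "AssemblyAI Universal-3 Pro": 0.21,
}

BASELINE = "Modulate Velma Transcribe"


def savings_vs(baseline_price: float, other_price: float) -> tuple[float, float]:
    """Return (percent lower, cost multiple) of the baseline relative to another price."""
    percent_lower = (1 - baseline_price / other_price) * 100
    multiple = other_price / baseline_price
    return percent_lower, multiple


def compare(prices: dict[str, float], baseline: str) -> dict[str, tuple[float, float]]:
    """Compare the baseline price against every other provider in the table."""
    base = prices[baseline]
    return {
        name: savings_vs(base, price)
        for name, price in prices.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, (pct, mult) in compare(PRICES_PER_HOUR, BASELINE).items():
        print(f"{name}: {pct:.1f}% lower, {mult:.1f}x")
```

On the listed prices, the reduction ranges from about 86% (roughly 7x, against AssemblyAI Universal-3 Pro) to 92.5% (roughly 13x, against ElevenLabs Scribe v2).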

Compare the leading speech-to-text transcription companies on cost and accuracy at Speechtxt.com.

Top Marks for Conversational Audio Accuracy at Scale

Velma Transcribe is engineered for real-world conversations that challenge traditional systems, including overlapping speakers, interruptions, accents, and background noise. On the AMI Meeting Corpus, a widely used benchmark for complex multi-speaker conversational audio, Velma Transcribe avoids over 40% of the errors made by ElevenLabs and over 70% of the errors made by OpenAI GPT-4o-transcribe.
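
The release does not say how "errors avoided" is computed. One common reading, assumed here, is the relative reduction in word error rate (WER) between two systems scored against the same reference transcripts. The sketch below shows that calculation; the function names are illustrative, and any numbers passed in are placeholders rather than benchmark results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


def errors_avoided(system_wer: float, other_wer: float) -> float:
    """Fraction of the other system's errors that the first system avoids."""
    if other_wer <= 0:
        raise ValueError("other_wer must be positive")
    return 1 - system_wer / other_wer
```

Under this reading, "avoids over 40% of the errors" would correspond to `errors_avoided(velma_wer, other_wer) > 0.40` when both systems are scored on the same AMI test audio.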

Huffman explains the top marks: "We've tuned Velma for conversational audio, including emotion and accent detection, leading to materially lower error rates on meeting and call data while delivering dramatic cost savings versus incumbent providers. That combination makes high-quality transcription practical at scale."

Built for Secure Enterprise Voice Production

Velma Transcribe includes all the capabilities developers expect and enterprise operations need, including:

  • Batch and streaming transcription endpoints with structured output and segment timestamps

  • Zero data stored, ensuring privacy-safe workflows

  • Sub-second streaming latency with partial transcripts for live applications and agent pipelines

  • Robust formatting optimized for conversational speech and long recordings

  • Broad language coverage across 70+ of the world's most commonly spoken languages

  • Personally Identifiable Information (PII) detection and redaction

  • Advanced transcription enrichments, including speaker diarization, emotion detection, and accent identification

Backed by Modulate's security practices and ISO 27001 certification, these capabilities allow developers to build secure, voice-enabled applications and help organizations extract insights from large volumes of conversational data.

Models that Listen and Understand

Velma Transcribe is part of Modulate's growing family of Velma 2.0 voice analytics models built to deliver a new, context-rich listening layer for AI systems. It represents the first step in Modulate's expanding developer API strategy, with additional capabilities planned across synthetic voice detection, emotion analysis, and deeper conversational intelligence. Together, these capabilities allow developers and enterprises to move beyond transcription to understand how conversations unfold, enabling applications such as fraud detection, customer sentiment analysis, compliance monitoring, and real-time decision support.

"The industry has spent years teaching AI how to generate and respond. The next frontier is teaching it how to listen," said Mike Pappas, CEO and Cofounder of Modulate. "Most systems today rely on transcription, reducing rich conversations to flat text and losing the signals humans naturally understand. Velma is the listening layer for AI, giving developers and enterprises the 'ears' needed to build voice-native applications that can capture the nuance and intent within spoken dialogue."

Availability and Pricing

Velma Transcribe is available today with batch and sub-second streaming transcription. Modulate pricing is usage-based and optimized for high-volume workloads: https://www.modulate.ai/pricing

About Modulate

Modulate is a voice intelligence company building AI models and APIs designed to understand real-world conversational audio at scale. Its technology combines speech recognition, acoustic analysis, and conversational context to deliver reliable, explainable, and cost-effective voice intelligence for developers and enterprises.

For more information or to get started, visit modulate.ai.

Media Contact

Megan Fasy
Grithaus Agency
(e) [email protected]
(m) +1 (617) 480-3674

###

SOURCE: Modulate



View the original press release on ACCESS Newswire

L.Rodriguez--TFWP