The Fort Worth Press - Modulate Launches Velma Transcribe: High-Performance Transcription For Real World Conversations at 90% Lower Cost

Modulate's ELM model architecture unlocks transcription for the masses, cutting costs by 10x while achieving industry-leading accuracy.

BOSTON, MA / ACCESS Newswire / March 18, 2026 / Modulate, the frontier conversational voice intelligence company, today announced Velma Transcribe, a speech-to-text API delivering high-accuracy, low-latency transcription at 90% lower cost per hour than other leading transcription providers. This significantly lower price point represents a fundamental shift in the economics of transcription. For a fraction of the cost, Modulate unlocks affordable speech-to-text transcription for every audio conversation in the world, empowering real-time voice agents, call center platforms, social apps, and more with industry-leading transcription tools at a global scale.

Built using Modulate's industry-leading Ensemble Listening Model (ELM) research, Velma Transcribe orchestrates an ensemble of specialized transcription models to improve accuracy, latency, and cost efficiency compared to any single model. In addition to the outstanding unit economics, Velma Transcribe achieves industry-leading results on widely used datasets, including Earnings-22 and the AMI Meeting Corpus. The result is a new standard for conversational audio transcription, combining strong accuracy on complex multi-speaker audio with dramatically improved unit economics for processing voice data at scale.

"Modulate is the world leader in using voice understanding AI, and our goal is to make the tools to understand audio available to anyone, at any scale," said Carter Huffman, CTO and Cofounder of Modulate. "Our full ensemble for conversation understanding, Velma, already outperforms LLMs in recognizing key behaviors, and now Velma Transcribe makes one of our core underlying capabilities available directly to developers who simply need accurate transcripts, not behavioral insights."

In addition, Velma Transcribe offers features built for enterprise use cases:

  • Emotion detection (20+ emotions)

  • Accent detection (20+ accents)

  • Multilingual (70+ languages)

  • PII redaction, diarization, streaming support, and more

Lower Transcription Costs by up to 10x

Velma Transcribe reduces transcription costs to approximately $0.03 per hour of audio, more than 90% lower than leading providers. These economics make it far more cost-effective for enterprise organizations to analyze and monetize their voice data.

  • $0.03 - Modulate Velma Transcribe

  • $0.40 - ElevenLabs Scribe v2

  • $0.31 - Deepgram Nova-3

  • $0.26 - Deepgram Nova-2

  • $0.21 - AssemblyAI Universal-3 Pro

*Based on publicly listed pricing as of March 18, 2026
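The comparison above is simple arithmetic on the listed prices. The following is a minimal sketch of it, assuming every price in the list is quoted in US dollars per hour of audio (the release states that unit only for Velma Transcribe). The function name `savings_vs` and the dictionary are illustrative, not part of any Modulate API.

```python
# Prices copied from the list above. Assumption: all are USD per hour of audio.
PRICES_PER_HOUR = {
    "Modulate Velma Transcribe": 0.03,
    "ElevenLabs Scribe v2": 0.40,
    "Deepgram Nova-3": 0.31,
    "Deepgram Nova-2": 0.26,
    "AssemblyAI Universal-3 Pro": 0.21,
}

CANDIDATE = "Modulate Velma Transcribe"


def savings_vs(baseline: str, candidate: str = CANDIDATE) -> tuple[float, float]:
    """Return (percent saved, cost multiple) of `candidate` relative to `baseline`."""
    base = PRICES_PER_HOUR[baseline]
    cand = PRICES_PER_HOUR[candidate]
    percent_saved = (1 - cand / base) * 100
    cost_multiple = base / cand
    return percent_saved, cost_multiple


if __name__ == "__main__":
    for name in PRICES_PER_HOUR:
        if name == CANDIDATE:
            continue
        pct, multiple = savings_vs(name)
        print(f"{name}: {pct:.1f}% lower, {multiple:.1f}x cheaper")
```

On these listed figures the per-provider savings come out between roughly 86% and 93%, so the exact percentage depends on which provider is used as the baseline.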

Compare the leading speech-to-text transcription companies on cost and accuracy at Speechtxt.com.

Top Marks for Conversational Audio Accuracy at Scale

Velma Transcribe is engineered for real-world conversations that challenge traditional systems, including overlapping speakers, interruptions, accents, and background noise. On the AMI Meeting Corpus, a widely used benchmark for complex multi-speaker conversational audio, Velma Transcribe avoids over 40% of the errors made by ElevenLabs and over 70% of the errors made by OpenAI GPT-4o-transcribe.

Huffman explains the top marks: "We've tuned Velma for conversational audio, including emotion and accent detection, leading to materially lower error rates on meeting and call data while delivering dramatic cost savings versus incumbent providers. That combination makes high-quality transcription practical at scale."
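The AMI figures above are relative error reductions: avoiding over 40% of a baseline's errors means fewer than 60% of them remain. A small sketch of that arithmetic follows. The baseline count of 1,000 errors is hypothetical and chosen only for illustration; it is not a benchmark result, and the helper name is an assumption.

```python
def max_remaining_errors(baseline_errors: float, fraction_avoided: float) -> float:
    """Upper bound on errors left when at least `fraction_avoided`
    of the baseline system's errors are avoided."""
    if not 0.0 <= fraction_avoided <= 1.0:
        raise ValueError("fraction_avoided must be between 0 and 1")
    return baseline_errors * (1.0 - fraction_avoided)


if __name__ == "__main__":
    hypothetical_baseline = 1000  # illustrative count, not measured data
    for label, fraction in [("over 40% avoided", 0.40), ("over 70% avoided", 0.70)]:
        bound = max_remaining_errors(hypothetical_baseline, fraction)
        print(f"{label}: fewer than {bound:.0f} of {hypothetical_baseline} errors remain")
```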

Built for Secure Enterprise Voice Production

Velma Transcribe includes all the capabilities developers expect and enterprise operations need, including:

  • Batch and streaming transcription endpoints with structured output and segment timestamps

  • Zero data stored, ensuring privacy-safe workflows

  • Sub-second streaming latency with partial transcripts for live applications and agent pipelines

  • Robust formatting optimized for conversational speech and long recordings

  • Broad language coverage in 70 of the world's most commonly spoken languages

  • Personally Identifiable Information (PII) detection and redaction

  • Advanced transcription enrichments, including speaker diarization, emotion detection, and accent identification

Backed by Modulate's security practices and ISO 27001 certification, these capabilities allow developers to build secure, voice-enabled applications and help organizations extract insights from large volumes of conversational data.
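To illustrate how streaming output with partial transcripts and segment timestamps is typically consumed, here is a minimal sketch. The message shape (`Segment` with `segment_id`, `start`, `end`, `text`, `is_final`) is entirely hypothetical and is not taken from Modulate's API; consult the actual documentation for the real schema.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """Hypothetical streaming message; field names are assumptions."""
    segment_id: int
    start: float   # segment start time, in seconds
    end: float     # segment end time, in seconds
    text: str
    is_final: bool


def fold_stream(messages: Iterable[Segment]) -> list[Segment]:
    """Keep the latest message per segment, ordered by start time.

    Later partials replace earlier ones, but a partial that arrives
    after a segment has been finalized is ignored.
    """
    latest: dict[int, Segment] = {}
    for msg in messages:
        current = latest.get(msg.segment_id)
        if current is not None and current.is_final and not msg.is_final:
            continue
        latest[msg.segment_id] = msg
    return sorted(latest.values(), key=lambda s: s.start)


def render(segments: Iterable[Segment]) -> str:
    """Format segments as one timestamped line each."""
    return "\n".join(f"[{s.start:.2f}-{s.end:.2f}] {s.text}" for s in segments)


if __name__ == "__main__":
    stream = [
        Segment(0, 0.0, 0.8, "hello", False),
        Segment(0, 0.0, 1.2, "hello there", True),
        Segment(1, 1.3, 2.0, "how are", False),
    ]
    print(render(fold_stream(stream)))
```

A live application would show the partial text right away and swap in the final text once it arrives, which is why the fold keeps only one entry per segment.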

Models that Listen and Understand

Velma Transcribe is part of Modulate's growing family of Velma 2.0 voice analytics models built to deliver a new, context-rich listening layer for AI systems. It represents the first step in Modulate's expanding developer API strategy, with additional capabilities planned across synthetic voice detection, emotion analysis, and deeper conversational intelligence. Together, these capabilities allow developers and enterprises to move beyond transcription to understand how conversations unfold, enabling applications such as fraud detection, customer sentiment analysis, compliance monitoring, and real-time decision support.

"The industry has spent years teaching AI how to generate and respond. The next frontier is teaching it how to listen," said Mike Pappas, CEO and Cofounder of Modulate. "Most systems today rely on transcription, reducing rich conversations to flat text and losing the signals humans naturally understand. Velma is the listening layer for AI, giving developers and enterprises the 'ears' needed to build voice-native applications that can capture the nuance and intent within spoken dialogue."

Availability and Pricing

Velma Transcribe is available today with batch and sub-second streaming transcription. Modulate pricing is usage-based and optimized for high-volume workloads: https://www.modulate.ai/pricing

About Modulate

Modulate is a voice intelligence company building AI models and APIs designed to understand real-world conversational audio at scale. Its technology combines speech recognition, acoustic analysis, and conversational context to deliver reliable, explainable, and cost-effective voice intelligence for developers and enterprises.

For more information or to get started, visit modulate.ai.

Media Contact

Megan Fasy
Grithaus Agency
(e) [email protected]
(m) +1 (617) 480-3674

###

SOURCE: Modulate




L.Rodriguez--TFWP