Model Comparison
Compare two models side by side to make informed decisions based on pricing, specifications, and performance.
Tool Invocation
A cost-efficient audio-capable model that accepts text, audio, and image inputs and can generate text and audio outputs.
Pricing
Input
$0.17/MTokens
Output
$0.66/MTokens
Input Audio
$11.00/MTokens
Output Audio
$22.00/MTokens
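As a rough illustration of how these rates turn into a per-request cost, here is a minimal sketch. It assumes that "MTokens" means one million tokens and that billing is linear in token count with no minimums or rounding. The function name, the dictionary keys, and the sample token counts are hypothetical and are not part of any provider API.

```python
# Prices in USD per million tokens, copied from the pricing list above.
# The category keys are illustrative names, not official identifiers.
PRICE_PER_MTOK = {
    "input": 0.17,
    "output": 0.66,
    "input_audio": 11.00,
    "output_audio": 22.00,
}


def estimate_cost(tokens: dict[str, int]) -> float:
    """Return the estimated USD cost for token counts keyed by price category.

    Assumes simple linear billing: price_per_million * tokens / 1,000,000.
    """
    return sum(
        PRICE_PER_MTOK[kind] * count / 1_000_000
        for kind, count in tokens.items()
    )


# Hypothetical request: 20,000 text tokens in, 1,000 text tokens out.
cost = estimate_cost({"input": 20_000, "output": 1_000})
print(f"${cost:.5f}")  # prints $0.00406
```

At these rates audio tokens dominate quickly: under the same assumptions, 1,000 output audio tokens cost about as much as 33,000 output text tokens.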
Specifications
Context
128,000
Maximum Output
16,384
Input
text, audio, image
Output
text, audio