Model Comparison
Compare two models side by side to make informed decisions based on pricing, specifications, and performance.
Tool Invocation
A premium multimodal model that combines thinking capabilities with advanced vision understanding. It accepts text, image, and video inputs and offers a 64K-token context window for reasoning over visual content.
Pricing
Input
¥3.00/MTokens
Output
¥9.00/MTokens
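As a rough illustration of how these rates translate into a per-request cost, the sketch below applies the listed input and output prices to a token count. It assumes "MTokens" means one million tokens and that billing is strictly linear in token count; `estimate_cost` is a hypothetical helper name, not part of any official SDK.

```python
from decimal import Decimal

# Rates from the pricing listed above, in CNY per million tokens.
INPUT_RATE_PER_MTOK = Decimal("3.00")
OUTPUT_RATE_PER_MTOK = Decimal("9.00")

# Assumption: one "MToken" is 1,000,000 tokens.
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated cost in CNY for a single request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_RATE_PER_MTOK
    output_cost = Decimal(output_tokens) * OUTPUT_RATE_PER_MTOK
    return (input_cost + output_cost) / TOKENS_PER_MTOK


if __name__ == "__main__":
    # 50,000 input tokens and 2,000 output tokens.
    print(estimate_cost(50_000, 2_000))
```

Under these assumptions, a request with 50,000 input tokens and 2,000 output tokens works out to ¥0.168. `Decimal` is used rather than `float` so that currency amounts are not affected by binary rounding.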
Specifications
Context
64,000 tokens
Maximum Output
16,384 tokens
Input
text, image, video
Output
text
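The context and maximum output figures can be combined to budget a prompt. The sketch below is one way to do that, under the assumption (not stated on this page) that the 64,000-token context window is shared between the prompt and the generated output; `max_prompt_tokens` is a hypothetical helper name.

```python
# Limits from the specifications listed above.
CONTEXT_LIMIT = 64_000
MAX_OUTPUT = 16_384


def max_prompt_tokens(requested_output: int) -> int:
    """Hypothetical helper: prompt tokens left after reserving room for output.

    Assumption: prompt and output tokens share the same context window.
    """
    if not 0 <= requested_output <= MAX_OUTPUT:
        raise ValueError(f"requested_output must be between 0 and {MAX_OUTPUT}")
    return CONTEXT_LIMIT - requested_output


if __name__ == "__main__":
    # Reserve the full output allowance and see how much prompt space remains.
    print(max_prompt_tokens(MAX_OUTPUT))
```

With the full 16,384-token output reserved, this leaves 47,616 tokens for the prompt. If the provider counts the context window separately from output, the subtraction does not apply.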