267 models
$0.45/$2.70/M
ctx1.0Mmax64Kavailtps
InOutCap

Preview of Google's next-generation Gemini 3 Flash model, optimized for speed and combining frontier intelligence with superior search and grounding capabilities.
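
Each listing on this page carries a price line and a context line. The sketch below shows one way to read them and estimate the cost of a request. It rests on assumptions the page does not spell out: that a price such as `$0.45/$2.70/M` means input price / output price per million tokens, that `ctx` is the context window and `max` the output limit, and that `K` and `M` are decimal multipliers. The names `Listing`, `parse_listing`, and `estimate_cost` are hypothetical helpers, not part of any API.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: "K" and "M" are decimal multipliers (1,000 and 1,000,000).
_UNITS = {"K": 1_000, "M": 1_000_000}


@dataclass
class Listing:
    currency: str                     # "$" or "¥", as printed in the listing
    input_per_m: float                # assumed: price per 1M input tokens
    output_per_m: float               # assumed: price per 1M output tokens
    context_tokens: int               # value after "ctx"
    max_output_tokens: Optional[int]  # value after "max"; None when blank


def _tokens(text: str) -> int:
    """Convert a size such as '64K' or '1.0M' to a token count."""
    return round(float(text[:-1]) * _UNITS[text[-1]])


def parse_listing(price_line: str, ctx_line: str) -> Listing:
    """Parse one price line and one context line from a listing."""
    price = re.fullmatch(r"([$¥])([\d.]+)/[$¥]([\d.]+)/M", price_line)
    ctx = re.fullmatch(r"ctx([\d.]+[KM])max([\d.]+[KM])?availtps", ctx_line)
    if price is None or ctx is None:
        raise ValueError("line does not match the assumed listing format")
    return Listing(
        currency=price.group(1),
        input_per_m=float(price.group(2)),
        output_per_m=float(price.group(3)),
        context_tokens=_tokens(ctx.group(1)),
        max_output_tokens=_tokens(ctx.group(2)) if ctx.group(2) else None,
    )


def estimate_cost(listing: Listing, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request, in the listing's own currency."""
    total = input_tokens * listing.input_per_m + output_tokens * listing.output_per_m
    return total / 1_000_000


listing = parse_listing("$0.45/$2.70/M", "ctx1.0Mmax64Kavailtps")
cost = estimate_cost(listing, input_tokens=200_000, output_tokens=2_000)
print(f"{listing.currency}{cost:.4f}")
```

Under this sketch, lines without a full price pair (for example `Free/Free`, `$2.20//M`, or a bare `/`) raise `ValueError` rather than being guessed at, and prices in ¥ and $ are kept in their own currency instead of being converted.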

$0.35/$2.10/M
ctx1.0Mmax64Kavailtps
InOutCap

Vertex AI version of Gemini 3 Flash Preview, optimized for speed and combining frontier intelligence with superior search and grounding capabilities.

$1.93/$15.40/M
ctx400Kmax128Kavailtps

GPT-5.2 is OpenAI's best model for coding and agentic tasks across industries.

$1.40/$8.40/M
ctx1.0Mmax64Kavailtps
InOutCap

Gemini 3 Pro Preview on Vertex AI with extended thinking mode for complex reasoning tasks.

$0.25/$2.00/M
ctx400Kmax128Kavailtps
InOutCap

GPT-5.1-Codex-Max is a version of GPT-5.1-Codex with enhanced capabilities for agentic coding tasks.

$1.80/$10.80/M
ctx1.0Mmax64Kavailtps

Preview of Google's next-generation Gemini 3 Pro model with enhanced reasoning and multimodal capabilities.

$1.80/$10.80/M
ctx1.0Mmax64Kavailtps
InOutCap

Preview of Gemini 3 Pro on Vertex AI with native image generation capabilities alongside text understanding.

$1.38/$11.00/M
ctx400Kmax128Kavailtps

GPT-5.1-Codex is a version of GPT-5 optimized for agentic coding tasks in Codex or similar environments. It is available in the Responses API only, and the underlying model snapshot will be regularly updated. To learn more about prompting GPT-5-Codex, refer to the dedicated guide.

$1.38/$11.00/M
ctx400Kmax128Kavailtps

GPT-5.1 is OpenAI's best model for coding and agentic tasks across industries.

$1.38/$11.00/M
ctx400Kmax128Kavailtps
InOutCap

GPT-5.1 is OpenAI's best model for coding and agentic tasks across industries.

$5.50/$27.50/M
ctx200Kmax64Kavailtps
InOutCap

November 2025 snapshot of Claude Opus 4.5, Anthropic's latest and most advanced model.

$1.10/$5.50/M
ctx200Kmax64Kavailtps
InOutCap

The fastest model in the Claude 4.5 family. Offers quick responses with strong performance for everyday tasks.

$1.10/$5.50/M
ctx200Kmax64Kavailtps
InOutCap

Snapshot of Claude Haiku 4.5 from October 1, 2025. Fast model for everyday tasks.

$0.62/$1.85/M
ctx160Kmaxavailtps
InOutCap

DeepSeek-V3.1-Terminus is an updated version of DeepSeek-V3.1 with enhanced language consistency, reduced mixed Chinese-English text, and optimized Code Agent and Search Agent performance.

$0.66/$2.75/M
ctx256Kmaxavailtps
InOutCap

Kimi K2 0905 is an updated version of Kimi K2, a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Kimi K2 0905 has improved coding abilities, improved agentic tool use, and a longer (262K) context window.

$0.62/$1.85/M
ctx160Kmaxavailtps
InOutCap

DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long-context extension approach, following the methodology outlined in the original DeepSeek-V3 report. We have expanded our dataset by collecting additional long documents and substantially extending both training phases. The 32K extension phase has been increased 10-fold to 630B tokens, while the 128K extension phase has been extended by 3.3x to 209B tokens. Additionally, DeepSeek-V3.1 is trained using the UE8M0 FP8 scale data format to ensure compatibility with microscaling data formats.
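
The scaling figures in this description can be sanity-checked with a little arithmetic. The sketch below derives the token budgets implied for the earlier extension phases from the stated multipliers. It uses only the numbers quoted above; the derived baselines are not stated in the description and are approximate, since the multipliers themselves are rounded. The helper name `implied_baseline` is hypothetical.

```python
# Stated figures: (new token budget, growth factor) for each extension phase.
PHASES = {
    "32K": (630e9, 10.0),
    "128K": (209e9, 3.3),
}


def implied_baseline(new_tokens: float, factor: float) -> float:
    """Token budget before scaling, derived as new_tokens / factor."""
    return new_tokens / factor


for name, (new_tokens, factor) in PHASES.items():
    before = implied_baseline(new_tokens, factor)
    print(f"{name} phase: about {before / 1e9:.1f}B tokens before, "
          f"{new_tokens / 1e9:.0f}B after")

# Combined long-context extension budget across both phases.
total_tokens = sum(new_tokens for new_tokens, _ in PHASES.values())
print(f"total: {total_tokens / 1e9:.0f}B tokens")
```

Both phases work out to a similar implied starting budget of roughly 63B tokens, and the two extended phases together account for 839B tokens.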

$0.17/$0.66/M
ctx128Kmaxavailtps
InOutCap

Welcome to the gpt-oss series, OpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. gpt-oss-120b is intended for production, general-purpose, high-reasoning use cases and fits on a single H100 GPU.

$0.08/$0.33/M
ctx128Kmaxavailtps
InOutCap

Welcome to the gpt-oss series, OpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. gpt-oss-20b is intended for lower-latency, local, or specialized use cases.

$0.24/$0.97/M
ctx256Kmaxavailtps
InOutCap

Latest Qwen3 thinking model, competitive with the best closed-source models as of July 2025.

$0.50/$1.98/M
ctx256Kmaxavailtps
InOutCap

Qwen3's most agentic code model to date.

$0.24/$0.97/M
ctx256Kmaxavailtps
InOutCap

Updated FP8 version of Qwen3-235B-A22B in non-thinking mode, with better tool use, coding, instruction following, logical reasoning, and text comprehension.

$0.66/$2.75/M
ctx128Kmaxavailtps
InOutCap

Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.

$0.61/$2.41/M
ctx128Kmaxavailtps
InOutCap

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.
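
The total and active parameter counts quoted above determine how much of a mixture-of-experts model runs for each token. The sketch below computes that share for the two GLM-4.5 variants, using only the figures in the description. The helper name `active_fraction` is hypothetical, and the ratio is a rough indicator of per-token compute, not a measured benchmark.

```python
# Parameter counts in billions, as quoted in the description: (total, active).
MODELS = {
    "GLM-4.5": (355, 32),
    "GLM-4.5-Air": (106, 12),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters activated per token in an MoE model."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active_b / total_b


for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {active_b}B of {total_b}B parameters active ({share:.1%})")
```

By this measure GLM-4.5 activates about 9% of its parameters per token and GLM-4.5-Air about 11%, so the smaller variant is less sparse even though it activates fewer parameters in absolute terms.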

$1.49/$5.94/M
ctx160Kmaxavailtps
InOutCap

Updated 05/28 checkpoint of DeepSeek R1. Its overall performance is now approaching that of leading models such as o3 and Gemini 2.5 Pro. Compared to the previous version, the upgraded model shows significant improvements in handling complex reasoning tasks, and it also offers a reduced hallucination rate, enhanced support for function calling, and a better experience for vibe coding.

$0.24/$0.97/M
ctx128Kmaxavailtps
InOutCap

Latest state-of-the-art Qwen3 model, with 235B total parameters and 22B active parameters.

$0.24/$0.97/M
ctx1.0Mmaxavailtps
InOutCap

The Llama 4 collection of models are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.

$0.99/$0.99/M
ctx125Kmaxavailtps
InOutCap

Qwen2.5-VL is a multimodal large language model series developed by the Qwen team at Alibaba Cloud, available in 3B, 7B, 32B, and 72B sizes.

$0.99/$0.99/M
ctx160Kmaxavailtps
InOutCap

An updated checkpoint of DeepSeek's strong Mixture-of-Experts (MoE) language model, with 671B total parameters, of which 37B are activated for each token.

$0.22/$0.22/M
ctx128Kmaxavailtps
InOutCap

The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes. The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.

$0.99/$0.99/M
ctx128Kmaxavailtps
InOutCap

Llama 3.3 70B Instruct is the December update of Llama 3.1 70B. The model improves upon Llama 3.1 70B (released July 2024) with advances in tool calling, multilingual text support, math, and coding. It achieves industry-leading results in reasoning, math, and instruction following, and provides performance similar to Llama 3.1 405B with significant speed and cost improvements.

$0.17/$0.66/M
ctx1.0Mmaxavailtps
InOutCap

The Llama 4 collection of models are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.

$0.99/$0.99/M
ctx128Kmaxavailtps
InOutCap

The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes. The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.

$0.22/$1.65/M
ctx256Kmax10Kavailtps
InOutCap

xAI's code-focused model optimized for fast responses in programming tasks.

$0.22/$1.65/M
ctx256Kmaxavailtps
InOutCap

xAI's code-focused model optimized for fast responses in programming tasks.

$0.22/$1.65/M
ctx256Kmaxavailtps
InOutCap

August 2025 snapshot of Grok Code Fast for programming tasks.

$0.22/$0.55/M
ctx2.0Mmax30Kavailtps
InOutCap

Grok 4 Fast with reasoning mode and a 2M-token context, using 40% fewer thinking tokens than Grok 4.

$0.22/$0.55/M
ctx2.0Mmax30Kavailtps
InOutCap

Faster version of Grok 4 with a 2M-token context and 40% fewer thinking tokens, designed for enterprise customers.

$0.22/$0.55/M
ctx2.0Mmaxavailtps
InOutCap

Latest Grok 4 Fast with reasoning mode and 2M token context.

$0.22/$0.55/M
ctx2.0Mmax30Kavailtps
InOutCap

Grok 4 Fast configured for direct responses without extended thinking mode.

$0.22/$0.55/M
ctx2.0Mmaxavailtps
InOutCap

Latest Grok 4 Fast configured for direct responses without extended thinking mode.

¥2.20/¥8.80/M
ctx205Kmax205Kavailtps
InOutCap

Latest GLM model from Zhipu AI with improved reasoning and generation capabilities.

$16.50/$132.00/M
ctx400Kmax272Kavailtps
InOutCap

GPT-5 pro uses more compute to think harder and provide consistently better answers.

$16.50/$132.00/M
ctx400Kmax272Kavailtps
InOutCap

GPT-5 pro uses more compute to think harder and provide consistently better answers.

$0.66/$2.64/M
ctx128Kmax16Kavailtps
InOut

A cost-efficient version of GPT Audio. It accepts audio inputs and outputs, and can be used in the Chat Completions REST API.

$0.66/$2.64/M
ctx128Kmax16Kavailtps
InOut

A cost-efficient version of GPT Audio. It accepts audio inputs and outputs, and can be used in the Chat Completions REST API.

$2.75/$11.00/M
ctx128Kmax16Kavailtps
InOut

The gpt-audio model is our first generally available audio model. It accepts audio inputs and outputs, and can be used in the Chat Completions REST API.

$2.75/$11.00/M
ctx128Kmax16Kavailtps
InOut

The gpt-audio model is our first generally available audio model. It accepts audio inputs and outputs, and can be used in the Chat Completions REST API.

$0.66/$2.64/M
ctx32Kmax4Kavailtps
InOut

A cost-efficient version of GPT Realtime - capable of responding to audio and text inputs in realtime over WebRTC, WebSocket, or SIP connections.

$0.66/$2.64/M
ctx32Kmax4Kavailtps
InOut

A cost-efficient version of GPT Realtime - capable of responding to audio and text inputs in realtime over WebRTC, WebSocket, or SIP connections.

$4.40/$17.60/M
ctx32Kmax4Kavailtps
InOut

This is our first general-availability realtime model, capable of responding to audio and text inputs in realtime over WebRTC, WebSocket, or SIP connections.

$4.40/$17.60/M
ctx32Kmax4Kavailtps
InOut

This is our first general-availability realtime model, capable of responding to audio and text inputs in realtime over WebRTC, WebSocket, or SIP connections.

$2.20//M
ctx128Kmaxavailtps
InOut

Cost-efficient version of GPT Image 1 for fast image generation and editing tasks.

$3.30/$16.50/M
ctx200Kmax64Kavailtps
InOutCap

Anthropic's most intelligent model yet. Excels at coding, analysis, and complex reasoning with state-of-the-art performance.

$3.30/$16.50/M
ctx200Kmax64Kavailtps
InOutCap

Snapshot of Claude Sonnet 4.5 from September 29, 2025. Most intelligent model with exceptional coding and reasoning.

$22.00/$88.00/M
ctx200Kmax100Kavailtps
InOutCap

Snapshot of o3-pro from June 10, 2025. Our most capable reasoning model for the hardest problems.

$22.00/$88.00/M
ctx200Kmax100Kavailtps
InOutCap

Our most capable reasoning model, using more compute for the best possible answers on the hardest problems.

$0.25/$2.00/M
ctx400Kmax128Kavailtps

GPT-5 Codex delivers the flagship GPT-5 experience through the dedicated Codex API surface, supporting the same multimodal capabilities with a specialized routing endpoint.

$0.21/$21.00/M
ctx128Kmaxavailtps
InOutCap

Gemini 2.5 Flash on Vertex AI with native image generation capabilities alongside text and image understanding.

$0.07/$0.28/M
ctx128Kmaxavailtps
InOutCap

Gemini 2.5 Flash Lite on Vertex AI configured for direct responses without extended thinking.

$0.07/$0.28/M
ctx128Kmaxavailtps
InOutCap

Gemini 2.5 Flash Lite on Vertex AI with extended thinking mode for complex reasoning tasks.

$0.06/$0.22/M
ctx128Kmaxavailtps
InOutCap

OpenAI's open-weight 20B model for lower latency and local use cases, hosted on TogetherAI.

¥0.80/¥8.00/M
ctx128Kmaxavailtps
InOutCap

ByteDance's multimodal model with vision capabilities for text, image, and video understanding.

¥4.00/¥12.00/M
ctx128Kmaxavailtps
InOutCap

Latest version of DeepSeek V3.1 hosted on ByteDance infrastructure, offering state-of-the-art performance with 128K context. Enhanced for coding, reasoning, and general-purpose AI tasks.

$0.17/$0.66/M
ctx128Kmaxavailtps
InOutCap

OpenAI's open-weight 120B model for production and high reasoning use cases, hosted on TogetherAI.

$0.66/$1.87/M
ctx128Kmaxavailtps
InOutCap

DeepSeek V3.1 hybrid model combining V3 and R1 capabilities with 128K context, hosted on TogetherAI.

$0.21/$1.75/M
ctx128Kmaxavailtps
InOutCap

Gemini 2.5 Flash on Vertex AI with extended thinking mode for complex reasoning tasks.

$0.21/$1.75/M
ctx128Kmaxavailtps
InOutCap

Gemini 2.5 Flash on Vertex AI configured for direct responses without extended thinking.

$0.88/$7.00/M
ctx128Kmaxavailtps
InOutCap

Google's most intelligent AI model on Vertex AI with adaptive thinking and multimodal capabilities.

$0.11/$0.42/M
ctx128Kmaxavailtps
InOutCap

Gemini 2.0 Flash on Vertex AI for fast and efficient multimodal tasks with text, image, audio and video support.

$0.05/$0.21/M
ctx128Kmaxavailtps
InOutCap

A lightweight and fast version of Gemini 2.0 Flash optimized for cost-effective multimodal tasks on Google's Vertex AI platform.

¥2.20/¥6.60/M
ctx64Kmax16Kavailtps
InOutCap

Zhipu AI's multimodal model with vision capabilities. Processes text and images for analysis tasks.

Free/Free
ctx131Kmax98Kavailtps
InOutCap

Fast, cost-efficient version of GLM-4.5. Optimized for high-throughput applications.

¥4.40/¥13.20/M
ctx128Kmaxavailtps
InOutCap

Zhipu AI's GLM-4.5 AirX variant optimized for high-speed inference.

¥0.88/¥2.20/M
ctx131Kmax98Kavailtps
InOutCap

Zhipu AI's lightweight GLM-4.5 variant for cost-effective tasks.

¥8.80/¥17.60/M
ctx128Kmaxavailtps
InOutCap

Zhipu AI's GLM-4.5 X variant with enhanced performance.

¥2.20/¥8.80/M
ctx131Kmax98Kavailtps
InOutCap

Zhipu AI's flagship Chinese-English bilingual model. Strong at complex reasoning and generation tasks.

$1.38/$11.00/M
ctx400Kmax128Kavailtps
InOutCap

GPT-5 Chat points to the GPT-5 snapshot currently used in ChatGPT. GPT-5 is our next-generation, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs.

$0.06/$0.44/M
ctx400Kmax128Kavailtps
InOutCap

GPT-5 Nano is our fastest, cheapest version of GPT-5. It's great for summarization and classification tasks.

$0.06/$0.44/M
ctx400Kmax128Kavailtps
InOutCap

GPT-5 Nano is our fastest, cheapest version of GPT-5. It's great for summarization and classification tasks.

$0.28/$2.20/M
ctx400Kmax128Kavailtps
InOutCap

GPT-5 mini is a faster, more cost-efficient version of GPT-5. It's great for well-defined tasks and precise prompts.

$0.28/$2.20/M
ctx400Kmax128Kavailtps
InOutCap

GPT-5 mini is a faster, more cost-efficient version of GPT-5. It's great for well-defined tasks and precise prompts.

$1.38/$11.00/M
ctx400Kmax128Kavailtps
InOutCap

GPT-5 is OpenAI's flagship model for coding, reasoning, and agentic tasks across domains.

$1.38/$11.00/M
ctx400Kmax128Kavailtps
InOutCap

GPT-5 is our flagship model for coding, reasoning, and agentic tasks across domains.

$0.09/$0.36/M
ctx1.0Mmax66Kavailtps
InOutCap

A lightweight version of Gemini 2.5 Flash optimized for speed and cost efficiency with 1M token context support.

$16.50/$82.50/M
ctx200Kmax32Kavailtps
InOutCap

August 2025 snapshot of Claude Opus 4.1, Anthropic's most capable model for complex tasks.

$16.50/$82.50/M
ctx200Kmax32Kavailtps
InOutCap

Anthropic's Claude Opus 4.1, an enhanced version of Opus 4 for highly complex tasks.

¥0.15/¥1.50/M
ctx256Kmax32Kavailtps
InOutCap

A high-speed version of Doubao Seed 1.6 optimized for fast inference with multimodal support. Supports 256K context with excellent performance on text, image, and video understanding tasks.

¥0.80/¥8.00/M
ctx256Kmax32Kavailtps
InOutCap

A reasoning-enhanced version of Doubao Seed 1.6 with extended thinking capabilities for complex problem-solving. Features 256K context window and advanced multimodal understanding.

$3.30/$16.50/M
ctx256Kmax128Kavailtps
InOutCap

xAI's most intelligent model with native tool use and real-time search integration. Claimed to exceed PhD-level performance on academic questions.

$3.30/$16.50/M
ctx256Kmax128Kavailtps
InOutCap

July 9, 2025 snapshot of Grok 4, xAI's most intelligent model.

$3.30/$16.50/M
ctx256Kmax128Kavailtps

xAI's most intelligent model with native tool use and real-time search. Frontier-level performance on reasoning benchmarks.

¥0.15/¥1.50/M
ctx256Kmax16Kavailtps
InOutCap

An earlier fast-inference variant of Doubao Seed 1.6 with 256K context support. Optimized for speed while maintaining strong multimodal capabilities across text, image, and video inputs.

¥0.80/¥8.00/M
ctx256Kmax16Kavailtps
InOutCap

Previous version of the thinking-enabled Doubao Seed 1.6 model designed for enhanced reasoning and complex cognitive tasks. Supports 256K context with multimodal input processing.

¥0.80/¥2.00/M
ctx256Kmax16Kavailtps
InOutCap

Standard version of Doubao Seed 1.6 providing balanced performance for general-purpose tasks. Features 256K context window and supports text, image, and video understanding.

$0.27/$2.25/M
ctx1.0Mmax66Kavailtps
InOutCap

Google's most efficient workhorse model, designed for speed and low cost. Improved across key benchmarks for reasoning, multimodality, code, and long context while being 20-30% more efficient.

$1.13/$9.00/M
ctx1.0Mmax66Kavailtps
InOutCap

Google's most intelligent AI model with adaptive thinking capabilities. Among the world's best models for coding and tasks requiring advanced reasoning.

¥2.00/¥8.00/M
ctx128Kmax16Kavailtps
InOutCap

DeepSeek's reasoning-focused model hosted on ByteDance infrastructure, optimized for complex problem-solving and logical reasoning tasks. Supports 128K context with strong analytical capabilities.

¥0.80/¥2.00/M
ctx128Kmax16Kavailtps
InOutCap

Professional-grade Doubao 1.5 model with 128K context window, delivering high-quality text generation and understanding. Optimized for production workloads requiring reliable and accurate responses.

¥1.50/¥4.50/M
ctx64Kmax16Kavailtps
InOutCap

Lightweight vision-language model from the Doubao 1.5 series, balancing efficiency with multimodal understanding. Supports text and image inputs with 64K context for cost-effective visual tasks.

¥3.00/¥9.00/M
ctx64Kmax16Kavailtps
InOutCap

Advanced vision-language model with enhanced image understanding and analysis capabilities. Features 64K context window and excels at complex visual reasoning and multimodal tasks.

¥3.00/¥9.00/M
ctx64Kmax16Kavailtps
InOutCap

Premium multimodal model combining thinking capabilities with advanced vision understanding. Supports text, image, and video inputs with 64K context for sophisticated reasoning over visual content.

¥4.00/¥16.00/M
ctx64Kmax16Kavailtps
InOutCap

Professional thinking-enhanced model designed for complex reasoning and analytical tasks. Supports 64K context with text and image inputs, excelling at multi-step problem solving.

$3.30/$16.50/M
ctx200Kmax64Kavailtps
InOutCap

High-performance model balancing intelligence and speed. Excellent for complex tasks requiring nuanced understanding.

$3.30/$16.50/M
ctx200Kmax64Kavailtps
InOutCap

Snapshot of Claude Sonnet 4 from May 14, 2025. High-performance model for complex tasks.

$16.50/$82.50/M
ctx200Kmax32Kavailtps
InOutCap

Anthropic's most capable model for highly complex tasks. Excels at open-ended analysis, multi-step reasoning, and research.

$16.50/$82.50/M
ctx200Kmax32Kavailtps
InOutCap

Snapshot of Claude Opus 4 from May 14, 2025. Most capable model for highly complex tasks.

$0.22/$0.66/M
ctx128Kmax33Kavailtps
InOut

Qwen3 235B model with 22B active parameters optimized for throughput, hosted on TogetherAI.

$5.50//M
ctx128Kmaxavailtps
InOut

Advanced image generation model with multimodal input support for editing and creating images.

$1.21/$4.84/M
ctx200Kmax100Kavailtps
InOutCap

A smaller model optimized for fast, cost-efficient reasoning. Achieves remarkable performance for its size, particularly in math, coding, and visual tasks.

¥4.00/¥16.00/M
ctx128Kmax16Kavailtps
InOutCap

Advanced thinking model from the Doubao 1.5 series with enhanced reasoning capabilities for complex analytical tasks. Features 128K context window and excels at multi-step logical reasoning.

$2.20/$8.80/M
ctx200Kmax100Kavailtps
InOutCap

Our smartest reasoning model, trained to think for longer before responding. Excels at programming, business/consulting, and creative ideation with breakthrough performance on complex tasks.

$2.20/$8.80/M
ctx200Kmax100Kavailtps
InOutCap

Snapshot of o3 from April 16, 2025. Our smartest reasoning model with breakthrough performance on complex tasks.

$1.21/$4.84/M
ctx200Kmax100Kavailtps
InOutCap

Snapshot of o4-mini from April 16, 2025. Fast, cost-efficient reasoning model excelling at math, coding, and visual tasks.

$2.20/$8.80/M
ctx1.0Mmax33Kavailtps
InOutCap

Snapshot of GPT-4.1 from April 14, 2025, providing enhanced instruction following and multimodal capabilities.

$0.44/$1.76/M
ctx1.0Mmax33Kavailtps
InOutCap

GPT-4.1 Mini is a cost-efficient, faster version of GPT-4.1 optimized for everyday tasks and quick responses.

$0.44/$1.76/M
ctx1.0Mmax33Kavailtps
InOutCap

Snapshot of GPT-4.1 Mini from April 14, 2025, optimized for cost-efficient everyday tasks.

$0.11/$0.44/M
ctx1.0Mmax33Kavailtps
InOutCap

GPT-4.1 Nano is the smallest, fastest, and most affordable version of GPT-4.1 for simple classification and lightweight tasks.

$0.11/$0.44/M
ctx1.0Mmax33Kavailtps
InOutCap

Snapshot of GPT-4.1 Nano from April 14, 2025, optimized for fast, lightweight tasks.

$16.50//M
ctx128Kmaxavailtps
InOut

Real-time text-to-speech model optimized for speed with natural-sounding voice synthesis.

$33.00//M
ctx128Kmaxavailtps
InOut

High-definition text-to-speech model providing superior audio quality and clarity.

$16.50//M
ctx128Kmaxavailtps
InOut

November 2023 snapshot of TTS-1 for real-time text-to-speech conversion.

$33.00//M
ctx128Kmaxavailtps
InOut

November 2023 snapshot of TTS-1 HD for high-quality audio synthesis.

$0.17/$0.66/M
ctx128Kmax16Kavailtps
InOutCap

Preview version of GPT-4o-mini enhanced with integrated web search capabilities for real-time information retrieval.

$0.17/$0.66/M
ctx128Kmax16Kavailtps
InOutCap

Latest preview of GPT-4o-mini with integrated web search capabilities for accessing current information.

$2.75/$11.00/M
ctx128Kmax16Kavailtps
InOutCap

Preview version of GPT-4o with integrated web search for enhanced real-time knowledge and information access.

$2.75/$11.00/M
ctx128Kmax16Kavailtps
InOutCap

Latest preview of GPT-4o enhanced with web search capabilities for accessing up-to-date information.

$3.30/$16.50/M
ctx200Kmax4Kavailtps
InOutCap

Snapshot of Claude 3.7 Sonnet from February 19, 2025. Enhanced Sonnet with extended thinking capabilities.

$3.30/$16.50/M
ctx200Kmax4Kavailtps
InOutCap

Enhanced version of Claude 3.5 Sonnet with extended thinking capabilities for complex reasoning tasks.

$0.07/$0.27/M
ctx1.0Mmax8Kavailtps
InOutCap

A lightweight and fast version of Gemini 2.0 Flash optimized for cost-effective multimodal tasks with lower latency.

$0.07/$0.27/M
ctx1.0Mmax8Kavailtps
InOutCap

Google's most cost-efficient multimodal model with 1M token context, designed for high-volume applications requiring speed and affordability.

/
ctx128Kmaxavailtps
InOut

General-purpose speech recognition model. Transcribes and translates audio to text in multiple languages.

¥1.40/¥5.60/M
ctx64Kmax16Kavailtps
InOutCap

DeepSeek V3 model hosted on ByteDance platform with 64K context support. A powerful Mixture-of-Experts model delivering strong performance across coding and reasoning benchmarks.

$2.75/$11.00/M
ctx256Kmax8Kavailtps
InOutCap

Cohere's flagship Command A model featuring advanced reasoning capabilities and 256K context. Optimized for enterprise use cases requiring sophisticated analysis and instruction following.

$0.17/$0.66/M
ctx128Kmax4Kavailtps
InOutCap

Versatile Command R model designed for retrieval-augmented generation and conversational AI. Supports 128K context with strong multilingual capabilities and tool use.

$0.55/$1.65/M
ctx128Kmax4Kavailtps
InOutCap

March 2024 release of Command R optimized for RAG workflows and enterprise applications. Delivers strong performance on information retrieval and generation tasks.

$0.17/$0.66/M
ctx128Kmax4Kavailtps
InOutCap

August 2024 update of Command R with enhanced reasoning abilities and improved instruction following. Features better multilingual support and tool calling capabilities.

$2.75/$11.00/M
ctx128Kmax4Kavailtps
InOutCap

Enhanced version of Command R with superior performance on complex tasks. Excels at reasoning, coding, and multilingual understanding with 128K context support.

$3.30/$16.50/M
ctx128Kmax4Kavailtps
InOutCap

April 2024 release of Command R+ delivering premium performance for demanding enterprise applications. Strong at complex reasoning and multilingual tasks.

$2.75/$11.00/M
ctx128Kmax4Kavailtps
InOutCap

August 2024 update of Command R+ with advanced reasoning and improved capabilities. Features enhanced tool use and better performance on complex analytical tasks.

$0.04/$0.17/M
ctx128Kmax4Kavailtps
InOutCap

Compact 7B parameter Command R model from December 2024, balancing efficiency with capability. Ideal for cost-effective deployment while maintaining strong performance on core tasks.

$0.17/$0.66/M
ctx128Kmax4Kavailtps
InOutCap

Command R variant with internet access capabilities for real-time information retrieval. Combines conversational AI with web search for up-to-date responses.

$2.75/$11.00/M
ctx128Kmax4Kavailtps
InOutCap

Command R+ with internet access for real-time information retrieval and grounded responses.

¥0.00/¥0.00/M
ctx128Kmax16Kavailtps
InOutCap

Zhipu AI's GLM-3 Turbo model optimized for fast responses.

¥110.00/¥110.00/M
ctx128Kmax16Kavailtps
InOutCap

Zhipu AI's 4th generation model. Balanced performance for general-purpose Chinese and English tasks.

¥5.50/¥5.50/M
ctx128Kmax16Kavailtps
InOutCap

Zhipu AI's enhanced GLM-4 Plus with improved capabilities.

¥11.00/¥11.00/M
ctx8Kmax16Kavailtps
InOutCap

Zhipu AI's GLM-4 AirX variant optimized for high-speed inference.

¥0.55/¥0.55/M
ctx128Kmax16Kavailtps
InOutCap

Zhipu AI's lightweight GLM-4 variant for cost-effective tasks with 128K context.

¥1.10/¥1.10/M
ctx1.0Mmax16Kavailtps
InOutCap

GLM-4 variant with extended 1M token context window for processing very long documents.

¥0.11/¥0.06/M
ctx128Kmax16Kavailtps
InOutCap

Zhipu AI's fastest GLM-4 variant optimized for high-throughput inference.

Free/Free
ctx128Kmax16Kavailtps
InOutCap

Fast, lightweight GLM-4 variant. Cost-effective for high-volume tasks.

¥4.40/¥4.40/M
ctx8Kmax16Kavailtps
InOutCap

Enhanced vision-language model from Zhipu AI with improved image understanding capabilities.

¥55.00/¥55.00/M
ctx8Kmax16Kavailtps
InOutCap

GLM-4 with vision capabilities. Processes and understands both text and image inputs.

$16.50//M
ctx128Kmax4Kavailtps
InOut

Microsoft Azure's text-to-speech service for natural voice synthesis.

$2.20/$2.20/M
ctx128Kmax128Kavailtps
InOutCap

Mistral AI's flagship model for complex reasoning, coding, and multilingual tasks.

$0.17/$0.17/M
ctx128Kmax128Kavailtps
InOutCap

12B parameter open model developed with NVIDIA. Strong multilingual and coding capabilities.

/
ctx128Kmaxavailtps
InOutCap

OpenAI's latest multimodal content moderation model for safety filtering.

/
ctx128Kmaxavailtps
InOutCap

September 2024 snapshot of OpenAI's multimodal content moderation model.

Text Moderation

Deprecated
/
ctx128Kmaxavailtps
InOutCap

OpenAI's text-only content moderation model for safety filtering. Deprecated in favor of omni-moderation.

/
ctx128Kmaxavailtps
InOutCap

OpenAI's text moderation model version 007. Deprecated in favor of omni-moderation.

/
ctx128Kmaxavailtps
InOutCap

OpenAI's stable text moderation model. Deprecated in favor of omni-moderation.

¥0.70//M
ctx128Kmaxavailtps
InOut

ByteDance's large text embedding model for semantic search and similarity tasks.

¥0.50//M
ctx128Kmaxavailtps
InOut

ByteDance's text embedding model for semantic search and similarity tasks.

$0.01//M
ctx128Kmaxavailtps
InOut

Google Search API integration for web search capabilities.

/
ctx128Kmaxavailtps
InOut

Microsoft Bing Search API integration for web search capabilities.

$0.00//M
ctx128Kmaxavailtps
InOut

Serper API integration for Google Search capabilities.

/
ctx128Kmaxavailtps
InOutCap

File processing service for document handling and analysis.

¥2.00/¥3.00/M
ctx128Kmax64Kavailtps
InOut

DeepSeek's reasoning model trained via large-scale reinforcement learning, comparable to OpenAI o1 on math, code, and reasoning tasks.

¥0.99/¥0.99/M
ctx16Kmax16Kavailtps
InOutCap

01.AI's fast and efficient language model for general-purpose tasks.

¥6.00/¥6.00/M
ctx16Kmax16Kavailtps
InOutCap

01.AI's multimodal vision-language model for image understanding and analysis.

/
ctx128Kmaxavailtps
InOut

NovelAI's image generation model for creative artwork and illustrations.

$0.28/$1.38/M
ctx200Kmax4Kavailtps
InOutCap

Fastest and most compact model in the Claude 3 family. Ideal for quick responses and high-volume tasks.

$0.88/$4.40/M
ctx200Kmax8Kavailtps
InOutCap

Fast and cost-effective model with improved performance over Claude 3 Haiku. Great for everyday tasks.

$0.88/$4.40/M
ctx200Kmax8Kavailtps
InOutCap

Snapshot of Claude 3.5 Haiku from October 22, 2024. Fast and cost-effective for everyday tasks.

$3.30/$7.70/M
ctx64Kmax8Kavailtps
InOut

DeepSeek's reasoning model trained via large-scale reinforcement learning, hosted on TogetherAI.

$1.38/$1.38/M
ctx64Kmax8Kavailtps
InOut

DeepSeek V3 MoE model with 671B total parameters and 37B active, hosted on TogetherAI.

Free/Free
ctx128Kmaxavailtps
InOut

Free tier of DeepSeek R1 distilled to Llama 70B architecture, hosted on TogetherAI.

$2.20/$2.20/M
ctx128Kmaxavailtps
InOut

DeepSeek R1 reasoning model distilled to Llama 70B architecture, hosted on TogetherAI.

$0.20/$0.20/M
ctx128Kmaxavailtps
InOut

Meta's Llama 3.1 8B optimized for fast inference on TogetherAI.

$0.97/$0.97/M
ctx128Kmaxavailtps
InOut

Meta's Llama 3.1 70B optimized for fast inference on TogetherAI.

$3.85/$3.85/M
ctx128Kmaxavailtps
InOut

Meta's largest Llama 3.1 405B model optimized for fast inference on TogetherAI.

$1.32/$1.32/M
ctx128Kmaxavailtps
InOut

Alibaba's Qwen2.5 7B model optimized for fast inference on TogetherAI.

$0.22/$0.22/M
ctx128Kmaxavailtps
InOut

Meta's Llama 2 7B chat model for conversational AI, hosted on TogetherAI.

$0.66/$0.66/M
ctx128Kmaxavailtps
InOut

Mistral AI's 7B instruction-tuned model v0.1, hosted on TogetherAI.

$0.22/$0.22/M
ctx128Kmaxavailtps
InOut

Mistral AI's 7B instruction-tuned model v0.2 with improved performance, hosted on TogetherAI.

$0.24/$0.24/M
ctx128Kmaxavailtps
InOut

Meta's Llama 2 13B chat model for conversational AI, hosted on TogetherAI.

$0.33/$0.33/M
ctx128Kmaxavailtps
InOut

Google's Gemma 2 9B instruction-tuned model, hosted on TogetherAI.

$1.32/$1.32/M
ctx128Kmaxavailtps
InOut

Alibaba's QwQ reasoning model preview with enhanced thinking capabilities, hosted on TogetherAI.

$0.88/$0.88/M
ctx128Kmaxavailtps
InOut

Google's Gemma 2 27B instruction-tuned model, hosted on TogetherAI.

$1.32/$1.32/M
ctx128Kmaxavailtps
InOut

Alibaba's Qwen2.5 72B model optimized for fast inference on TogetherAI.

$0.66/$0.66/M
ctx128Kmaxavailtps
InOut

Mistral AI's instruction-tuned Mixtral 8x7B MoE model, hosted on TogetherAI.

$1.32/$1.32/M
ctx128Kmaxavailtps
InOut

Mistral AI's larger instruction-tuned Mixtral 8x22B MoE model, hosted on TogetherAI.

/
ctx128Kmaxavailtps
InOut

Black Forest Labs' fastest image generation model for quick creative outputs.

/
ctx128Kmaxavailtps
InOut

Black Forest Labs' professional-grade image generation model for high-quality outputs.

/
ctx128Kmaxavailtps
InOut

Black Forest Labs' development image generation model for experimentation and testing.

/
ctx128Kmax128Kavailtps
InOut

Black Forest Labs' updated professional image generation model with improved quality and consistency.

/
ctx128Kmax128Kavailtps
InOut

Black Forest Labs' highest quality image generation model for premium creative outputs.

$3.30/$16.50/M
ctx131Kmax128Kavailtps
InOutCap

xAI's advanced reasoning model with 1M token context, trained with 10x more compute than previous models on the Colossus supercluster.

$3.30/$16.50/M
ctx131Kmax128Kavailtps
InOutCap

Beta version of Grok 3 with advanced reasoning capabilities.

$5.50/$27.50/M
ctx131Kmax128Kavailtps
InOutCap

Faster, more cost-efficient version of Grok 3 for high-throughput tasks.

$5.50/$27.50/M
ctx131Kmax128Kavailtps
InOutCap

Beta version of Grok 3 Fast for faster responses.

$0.33/$0.55/M
ctx131Kmax128Kavailtps
InOutCap

Smaller, more cost-efficient version of Grok 3 with reasoning capabilities for everyday tasks.

$0.33/$0.55/M
ctx131Kmax128Kavailtps
InOutCap

Beta version of Grok 3 Mini for cost-efficient tasks.

$0.66/$4.40/M
ctx131Kmax128Kavailtps
InOutCap

Fast version of Grok 3 Mini optimized for speed and cost efficiency.

$0.66/$4.40/M
ctx131Kmax128Kavailtps
InOutCap

Beta version of Grok 3 Mini Fast.

/
ctx64Kmax8Kavailtps
InOut

Compact 1.5B parameter distilled version of DeepSeek R1 for efficient reasoning tasks.

$0.09/$0.36/M
ctx1.0Mmax8Kavailtps
InOutCap

Google's fast and efficient multimodal model with 1M token context, supporting text, image, audio, video and PDF inputs.

$0.09/$0.36/M
ctx1.0Mmax8Kavailtps
InOutCap

Snapshot of Gemini 2.0 Flash with multimodal support for text, image, audio, and video understanding.

$1.21/$4.84/M
ctx200Kmax100Kavailtps
InOutCap

A cost-efficient reasoning model that excels at STEM tasks, particularly science, math, and coding.

$1.21/$4.84/M
ctx200Kmax100Kavailtps
InOutCap

Snapshot of o3-mini from January 31, 2025. Cost-efficient reasoning model for STEM tasks.

$16.50/$66.00/M
ctx200Kmax100Kavailtps
InOutCap

A reasoning model designed to solve hard problems across domains. Uses chain of thought to think before responding.

$16.50/$66.00/M
ctx200Kmax100Kavailtps
InOutCap

Snapshot of o1 from December 17, 2024. Reasoning model that uses chain of thought to solve hard problems.

$0.17/$0.66/M
ctx128Kmax16Kavailtps
InOutCap

A cost-efficient audio-capable model that accepts text, audio, and image inputs and can generate text and audio outputs.

$0.17/$0.66/M
ctx128Kmax16Kavailtps
InOutCap

A cost-efficient audio-capable model that accepts text, audio, and image inputs and can generate text and audio outputs.

$2.75/$11.00/M
ctx128Kmax16Kavailtps
InOutCap

GPT-4o with native audio input and output capabilities for real-time speech-to-speech conversations.

$5.50/$22.00/M
ctx128Kmax4Kavailtps
InOut

December 2024 snapshot of GPT-4o realtime preview with improved latency and audio quality.

$0.66/$2.64/M
ctx128Kmax4Kavailtps
InOut

Cost-efficient version of GPT-4o supporting real-time audio and text streaming for conversational applications.

$0.66/$2.64/M
ctx128Kmax4Kavailtps
InOut

December 2024 snapshot of GPT-4o-mini realtime with optimized performance for real-time interactions.

$2.75/$11.00/M
ctx128Kmax16Kavailtps
InOutCap

November 2024 snapshot of GPT-4 Omni with enhanced creative writing and coding capabilities.

$2.75/$11.00/M
ctx128Kmax16Kavailtps
InOutCap

GPT-4o with native audio input and output capabilities for real-time speech-to-speech conversations.

$2.75/$11.00/M
ctx128Kmax16Kavailtps
InOutCap

GPT-4o with native audio input and output capabilities for real-time speech-to-speech conversations.

$5.50/$22.00/M
ctx128Kmax4Kavailtps
InOut

Preview version of GPT-4o supporting real-time audio and text streaming for conversational applications.

$5.50/$22.00/M
ctx128Kmax4Kavailtps
InOut

October 2024 snapshot of GPT-4o realtime preview with enhanced audio processing capabilities.

o1 Preview

Deprecated
$16.50/$66.00/M
ctx128Kmax33Kavailtps
InOutCap

Early preview of o1 reasoning model. Designed to spend more time thinking before responding on complex tasks.

$16.50/$66.00/M
ctx128Kmax16Kavailtps
InOutCap

Snapshot of o1-preview from September 12, 2024. Early preview of reasoning model capabilities.

$2.20/$2.20/M
ctx16Kmax4Kavailtps
InOut

Base model for fine-tuning and legacy applications, replacing the original davinci base model.

$0.44/$0.44/M
ctx16Kmax4Kavailtps
InOut

Lightweight base model for fine-tuning and simple tasks, replacing the original babbage base model.

$0.03/$0.14/M
ctx1.0Mmax8Kavailtps
InOutCap

Google's smallest Gemini model optimized for speed and cost efficiency with multimodal support.

$5.50/$16.50/M
ctx128Kmax16Kavailtps
InOutCap

The dynamic model used in ChatGPT, automatically updated to the latest GPT-4o snapshot.

$2.75/$11.00/M
ctx128Kmax16Kavailtps
InOutCap

August 2024 snapshot of GPT-4 Omni with improved structured outputs and function calling.

$0.17/$0.66/M
ctx128Kmax16Kavailtps
InOutCap

A fast, affordable small model for lightweight tasks with vision and text capabilities.

$0.17/$0.66/M
ctx128Kmax16Kavailtps
InOutCap

July 2024 snapshot of GPT-4o-mini, the initial release of OpenAI's affordable small model.

$0.07/$0.27/M
ctx1.0Mmax8Kavailtps
InOutCap

Google's fast, cost-efficient multimodal model with 1M token context for high-volume tasks.

$0.07/$0.27/M
ctx1.0Mmax8Kavailtps
InOutCap

Snapshot of Gemini 1.5 Flash with 1M token context for fast multimodal understanding.

$0.07/$0.27/M
ctx1.0Mmax8Kavailtps
InOutCap

Updated snapshot of Gemini 1.5 Flash with improved performance and 1M token context.

$2.75/$11.00/M
ctx128Kmax16Kavailtps
InOutCap

OpenAI's flagship multimodal model, combining text and vision capabilities with GPT-4-level intelligence.

$5.50/$16.50/M
ctx128Kmax16Kavailtps
InOutCap

May 2024 snapshot of GPT-4 Omni, the initial release of OpenAI's flagship multimodal model.

$11.00/$33.00/M
ctx128Kmax4Kavailtps
InOutCap

GPT-4 Turbo with vision, featuring 128K context window and improved performance at lower cost.

$11.00/$33.00/M
ctx128Kmax4Kavailtps
InOutCap

April 2024 snapshot of GPT-4 Turbo with vision and extended context capabilities.

$1.13/$4.50/M
ctx2.1Mmax8Kavailtps
InOutCap

Google's mid-size multimodal model with 2M token context for text, image, audio, and video understanding.

$1.13/$4.50/M
ctx2.1Mmax8Kavailtps
InOutCap

Google's mid-size multimodal model with 2M token context for text, image, audio, and video understanding.

$1.13/$4.50/M
ctx2.1Mmax8Kavailtps
InOutCap

Snapshot of Gemini 1.5 Pro with 2M token context for multimodal understanding.

$1.13/$4.50/M
ctx2.1Mmax8Kavailtps
InOutCap

Updated snapshot of Gemini 1.5 Pro with improved performance and 2M token context.

$11.00/$33.00/M
ctx128Kmax16Kavailtps
InOutCap

Updated GPT-4 Turbo preview with reduced "laziness" and improved task completion.

$11.00/$33.00/M
ctx128Kmax4Kavailtps
InOutCap

GPT-4 Turbo preview model with 128K context window for handling longer inputs.

$0.11/M
ctx8Kmax2Kavailtps
InOut

Legacy embedding model generating 1536-dimensional vectors for semantic search and similarity tasks.

$0.02/M
ctx8Kmax2Kavailtps
InOut

Cost-efficient embedding model with improved performance over ada-002, supporting up to 8191 tokens.

$0.14/M
ctx8Kmax3Kavailtps
InOut

High-performance embedding model generating 3072-dimensional vectors for advanced semantic understanding.

$11.00/$33.00/M
ctx128Kmax4Kavailtps
InOutCap

GPT-4 Turbo with vision capabilities for understanding and analyzing images alongside text.

$3.30/$4.40/M
ctx16Kmax4Kavailtps
InOutCap

GPT-3.5 Turbo variant with extended 16K token context window for longer conversations and documents.

$1.10/$2.20/M
ctx16Kmax4Kavailtps
InOut

November 2023 snapshot of GPT-3.5 Turbo with improved instruction following and JSON mode support.

$33.00/$66.00/M
ctx8Kmax8Kavailtps
InOutCap

OpenAI's advanced language model with superior reasoning, creativity, and complex task handling capabilities.

$11.00/$33.00/M
ctx128Kmax4Kavailtps
InOutCap

GPT-4 Turbo preview with 128K context, improved instruction following, and JSON mode support.

/
ctx128Kmaxavailtps
InOut

Latest DALL-E model with enhanced prompt understanding and image quality for professional-grade outputs.

$1.65/$2.20/M
ctx16Kmax4Kavailtps
InOut

GPT-3.5 model optimized for single-turn instruction following via the Completions API endpoint.

$1.65/$2.20/M
ctx16Kmax4Kavailtps
InOutCap

September 2023 snapshot of GPT-3.5 Turbo Instruct for legacy completion API use cases.

$33.00/$66.00/M
ctx8Kmax8Kavailtps
InOutCap

GPT-4 snapshot from June 2023 with improved function calling support. Optimized for complex reasoning tasks.

$0.55/$1.65/M
ctx16Kmax4Kavailtps
InOut

A fast, cost-effective text generation model for simple tasks and high-volume applications.

$0.55/$1.65/M
ctx16Kmax4Kavailtps
InOut

January 2024 snapshot of GPT-3.5 Turbo with various improvements and bug fixes.

/
ctx128Kmaxavailtps
InOut

Image generation model creating realistic images and art from text descriptions with improved quality.

/
ctx128Kmaxavailtps
InOut

Original DALL-E model for generating images from text descriptions. Superseded by newer versions.

$2.20/$8.80/M
ctx1.0Mmax33Kavailtps
InOutCap

GPT-4.1 is an enhanced version of GPT-4 with improved instruction following and multimodal capabilities for text and image understanding.

¥2.00/¥3.00/M
ctx128Kmax8Kavailtps
InOutCap

DeepSeek's conversational AI model for general-purpose chat and coding tasks with 128K context.

Model Library - OhMyGPT