Preview of Google's next-generation Gemini 3 Flash model, optimized for speed and combining frontier intelligence with superior search and grounding capabilities.
Vertex AI version of Gemini 3 Flash Preview, optimized for speed and combining frontier intelligence with superior search and grounding capabilities.
GPT-5.2 is OpenAI's best model for coding and agentic tasks across industries.
Gemini 3 Pro Preview on Vertex AI with extended thinking mode for complex reasoning tasks.
GPT-5.1-Codex-Max is a version of GPT-5.1-Codex with enhanced capabilities for agentic coding tasks.
Preview of Google's next-generation Gemini 3 Pro model with enhanced reasoning and multimodal capabilities.
Preview of Gemini 3 Pro on Vertex AI with native image generation capabilities alongside text understanding.
GPT-5.1-Codex is a version of GPT-5.1 optimized for agentic coding tasks in Codex or similar environments. It's available in the Responses API only, and the underlying model snapshot will be regularly updated. If you want to learn more about prompting GPT-5.1-Codex, refer to our dedicated guide.
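A minimal sketch of calling a Codex model through the Responses API with the OpenAI Python SDK (the model id and prompt are illustrative assumptions, not confirmed identifiers):

```python
# Minimal sketch: calling a Codex-style model through the Responses API.
# Assumes the OpenAI Python SDK with OPENAI_API_KEY set in the environment;
# the model id is an assumption and should be checked against your model list.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5.1-codex",  # assumed model id
    input="Write a Python function that reverses a singly linked list.",
)
print(response.output_text)
```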
GPT-5.1 is OpenAI's best model for coding and agentic tasks across industries.
GPT-5.1 is OpenAI's best model for coding and agentic tasks across industries.
November 2025 snapshot of Claude Opus 4.5, Anthropic's latest and most advanced model.
Anthropic's latest and most advanced Claude model, powering Claude Code.
The fastest model in the Claude 4.5 family. Offers quick responses with strong performance for everyday tasks.
Snapshot of Claude Haiku 4.5 from October 1, 2025. Fast model for everyday tasks.
DeepSeek-V3.1-Terminus is an updated version of DeepSeek-V3.1 with enhanced language consistency, reduced mixed Chinese-English text, and optimized Code Agent and Search Agent performance.
Kimi K2 0905 is an updated version of Kimi K2, a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Kimi K2 0905 has improved coding abilities, enhanced agentic tool use, and a longer (262K) context window.
DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long-context extension approach, following the methodology outlined in the original DeepSeek-V3 report. The dataset was expanded by collecting additional long documents, and both training phases were substantially extended: the 32K extension phase was increased 10-fold to 630B tokens, and the 128K extension phase was extended by 3.3x to 209B tokens. Additionally, DeepSeek-V3.1 is trained using the UE8M0 FP8 scale data format to ensure compatibility with microscaling data formats.
Welcome to the gpt-oss series, OpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. gpt-oss-120b is intended for production, general-purpose, high-reasoning use cases and fits on a single H100 GPU.
Welcome to the gpt-oss series, OpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. gpt-oss-20b is intended for lower-latency, local, or specialized use cases.
Latest Qwen3 thinking model, competitive with the best closed-source models as of July 2025.
Qwen3's most agentic coding model to date.
Updated FP8 version of Qwen3-235B-A22B in non-thinking mode, with better tool use, coding, instruction following, logical reasoning, and text comprehension capabilities.
Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.
The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.
May 28, 2025 updated checkpoint of DeepSeek R1. Its overall performance now approaches that of leading models such as o3 and Gemini 2.5 Pro. Compared to the previous version, the upgraded model shows significant improvements in handling complex reasoning tasks, a reduced hallucination rate, enhanced support for function calling, and a better experience for vibe coding.
Latest Qwen3 state-of-the-art model, with 235B total parameters and 22B active parameters.
The models in the Llama 4 collection are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.
Qwen2.5-VL is a multimodal large language model series developed by the Qwen team at Alibaba Cloud, available in 3B, 7B, 32B, and 72B sizes.
A strong Mixture-of-Experts (MoE) language model from DeepSeek with 671B total parameters and 37B activated per token. Updated checkpoint.
The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
Llama 3.3 70B Instruct is the December update of Llama 3.1 70B. The model improves upon Llama 3.1 70B (released July 2024) with advances in tool calling, multilingual text support, math, and coding. It achieves industry-leading results in reasoning, math, and instruction following, and provides performance similar to Llama 3.1 405B with significant speed and cost improvements.
The models in the Llama 4 collection are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.
The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
xAI's code-focused model optimized for fast responses in programming tasks.
xAI's code-focused model optimized for fast responses in programming tasks.
August 2025 snapshot of Grok Code Fast for programming tasks.
Grok 4 Fast with reasoning mode and a 2M token context, using 40% fewer thinking tokens than Grok 4.
Faster version of Grok 4 with a 2M token context, designed for enterprise customers and using 40% fewer thinking tokens.
Latest Grok 4 Fast with reasoning mode and 2M token context.
Grok 4 Fast configured for direct responses without extended thinking mode.
Latest Grok 4 Fast configured for direct responses without extended thinking mode.
Latest GLM model from Zhipu AI with improved reasoning and generation capabilities.
GPT-5 pro uses more compute to think harder and provide consistently better answers.
GPT-5 pro uses more compute to think harder and provide consistently better answers.
A cost-efficient version of GPT Audio. It accepts audio inputs and outputs, and can be used in the Chat Completions REST API.
A cost-efficient version of GPT Audio. It accepts audio inputs and outputs, and can be used in the Chat Completions REST API.
The gpt-audio model is our first generally available audio model. It accepts audio inputs and outputs, and can be used in the Chat Completions REST API.
The gpt-audio model is our first generally available audio model. It accepts audio inputs and outputs, and can be used in the Chat Completions REST API.
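A minimal sketch of requesting spoken output from an audio-capable model via Chat Completions with the OpenAI Python SDK (the model id, voice, and format values are illustrative assumptions):

```python
# Minimal sketch: requesting an audio reply via Chat Completions.
# Assumes the OpenAI Python SDK; model id, voice, and format are assumptions.
import base64
from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-audio",  # assumed model id
    modalities=["text", "audio"],  # ask for both text and audio output
    audio={"voice": "alloy", "format": "wav"},
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)

# The audio comes back base64-encoded on the assistant message.
with open("hello.wav", "wb") as f:
    f.write(base64.b64decode(completion.choices[0].message.audio.data))
```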
A cost-efficient version of GPT Realtime - capable of responding to audio and text inputs in realtime over WebRTC, WebSocket, or SIP connections.
A cost-efficient version of GPT Realtime - capable of responding to audio and text inputs in realtime over WebRTC, WebSocket, or SIP connections.
This is our first general-availability realtime model, capable of responding to audio and text inputs in realtime over WebRTC, WebSocket, or SIP connections.
This is our first general-availability realtime model, capable of responding to audio and text inputs in realtime over WebRTC, WebSocket, or SIP connections.
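A minimal sketch of opening a realtime session over WebSocket using the third-party `websockets` library (the connection URL and event shapes follow OpenAI's realtime docs; the model id is an illustrative assumption):

```python
# Minimal sketch: a realtime session over WebSocket.
# Assumes the `websockets` package (v14+; older versions use `extra_headers`
# instead of `additional_headers`) and an assumed model id.
import asyncio
import json
import os

import websockets

async def main():
    url = "wss://api.openai.com/v1/realtime?model=gpt-realtime"  # assumed model id
    headers = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}
    async with websockets.connect(url, additional_headers=headers) as ws:
        # Ask the model for a simple response.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {"instructions": "Say hello."},
        }))
        async for raw in ws:
            event = json.loads(raw)
            print(event.get("type"))
            if event.get("type") in ("response.done", "error"):
                break

asyncio.run(main())
```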
Cost-efficient version of GPT Image 1 for fast image generation and editing tasks.
Anthropic's most intelligent model yet. Excels at coding, analysis, and complex reasoning with state-of-the-art performance.
Snapshot of Claude Sonnet 4.5 from September 29, 2025. Most intelligent model with exceptional coding and reasoning.
September 2025 preview of Gemini 2.5 Flash Lite on Vertex AI configured for direct responses.
September 2025 preview of Gemini 2.5 Flash Lite on Vertex AI with extended thinking mode.
September 2025 preview of Gemini 2.5 Flash on Vertex AI configured for direct responses.
September 2025 preview of Gemini 2.5 Flash on Vertex AI with extended thinking mode.
Snapshot of o3-pro from June 10, 2025. Our most capable reasoning model for the hardest problems.
Our most capable reasoning model, using more compute for the best possible answers on the hardest problems.
GPT-5 Codex delivers the flagship GPT-5 experience through the dedicated Codex API surface, supporting the same multimodal capabilities with a specialized routing endpoint.
Gemini 2.5 Flash on Vertex AI with native image generation capabilities alongside text and image understanding.
Gemini 2.5 Flash Lite on Vertex AI configured for direct responses without extended thinking.
Deprecated: Gemini 2.5 Flash Lite on Vertex AI with extended thinking mode for complex reasoning tasks.
OpenAI's open-weight 20B model for lower latency and local use cases, hosted on TogetherAI.
ByteDance's multimodal model with vision capabilities for text, image, and video understanding.
Latest version of DeepSeek V3.1 hosted on ByteDance infrastructure, offering state-of-the-art performance with 128K context. Enhanced for coding, reasoning, and general-purpose AI tasks.
OpenAI's open-weight 120B model for production and high reasoning use cases, hosted on TogetherAI.
DeepSeek V3.1 hybrid model combining V3 and R1 capabilities with 128K context, hosted on TogetherAI.
Deprecated: Gemini 2.5 Flash on Vertex AI with extended thinking mode for complex reasoning tasks.
Deprecated: Gemini 2.5 Flash on Vertex AI configured for direct responses without extended thinking.
Deprecated: Gemini 2.5 Pro on Vertex AI, Google's most intelligent AI model with adaptive thinking and multimodal capabilities.
Deprecated: Gemini 2.0 Flash (001) on Vertex AI for fast and efficient multimodal tasks with text, image, audio, and video support.
Deprecated: Gemini 2.0 Flash Lite (001), a lightweight and fast version of Gemini 2.0 Flash optimized for cost-effective multimodal tasks on Google's Vertex AI platform.
Zhipu AI's multimodal model with vision capabilities. Processes text and images for analysis tasks.
Fast, cost-efficient version of GLM-4.5. Optimized for high-throughput applications.
Zhipu AI's GLM-4.5 AirX variant optimized for high-speed inference.
Zhipu AI's lightweight GLM-4.5 variant for cost-effective tasks.
Zhipu AI's GLM-4.5 X variant with enhanced performance.
Zhipu AI's flagship Chinese-English bilingual model. Strong at complex reasoning and generation tasks.
GPT-5 Chat points to the GPT-5 snapshot currently used in ChatGPT. GPT-5 is our next-generation, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs.
GPT-5 Nano is our fastest, cheapest version of GPT-5. It's great for summarization and classification tasks.
GPT-5 Nano is our fastest, cheapest version of GPT-5. It's great for summarization and classification tasks.
GPT-5 mini is a faster, more cost-efficient version of GPT-5. It's great for well-defined tasks and precise prompts.
GPT-5 mini is a faster, more cost-efficient version of GPT-5. It's great for well-defined tasks and precise prompts.
GPT-5 is OpenAI's flagship model for coding, reasoning, and agentic tasks across domains.
GPT-5 is our flagship model for coding, reasoning, and agentic tasks across domains.
A lightweight version of Gemini 2.5 Flash optimized for speed and cost efficiency with 1M token context support.
August 2025 snapshot of Claude Opus 4.1, Anthropic's most capable model for complex tasks.
Anthropic's Claude Opus 4.1, an enhanced version of Opus 4 for highly complex tasks.
A high-speed version of Doubao Seed 1.6 optimized for fast inference with multimodal support. Supports 256K context with excellent performance on text, image, and video understanding tasks.
A reasoning-enhanced version of Doubao Seed 1.6 with extended thinking capabilities for complex problem-solving. Features 256K context window and advanced multimodal understanding.
xAI's most intelligent model with native tool use and real-time search integration. Claimed to outperform PhD-level experts on academic questions.
July 9, 2025 snapshot of Grok 4, xAI's most intelligent model.
xAI's most intelligent model with native tool use and real-time search. Frontier-level performance on reasoning benchmarks.
An earlier fast-inference variant of Doubao Seed 1.6 with 256K context support. Optimized for speed while maintaining strong multimodal capabilities across text, image, and video inputs.
Previous version of the thinking-enabled Doubao Seed 1.6 model designed for enhanced reasoning and complex cognitive tasks. Supports 256K context with multimodal input processing.
Standard version of Doubao Seed 1.6 providing balanced performance for general-purpose tasks. Features 256K context window and supports text, image, and video understanding.
Google's most efficient workhorse model, designed for speed and low cost. Improved across key benchmarks for reasoning, multimodality, code, and long context while being 20-30% more efficient.
Google's most intelligent AI model with adaptive thinking capabilities. Among the world's best models for coding and tasks requiring advanced reasoning.
DeepSeek's reasoning-focused model hosted on ByteDance infrastructure, optimized for complex problem-solving and logical reasoning tasks. Supports 128K context with strong analytical capabilities.
Professional-grade Doubao 1.5 model with 128K context window, delivering high-quality text generation and understanding. Optimized for production workloads requiring reliable and accurate responses.
Lightweight vision-language model from the Doubao 1.5 series, balancing efficiency with multimodal understanding. Supports text and image inputs with 64K context for cost-effective visual tasks.
Advanced vision-language model with enhanced image understanding and analysis capabilities. Features 64K context window and excels at complex visual reasoning and multimodal tasks.
Premium multimodal model combining thinking capabilities with advanced vision understanding. Supports text, image, and video inputs with 64K context for sophisticated reasoning over visual content.
Professional thinking-enhanced model designed for complex reasoning and analytical tasks. Supports 64K context with text and image inputs, excelling at multi-step problem solving.
High-performance model balancing intelligence and speed. Excellent for complex tasks requiring nuanced understanding.
Snapshot of Claude Sonnet 4 from May 14, 2025. High-performance model for complex tasks.
Anthropic's most capable model for highly complex tasks. Excels at open-ended analysis, multi-step reasoning, and research.
Snapshot of Claude Opus 4 from May 14, 2025. Most capable model for highly complex tasks.
Qwen3 235B model with 22B active parameters optimized for throughput, hosted on TogetherAI.
Advanced image generation model with multimodal input support for editing and creating images.
A smaller model optimized for fast, cost-efficient reasoning. Achieves remarkable performance for its size, particularly in math, coding, and visual tasks.
Advanced thinking model from the Doubao 1.5 series with enhanced reasoning capabilities for complex analytical tasks. Features 128K context window and excels at multi-step logical reasoning.
Our smartest reasoning model, trained to think for longer before responding. Excels at programming, business/consulting, and creative ideation with breakthrough performance on complex tasks.
Snapshot of o3 from April 16, 2025. Our smartest reasoning model with breakthrough performance on complex tasks.
Snapshot of o4-mini from April 16, 2025. Fast, cost-efficient reasoning model excelling at math, coding, and visual tasks.
Snapshot of GPT-4.1 from April 14, 2025, providing enhanced instruction following and multimodal capabilities.
GPT-4.1 Mini is a cost-efficient, faster version of GPT-4.1 optimized for everyday tasks and quick responses.
Snapshot of GPT-4.1 Mini from April 14, 2025, optimized for cost-efficient everyday tasks.
GPT-4.1 Nano is the smallest, fastest, and most affordable version of GPT-4.1 for simple classification and lightweight tasks.
Snapshot of GPT-4.1 Nano from April 14, 2025, optimized for fast, lightweight tasks.
Real-time text-to-speech model optimized for speed with natural-sounding voice synthesis.
High-definition text-to-speech model providing superior audio quality and clarity.
November 2023 snapshot of TTS-1 for real-time text-to-speech conversion.
November 2023 snapshot of TTS-1 HD for high-quality audio synthesis.
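A minimal sketch of synthesizing speech with the TTS models through the audio.speech endpoint of the OpenAI Python SDK (the voice choice and output path are illustrative):

```python
# Minimal sketch: text-to-speech with tts-1.
# Assumes the OpenAI Python SDK; voice and file path are illustrative.
from openai import OpenAI

client = OpenAI()

speech = client.audio.speech.create(
    model="tts-1",  # or "tts-1-hd" for higher audio quality
    voice="alloy",
    input="The quick brown fox jumped over the lazy dog.",
)
with open("speech.mp3", "wb") as f:
    f.write(speech.read())  # write the binary audio response to disk
```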
Preview version of GPT-4o-mini enhanced with integrated web search capabilities for real-time information retrieval.
Latest preview of GPT-4o-mini with integrated web search capabilities for accessing current information.
Preview version of GPT-4o with integrated web search for enhanced real-time knowledge and information access.
Latest preview of GPT-4o enhanced with web search capabilities for accessing up-to-date information.
Deprecated: Snapshot of Claude 3.7 Sonnet from February 19, 2025. Enhanced Sonnet with extended thinking capabilities.
Enhanced version of Claude 3.5 Sonnet with extended thinking capabilities for complex reasoning tasks.
A lightweight and fast version of Gemini 2.0 Flash optimized for cost-effective multimodal tasks with lower latency.
Google's most cost-efficient multimodal model with 1M token context, designed for high-volume applications requiring speed and affordability.
General-purpose speech recognition model. Transcribes and translates audio to text in multiple languages.
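A minimal sketch of transcribing a local audio file with whisper-1 via the OpenAI Python SDK (the file path is illustrative):

```python
# Minimal sketch: speech-to-text with whisper-1.
# Assumes the OpenAI Python SDK and a local audio file (path is illustrative).
from openai import OpenAI

client = OpenAI()

with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )
print(transcript.text)
```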
DeepSeek V3 model hosted on ByteDance platform with 64K context support. A powerful Mixture-of-Experts model delivering strong performance across coding and reasoning benchmarks.
Cohere's flagship Command A model featuring advanced reasoning capabilities and 256K context. Optimized for enterprise use cases requiring sophisticated analysis and instruction following.
Versatile Command R model designed for retrieval-augmented generation and conversational AI. Supports 128K context with strong multilingual capabilities and tool use.
March 2024 release of Command R optimized for RAG workflows and enterprise applications. Delivers strong performance on information retrieval and generation tasks.
August 2024 update of Command R with enhanced reasoning abilities and improved instruction following. Features better multilingual support and tool calling capabilities.
Enhanced version of Command R with superior performance on complex tasks. Excels at reasoning, coding, and multilingual understanding with 128K context support.
April 2024 release of Command R+ delivering premium performance for demanding enterprise applications. Strong at complex reasoning and multilingual tasks.
August 2024 update of Command R+ with advanced reasoning and improved capabilities. Features enhanced tool use and better performance on complex analytical tasks.
Compact 7B parameter Command R model from December 2024, balancing efficiency with capability. Ideal for cost-effective deployment while maintaining strong performance on core tasks.
Command R variant with internet access capabilities for real-time information retrieval. Combines conversational AI with web search for up-to-date responses.
Command R+ with internet access for real-time information retrieval and grounded responses.
Zhipu AI's GLM-3 Turbo model optimized for fast responses.
Zhipu AI's 4th generation model. Balanced performance for general-purpose Chinese and English tasks.
Zhipu AI's enhanced GLM-4 Plus with improved capabilities.
Zhipu AI's GLM-4 AirX variant optimized for high-speed inference.
Zhipu AI's lightweight GLM-4 variant for cost-effective tasks with 128K context.
GLM-4 variant with extended 1M token context window for processing very long documents.
Zhipu AI's fastest GLM-4 variant optimized for high-throughput inference.
Fast, lightweight GLM-4 variant. Cost-effective for high-volume tasks.
Enhanced vision-language model from Zhipu AI with improved image understanding capabilities.
GLM-4 with vision capabilities. Processes and understands both text and image inputs.
Microsoft Azure's text-to-speech service for natural voice synthesis.
Mistral AI's flagship model for complex reasoning, coding, and multilingual tasks.
12B parameter open model developed with NVIDIA. Strong multilingual and coding capabilities.
OpenAI's latest multimodal content moderation model for safety filtering.
September 2024 snapshot of OpenAI's multimodal content moderation model.
OpenAI's text-only content moderation model for safety filtering. Deprecated in favor of omni-moderation.
OpenAI's text moderation model version 007. Deprecated in favor of omni-moderation.
OpenAI's stable text moderation model. Deprecated in favor of omni-moderation.
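A minimal sketch of screening text with the omni-moderation model via the OpenAI Python SDK (the input string is illustrative):

```python
# Minimal sketch: content moderation with omni-moderation-latest.
# Assumes the OpenAI Python SDK; the input text is illustrative.
from openai import OpenAI

client = OpenAI()

result = client.moderations.create(
    model="omni-moderation-latest",
    input="User-generated text to screen before posting.",
)
print("flagged:", result.results[0].flagged)  # True if any category triggers
```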
ByteDance's large text embedding model for semantic search and similarity tasks.
ByteDance's text embedding model for semantic search and similarity tasks.
Google Search API integration for web search capabilities.
Microsoft Bing Search API integration for web search capabilities.
Serper API integration for Google Search capabilities.
File processing service for document handling and analysis.
DeepSeek's reasoning model trained via large-scale reinforcement learning, comparable to OpenAI o1 on math, code, and reasoning tasks.
01.AI's fast and efficient language model for general-purpose tasks.
01.AI's multimodal vision-language model for image understanding and analysis.
NovelAI's image generation model for creative artwork and illustrations.
Fastest and most compact model in the Claude 3 family. Ideal for quick responses and high-volume tasks.
Fast and cost-effective model with improved performance over Claude 3 Haiku. Great for everyday tasks.
Snapshot of Claude 3.5 Haiku from October 22, 2024. Fast and cost-effective for everyday tasks.
DeepSeek's reasoning model trained via large-scale reinforcement learning, hosted on TogetherAI.
DeepSeek V3 MoE model with 671B total parameters and 37B active, hosted on TogetherAI.
Free tier of DeepSeek R1 distilled to Llama 70B architecture, hosted on TogetherAI.
DeepSeek R1 reasoning model distilled to Llama 70B architecture, hosted on TogetherAI.
Meta's Llama 3.1 8B optimized for fast inference on TogetherAI.
Meta's Llama 3.1 70B optimized for fast inference on TogetherAI.
Meta's largest Llama 3.1 405B model optimized for fast inference on TogetherAI.
Alibaba's Qwen2.5 7B model optimized for fast inference on TogetherAI.
Meta's Llama 2 7B chat model for conversational AI, hosted on TogetherAI.
Mistral AI's 7B instruction-tuned model v0.1, hosted on TogetherAI.
Mistral AI's 7B instruction-tuned model v0.2 with improved performance, hosted on TogetherAI.
Meta's Llama 2 13B chat model for conversational AI, hosted on TogetherAI.
Google's Gemma 2 9B instruction-tuned model, hosted on TogetherAI.
Alibaba's QwQ reasoning model preview with enhanced thinking capabilities, hosted on TogetherAI.
Google's Gemma 2 27B instruction-tuned model, hosted on TogetherAI.
Alibaba's Qwen2.5 72B model optimized for fast inference on TogetherAI.
Mistral AI's Mixtral 8x7B MoE model instruction-tuned, hosted on TogetherAI.
Mistral AI's larger Mixtral 8x22B MoE model instruction-tuned, hosted on TogetherAI.
Black Forest Labs' fastest image generation model for quick creative outputs.
Black Forest Labs' professional-grade image generation model for high-quality outputs.
Black Forest Labs' development image generation model for experimentation and testing.
Black Forest Labs' updated professional image generation model with improved quality and consistency.
Black Forest Labs' highest quality image generation model for premium creative outputs.
xAI's advanced reasoning model with 1M token context, trained with 10x more compute than previous models on the Colossus supercluster.
Beta version of Grok 3 with advanced reasoning capabilities.
Faster, more cost-efficient version of Grok 3 for high-throughput tasks.
Beta version of Grok 3 Fast for faster responses.
Smaller, more cost-efficient version of Grok 3 with reasoning capabilities for everyday tasks.
Beta version of Grok 3 Mini for cost-efficient tasks.
Fast version of Grok 3 Mini optimized for speed and cost efficiency.
Beta version of Grok 3 Mini Fast.
Compact 1.5B parameter distilled version of DeepSeek R1 for efficient reasoning tasks.
Google's fast and efficient multimodal model with 1M token context, supporting text, image, audio, video and PDF inputs.
Snapshot of Gemini 2.0 Flash with multimodal support for text, image, audio, and video understanding.
A cost-efficient reasoning model that excels at STEM tasks, particularly science, math, and coding.
Snapshot of o3-mini from January 31, 2025. Cost-efficient reasoning model for STEM tasks.
A reasoning model designed to solve hard problems across domains. Uses chain of thought to think before responding.
Snapshot of o1 from December 17, 2024. Reasoning model that uses chain of thought to solve hard problems.
A cost-efficient audio-capable model that accepts text, audio, and image inputs and can generate text and audio outputs.
A cost-efficient audio-capable model that accepts text, audio, and image inputs and can generate text and audio outputs.
GPT-4o with native audio input and output capabilities for real-time speech-to-speech conversations.
Deprecated: December 2024 snapshot (2024-12-17) of GPT-4o Realtime Preview with improved latency and audio quality.
Cost-efficient version of GPT-4o supporting real-time audio and text streaming for conversational applications.
December 2024 snapshot of GPT-4o-mini realtime with optimized performance for real-time interactions.
November 2024 snapshot of GPT-4 Omni with enhanced creative writing and coding capabilities.
GPT-4o with native audio input and output capabilities for real-time speech-to-speech conversations.
Deprecated: October 2024 snapshot (2024-10-01) of GPT-4o Audio Preview with native audio input and output capabilities for real-time speech-to-speech conversations.
Deprecated: Preview version of GPT-4o supporting real-time audio and text streaming for conversational applications.
Deprecated: October 2024 snapshot (2024-10-01) of GPT-4o Realtime Preview with enhanced audio processing capabilities.
Deprecated: Early preview of the o1 reasoning model, designed to spend more time thinking before responding on complex tasks.
Deprecated: Snapshot of o1-preview from September 12, 2024. Early preview of reasoning model capabilities.
Base model for fine-tuning and legacy applications, replacing the original davinci base model.
Lightweight base model for fine-tuning and simple tasks, replacing the original babbage base model.
Google's smallest Gemini model optimized for speed and cost efficiency with multimodal support.
The dynamic model used in ChatGPT, automatically updated to the latest GPT-4o snapshot.
August 2024 snapshot of GPT-4 Omni with improved structured outputs and function calling.
A fast, affordable small model for lightweight tasks with vision and text capabilities.
July 2024 snapshot of GPT-4o-mini, the initial release of OpenAI's affordable small model.
Google's fast, cost-efficient multimodal model with 1M token context for high-volume tasks.
Snapshot of Gemini 1.5 Flash with 1M token context for fast multimodal understanding.
Updated snapshot of Gemini 1.5 Flash with improved performance and 1M token context.
OpenAI's flagship multimodal model combining text and vision capabilities with GPT-4 level intelligence.
May 2024 snapshot of GPT-4 Omni, the initial release of OpenAI's flagship multimodal model.
GPT-4 Turbo with vision, featuring 128K context window and improved performance at lower cost.
April 2024 snapshot of GPT-4 Turbo with vision and extended context capabilities.
Google's mid-size multimodal model with 2M token context for text, image, audio, and video understanding.
Google's mid-size multimodal model with 2M token context for text, image, audio, and video understanding.
Snapshot of Gemini 1.5 Pro with 2M token context for multimodal understanding.
Updated snapshot of Gemini 1.5 Pro with improved performance and 2M token context.
Updated GPT-4 Turbo preview with reduced "laziness" and improved task completion.
GPT-4 Turbo preview model with 128K context window for handling longer inputs.
Legacy embedding model generating 1536-dimensional vectors for semantic search and similarity tasks.
Cost-efficient embedding model with improved performance over ada-002, supporting up to 8191 tokens.
High-performance embedding model generating 3072-dimensional vectors for advanced semantic understanding.
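A minimal sketch of generating embeddings with both text-embedding-3 models, including the `dimensions` parameter they support for truncating output vectors (input text is illustrative):

```python
# Minimal sketch: embeddings with the text-embedding-3 family.
# Assumes the OpenAI Python SDK; input text is illustrative.
from openai import OpenAI

client = OpenAI()

small = client.embeddings.create(
    model="text-embedding-3-small",
    input="semantic search example",
)
large = client.embeddings.create(
    model="text-embedding-3-large",
    input="semantic search example",
    dimensions=1024,  # optionally truncate the native 3072-dim vectors
)
print(len(small.data[0].embedding), len(large.data[0].embedding))  # 1536 1024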
GPT-4 Turbo with vision capabilities for understanding and analyzing images alongside text.
GPT-3.5 Turbo variant with extended 16K token context window for longer conversations and documents.
November 2023 snapshot of GPT-3.5 Turbo with improved instruction following and JSON mode support.
OpenAI's advanced language model with superior reasoning, creativity, and complex task handling capabilities.
GPT-4 Turbo preview with 128K context, improved instruction following, and JSON mode support.
Latest DALL-E model with enhanced prompt understanding and image quality for professional-grade outputs.
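A minimal sketch of generating an image with dall-e-3 through the images endpoint of the OpenAI Python SDK (the prompt, size, and quality values are illustrative):

```python
# Minimal sketch: image generation with dall-e-3.
# Assumes the OpenAI Python SDK; prompt, size, and quality are illustrative.
from openai import OpenAI

client = OpenAI()

image = client.images.generate(
    model="dall-e-3",
    prompt="A watercolor painting of a lighthouse at dawn",
    size="1024x1024",
    quality="standard",
    n=1,  # dall-e-3 generates one image per request
)
print(image.data[0].url)
```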
GPT-3.5 model optimized for single-turn instruction following via the legacy Completions API endpoint.
September 2023 snapshot of GPT-3.5 Turbo Instruct for legacy completion API use cases.
GPT-4 snapshot from June 2023 with improved function calling support. Optimized for complex reasoning tasks.
A fast, cost-effective text generation model for simple tasks and high-volume applications.
January 2025 snapshot of GPT-3.5 Turbo with various improvements and bug fixes.
Image generation model creating realistic images and art from text descriptions with improved quality.
Original DALL-E model for generating images from text descriptions. Superseded by newer versions.
GPT-4.1 is an enhanced version of GPT-4 with improved instruction following and multimodal capabilities for text and image understanding.
DeepSeek's conversational AI model for general-purpose chat and coding tasks with 128K context.