Get instant visibility into service health and any ongoing incidents for various AI Model APIs in one place.
Click any card with a link icon in its status indicator to visit the provider's status page.
OpenRouter
This version of Qwen 2.5 adds vision capabilities to the solid foundation of its text-only sibling. It's great for multimodal tasks like analyzing images with text, understanding screenshots, or processing documents with visual elements. A practical choice when you need both language and vision understanding without breaking the bank.
Groq
Qwen 3 brings a unique trick - it can switch between quick responses and deep thinking on the fly. Need a fast answer? It's got you. Complex reasoning? Just enable thinking mode. It's like having two models in one, optimized for both everyday chats and challenging problems.
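The on-the-fly switch between quick responses and thinking mode is typically exposed as a request parameter. Here's a minimal sketch that builds (but doesn't send) a chat payload, assuming OpenRouter's OpenAI-compatible schema; the model slug and the `reasoning` field are assumptions, so check your provider's docs for the exact names:

```python
import json

def build_qwen3_request(prompt: str, thinking: bool) -> dict:
    """Build an OpenAI-style chat payload that toggles Qwen 3's thinking mode.

    The model slug and the `reasoning` field follow OpenRouter's
    OpenAI-compatible convention; both are assumptions here.
    """
    return {
        "model": "qwen/qwen3-32b",  # hypothetical model slug
        "messages": [{"role": "user", "content": prompt}],
        # Flip this per request: off for quick answers, on for deep reasoning.
        "reasoning": {"enabled": thinking},
    }

# Same model, two modes: a quick answer vs. deliberate reasoning.
fast = build_qwen3_request("What's the capital of Norway?", thinking=False)
deep = build_qwen3_request("Prove that the sum of two odd numbers is even.", thinking=True)
print(json.dumps(deep, indent=2))
```

Because the toggle is per-request, one deployment can serve both everyday chat and hard reasoning without swapping models.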
OpenRouter
When you need raw capability without the reasoning overhead, this is your go-to. It's incredibly knowledgeable across domains and excels at tasks requiring broad understanding. Great for content creation, analysis, and general problem-solving where you don't need to see the thought process.
OpenRouter
Purpose-built for developers, this model excels at everything from quick scripts to complex system design. It understands modern development practices, can work across multiple files, and even helps with debugging. If you're building software, this is like having a senior developer as your pair programming partner.
Anthropic
Claude 4 Sonnet hits that perfect balance - significantly more capable than the 3.x series, but fast enough for everyday use. It's become the favorite for developers who need reliable, intelligent assistance without the premium cost of Opus.
Anthropic
Claude 4.5 Sonnet represents a significant leap forward in AI assistance. It's exceptionally good at real-world tasks - from writing production-ready code to creating compelling content. What sets it apart is how naturally it collaborates, almost like working with a very smart human partner.
Anthropic
Current Sonnet model tuned for strong coding, clear writing, and reliable tool use at a practical cost. It is designed to handle day-to-day product, engineering, and analysis workflows with better consistency than prior Sonnet versions.
Anthropic
The most capable Claude model for when you need the absolute best. It handles incredibly complex tasks, maintains context over long conversations, and produces exceptionally high-quality outputs. Think of it as hiring a world-class expert - expensive, but worth it for mission-critical work.
Anthropic
Don't let the 'efficient' label fool you - Haiku 4.5 is remarkably capable for its speed and cost. It's perfect for real-time applications, quick iterations, and high-volume tasks where you need quality responses without the wait.
Anthropic
Last-gen flagship Claude model. Strong at long-context reasoning, nuanced writing, and complex coding or analysis. Ideal when you need maximum reliability and can trade off cost and speed.
Anthropic
Current flagship Claude model with the best overall performance. Excels at complex reasoning, multi-step planning, long-context tasks, and tool-driven workflows while maintaining top-tier writing quality.
OpenRouter
Takes the solid foundation of DeepSeek V3 and adds months of additional training and refinement. Better at following instructions, more knowledgeable, and still maintains that direct, no-nonsense communication style that made the original popular.
OpenRouter
V3.1 is a capable general-purpose model that excels at coding, tool use, and complex problem-solving. It delivers strong performance across a wide range of tasks while maintaining good speed and efficiency.
OpenRouter
Terminus takes everything learned from V3.1 and adds stability improvements. Better at maintaining conversation context, more reliable on long tasks, and improved tool-use capabilities. It's the version you'd deploy in production systems.
OpenRouter
V3.2 introduces DeepSeek Sparse Attention, making it incredibly efficient for long-context tasks while maintaining top-tier performance. It's designed for the modern AI workflow - handling massive codebases, long documents, and complex multi-step processes with ease.
OpenRouter
DeepSeek R1 proved that open-source reasoning models could compete with the best closed models. Whether you choose the original, the distilled versions, or the latest updates, you're getting genuine reasoning capabilities that show their work and think deeply about problems.
OpenRouter
Similar to the Llama distilled model, but distilled into Qwen 32B instead. Slightly better at code, slightly more likely to fall into thought loops.
Flash 2.0 is like the Swiss Army knife of AI models - fast, reliable, and handles almost anything you throw at it. Its standout feature is the enormous context window, letting you work with entire codebases or long documents without breaking a sweat.
Takes everything great about Flash 2.0 and adds improved capabilities. Better at complex tasks while maintaining that signature Google speed. Perfect for when you need quick but thoughtful responses.
Gemini 2.5 Flash-Lite is a member of the Gemini 2.5 series of models, a suite of highly-capable, natively multimodal models. Gemini 2.5 Flash-Lite is Google’s most cost-efficient model, striking a balance between efficiency and quality.
Gemini 2.5 Flash Image Preview is a state of the art image generation model with contextual understanding. It is capable of image generation, edits, and multi-turn conversations.
Similar to 2.0 Flash, but even faster. Not as smart, but still good at most things.
When you need Google's most capable model, 2.5 Pro delivers. It excels at mathematical reasoning, scientific analysis, and complex coding challenges. The thinking capabilities make it particularly valuable for research and development work.
Google's Imagen 4 is a powerful image generation model that creates high-quality, photorealistic images from text prompts. Built on advanced diffusion techniques and trained on diverse datasets. 2 images per prompt.
Google's Imagen 4 Ultra is a powerful image generation model that creates high-quality, photorealistic images from text prompts. Built on advanced diffusion techniques and trained on diverse datasets. 1 image per prompt.
Gemini 3 Pro is the previous generation of Google's frontier models. It's exceptionally good at complex reasoning tasks, long-context understanding, and maintaining coherence over extended conversations. Think of it as Google's answer to the most challenging cognitive tasks.
Gemini 3.1 Pro Preview is Google's frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation of the Gemini 3 series, it combines high-precision reasoning across text, image, video, audio, and code with a 1M-token context window.
Nano Banana Pro is a state of the art image generation model with contextual understanding. It is capable of image generation, edits, and multi-turn conversations.
The fastest model in Google's Gemini 3 family, but don't confuse speed with simplicity. It handles complex tasks remarkably well while maintaining low latency. Perfect for real-time applications and rapid prototyping.
Gemini 3.1 Flash Lite is built for high-throughput, low-latency tasks where cost efficiency matters most. It is optimized for common agent flows, prompt chaining, and fast multimodal interactions.
OpenRouter
Gemma 4 26B A4B is Google's instruction-tuned mixture-of-experts model that delivers strong open-model quality while keeping inference costs low. It supports multimodal inputs, tool use, and an optional reasoning mode, making it a practical pick for everyday analysis, coding, and document or image-aware workflows.
Groq
If speed is your priority, Llama 3.3 70B is hard to beat. It processes tokens at incredible rates while maintaining solid performance on most tasks. Think of it as the sports car of AI models - not the most luxurious, but incredibly fun to drive.
Groq
Scout brings vision capabilities to the Llama family while maintaining efficiency. It's designed for applications that need to understand both text and images without the computational overhead of larger models.
OpenRouter
Maverick excels at maintaining natural, coherent conversations across long contexts. It's particularly good at tasks requiring nuanced understanding and can handle both text and visual inputs with impressive capability.
OpenRouter
MiniMax M2 redefines what's possible with efficient AI design. Despite 'only' activating 10B parameters from its 230B total, it delivers performance that rivals much larger models. It's specifically optimized for coding workflows, making it perfect for development tools.
OpenRouter
MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world capability while maintaining exceptional latency, scalability, and cost efficiency. Compared to its predecessor, M2.1 delivers cleaner, more concise outputs and faster perceived response times. It shows leading multilingual coding performance across major systems and application languages.
OpenRouter
MiniMax-M2.5 is a high-efficiency large language model optimized for end-to-end coding and productivity. Building on MiniMax-M2, it extends strong coding and tool-use performance into broader office tasks, while handling multi-software context switching and cross-team collaboration across human and agent workflows. It reports strong benchmark performance, including 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp, while improving token efficiency through planning-focused training.
OpenRouter
MiniMax-M2.7 is a next-generation large language model built for autonomous, real-world productivity and continuous improvement. It emphasizes agentic task execution through multi-agent collaboration, with strong performance across coding, debugging, root cause analysis, and document-heavy workflows.
OpenRouter
MiMo v2 Flash is Xiaomi's speed-optimized model for quick responses, lightweight coding help, and everyday productivity tasks where low latency matters most.
OpenRouter
MiMo v2 Pro is Xiaomi's flagship foundation model, tuned for agentic workflows, production engineering tasks, and long-horizon reasoning. On OpenRouter it ships with a roughly 1M-token context window, strong tool use, and pricing that fits high-capability coding and orchestration work.
OpenRouter
MiMo v2 Omni is Xiaomi's frontier omni-modal model for multimodal perception, visual grounding, multi-step planning, tool use, and code execution. It's a good fit when a workflow mixes text with image understanding and broader agent behavior.
OpenRouter
Kimi K2 proved that Chinese AI companies could build world-class open models. It's particularly strong on coding tasks, mathematical problem-solving, and agentic workflows. The 1 trillion parameter count sounds impressive, but the real magic is in how efficiently it uses them.
OpenRouter
The September update brings improved capabilities and extended context length. Better at handling long documents, maintaining conversation coherence, and executing complex multi-step tasks.
OpenRouter
Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability. Built on Kimi K2 with continued pretraining over approximately 15T mixed visual and text tokens, it delivers strong performance in general reasoning, visual coding, and agentic tool-calling.
OpenRouter
gpt-oss-20b is a medium-sized, open-weight 21B-parameter model released by OpenAI under the Apache 2.0 license, suitable for general-purpose tasks. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware. The model is trained in OpenAI's Harmony response format and supports reasoning-level configuration, fine-tuning, and capabilities including function calling, tool use, and structured outputs.
OpenRouter
gpt-oss-120b is a large, open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized to run on a single H100 GPU with native MXFP4 quantization. The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation.
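The native tool-use support mentioned above follows the standard chat-completions tool-calling schema. A hedged sketch of building such a request, where `get_weather` is a made-up demonstration tool and the model slug is an assumption (substitute whatever name your host serves gpt-oss-120b under):

```python
import json

def build_tool_request(question: str) -> dict:
    """Assemble a chat-completions payload exposing one callable tool.

    `get_weather` is a hypothetical example tool; the model slug is an
    assumption about how a given host names gpt-oss-120b.
    """
    return {
        "model": "openai/gpt-oss-120b",  # assumed slug
        "messages": [{"role": "user", "content": question}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

payload = build_tool_request("What's the weather in Oslo right now?")
print(json.dumps(payload, indent=2))
```

If the model decides the tool is needed, the response will contain a structured tool call with arguments matching the declared JSON schema, which your code executes before continuing the conversation.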
OpenAI
Like gpt-4o, but faster. This model sacrifices some of the original GPT-4o's precision for significantly reduced latency. It accepts both text and image inputs.
OpenAI
GPT-4.1 brings significant improvements in code generation, instruction following, and complex reasoning. It's particularly good at software development tasks and maintains coherence over long contexts.
OpenAI
GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency. It has a very large context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider's polyglot diff benchmark) and vision understanding.
OpenAI
For tasks that demand low latency, GPT‑4.1 nano is the fastest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It's ideal for tasks like classification or autocompletion.
OpenAI
GPT-5 represents a significant leap in AI capability. It excels at professional tasks, complex problem-solving, and maintaining natural conversation. The different variants (Instant, Thinking, Pro) let you choose the right tool for your specific needs.
OpenAI
A lighter-weight GPT-5 variant optimized for speed while retaining strong reasoning and tool use.
OpenAI
An ultra-fast GPT-5 variant tuned for low-latency tasks with reasoning and tool use.
OpenAI
Building on GPT-5's foundation, 5.1 brings enhanced capabilities and better performance on complex tasks. The different variants let you choose the right tool for your specific needs.
OpenAI
GPT-5.2 achieves something special - it's both faster and smarter than its predecessors. It excels at specialized knowledge work while maintaining conversational warmth. The speed improvements make it practical for real-time applications.
OpenAI
GPT-5.2 achieves something special - it's both faster and smarter than its predecessors. It excels at specialized knowledge work while maintaining conversational warmth. The speed improvements make it practical for real-time applications.
OpenAI
GPT-5.3 delivers improved quality over GPT-5.2 while preserving low-latency responses. It is designed for fast chat workflows, tool use, and broad real-world utility.
OpenAI
GPT-5.4 is the fast variant tuned for low-latency chat and tool use. For search and conversational workloads, this non-reasoning variant maps to reasoning effort none.
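The mapping to reasoning effort `none` would look like this in an OpenAI-style request. A minimal sketch; the model slug is an assumption based on the card above, and `reasoning_effort` follows OpenAI's convention for reasoning models:

```python
def build_fast_chat_request(prompt: str) -> dict:
    """Payload for the low-latency path: reasoning effort pinned to "none".

    The model slug is an assumption; `reasoning_effort` follows OpenAI's
    convention for configuring reasoning depth on reasoning-capable models.
    """
    return {
        "model": "gpt-5.4",  # assumed slug
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": "none",  # skip deliberate reasoning for speed
    }

req = build_fast_chat_request("Summarize these meeting notes in two bullets.")
```

Pinning the effort to `none` keeps search and conversational workloads on the fast, non-reasoning path rather than paying for thinking tokens.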
OpenAI
A lightweight GPT-5.4 variant optimized for quick responses while keeping strong reasoning and tool-use performance for day-to-day workflows.
OpenAI
An ultra-fast GPT-5.4 variant for lightweight tasks, quick tool orchestration, and high-throughput chat experiences.
OpenAI
The o3 family represents OpenAI's focus on systematic reasoning. These models excel at mathematical problems, scientific analysis, and multi-step reasoning tasks. They're designed for when you need to think through problems methodically.
OpenAI
o4-mini proves that you don't need massive size for sophisticated reasoning. It delivers impressive analytical capabilities while maintaining reasonable speed and cost. Perfect for applications that need reasoning power without the overhead.
OpenAI
The o3 family represents OpenAI's focus on systematic reasoning. These models excel at mathematical problems, scientific analysis, and multi-step reasoning tasks. They're designed for when you need to think through problems methodically.
OpenAI
When you encounter problems that require deep, systematic analysis, o3 Pro is your best bet. It uses additional compute time to work through complex challenges methodically, often achieving results that surprise even AI researchers.
OpenAI
OpenAI's previous image generation model, packed with clever tech for tricky details like legible text and accurate reflections. This model generates 1 image per prompt.
OpenAI
OpenAI's latest and greatest image generation model, packed with clever tech for tricky details like legible text and accurate reflections. This model generates 1 image per prompt.
OpenRouter
GLM-4.5 is an open-weight MoE model that competes with o3 and Claude 4 while being smaller and stronger than DeepSeek-R1 and Kimi K2. It excels at coding and is trained with the Muon optimizer, the same one used to train Kimi K2.
OpenRouter
GLM-4.5V is the vision-capable variant of GLM-4.5, an open-weight MoE model that competes with o3 and Claude 4 while being smaller and stronger than DeepSeek-R1 and Kimi K2. It excels at coding and is trained with the Muon optimizer, the same one used to train Kimi K2.
OpenRouter
GLM-4.5-Air is the lightweight variant of GLM-4.5, an open-weight MoE model that competes with o3 and Claude 4 while being smaller and stronger than DeepSeek-R1 and Kimi K2. It excels at coding and is trained with the Muon optimizer, the same one used to train Kimi K2.
OpenRouter
Compared with GLM-4.5, this generation brings several key improvements. Longer context window: expanded from 128K to 200K tokens, enabling the model to handle more complex tasks. Superior coding performance: higher scores on code benchmarks and better real-world results in applications such as Claude Code, Cline, Roo Code, and Kilo Code, including more visually polished front-end pages. Advanced problem-solving: a clear improvement in analytical performance, with support for tool use during inference. Enhanced tool use: stronger performance in tool-using and search-based workflows, and more effective integration within frameworks. Refined writing: better alignment with human preferences in style and readability, and more natural performance in role-playing scenarios.
OpenRouter
GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It supports up to 128K tokens, processes complex page layouts and charts directly as visual inputs, and integrates native multimodal function calling to connect perception with downstream tool execution. The model also enables interleaved image-text generation and UI reconstruction workflows, including screenshot-to-HTML synthesis and iterative visual editing.
OpenRouter
GLM-4.7 is Z.AI's latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step execution. It demonstrates significant improvements in executing complex tasks while delivering more natural conversational experiences and superior front-end aesthetics.
OpenRouter
GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on large-scale programming tasks, rivaling leading closed-source models. With advanced agentic planning, deep backend reasoning, and iterative self-correction, GLM-5 moves beyond code generation to full-system construction and autonomous execution.
OpenRouter
GLM-5.1 delivers a major leap in coding capability, with especially strong performance on long-horizon engineering tasks. It is designed to plan, execute, and iteratively improve work over extended sessions, making it well-suited for autonomous coding workflows, complex systems work, and production-grade software delivery.
OpenRouter
GLM-5V Turbo is a lightweight multimodal member of the GLM-5 family optimized for low-latency visual understanding, document analysis, and screenshot-based assistant tasks. It is designed to pair strong image comprehension with practical tool usage for everyday chat and productivity workflows.
OpenRouter
xAI's flagship model that breaks records on lots of benchmarks (allegedly). Possesses deep domain knowledge in finance, healthcare, law, and science.
OpenRouter
xAI's flagship model that excels at data extraction, coding, and text summarization. Possesses deep domain knowledge in finance, healthcare, law, and science.
OpenRouter
A lightweight model that thinks before responding. Great for simple or logic-based tasks that do not require deep domain knowledge.
OpenRouter
Grok 4.20 is xAI's newest flagship model with a 200K-token context window, strong tool use, and multimodal support. It stays fast by default while still letting you enable deeper reasoning when you need more deliberate analysis.
OpenRouter
Grok 4 Fast is xAI's multimodal model with SOTA cost-efficiency and a 2M token context window.
OpenRouter
Grok 4.1 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token context window.
Powered by T3 Chat