Grok 4.20 Multi-Agent
Verified · Multi-agent multimodal model for massive-context tasks
About Grok 4.20 Multi-Agent
The architecture combines multiple specialized agents that divide subtasks while sharing the full multimodal context. Vision and language pathways operate together to interpret documents containing both text and embedded images. The 2-million-token window supports ingestion of very large file collections without truncation.
Strengths center on maintaining coherence across extended multimodal sequences and routing subtasks among agents for structured outputs. Because the weights are closed, the model cannot be run locally: access is only through xAI-hosted inference, which trades self-hosting for a managed production service.
Common applications include analyzing lengthy research reports that mix text, diagrams, and data files. Teams use it for iterative multi-step planning where agents refine answers across rounds. Integration typically occurs in pipelines that require both high-capacity context and coordinated reasoning steps.
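Before sending a large file collection, it can help to check roughly whether it fits in the 2-million-token window. The sketch below is only an approximation: the 4-characters-per-token ratio, the `reservedForOutput` default, and the names `SourceFile`, `estimateTokens`, and `fitsInContext` are illustrative assumptions, not part of any xAI or OpenRouter API. It also assumes that output tokens count against the same window.

```typescript
// Context window size stated on this page for Grok 4.20 Multi-Agent.
const CONTEXT_WINDOW_TOKENS = 2_000_000;

// Assumption: roughly 4 characters per token. This is a crude heuristic for
// English text, not the model's real tokenizer.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Hypothetical shape for a file that has already been read into memory.
interface SourceFile {
  name: string;
  text: string;
}

// Rough token estimate for a piece of text under the heuristic above.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// Checks whether a collection of files is likely to fit in the window while
// leaving `reservedForOutput` tokens free for the response (assumed default).
function fitsInContext(
  files: SourceFile[],
  reservedForOutput: number = 8_000,
): { estimatedTokens: number; fits: boolean } {
  const estimatedTokens = files.reduce(
    (sum, file) => sum + estimateTokens(file.text),
    0,
  );
  return {
    estimatedTokens,
    fits: estimatedTokens + reservedForOutput <= CONTEXT_WINDOW_TOKENS,
  };
}
```

For real budgeting, a tokenizer-based count would be more reliable than a character ratio, and image inputs would need their own accounting.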
How Grok 4.20 Multi-Agent compares
Comparison chart: Grok 4.20 Multi-Agent vs. other multimodal models on intelligence, speed, and price.
Price
USD per 1M output tokens · Lower is better · Grok 4.20 Multi-Agent ranks #58 of 97
Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).
Best for
Long multimodal document analysis
The model processes combined text and visual inputs across millions of tokens, enabling synthesis of extensive reports, research papers, and image-rich datasets in one pass.
Multi-agent workflow orchestration
It manages coordinated interactions among multiple specialized agents while retaining full conversation history and shared multimodal context.
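On the client side, retaining conversation history usually means resending the full message list on every round. The following is a minimal sketch of that pattern, not a description of how the model coordinates its agents internally. The `Complete` callback and `refineOverRounds` are hypothetical names; in practice `Complete` would wrap a chat-completions call.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical callback: takes the full history so far and returns the reply.
type Complete = (history: ChatMessage[]) => Promise<string>;

// Runs an initial task plus a series of follow-up prompts, sending the whole
// accumulated history on each round so earlier answers stay in context.
async function refineOverRounds(
  complete: Complete,
  task: string,
  followUps: string[],
): Promise<ChatMessage[]> {
  const history: ChatMessage[] = [{ role: "user", content: task }];
  // Pass a copy so the callback cannot mutate our running history.
  history.push({ role: "assistant", content: await complete([...history]) });

  for (const followUp of followUps) {
    history.push({ role: "user", content: followUp });
    history.push({ role: "assistant", content: await complete([...history]) });
  }
  return history;
}
```

Injecting the completion function keeps the loop testable without network access and independent of any particular SDK.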
Extended video and image reasoning
Users can feed frames extracted from hours of video, or large image collections, and receive coherent analysis that references details from the entire sequence. Frames are supplied as images, since the model has no native video modality.
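A text question and a set of images can be combined into one user message using the content-part layout of OpenAI-compatible chat APIs. This is a hedged sketch: `buildImageQuestion` is a hypothetical helper, the example URLs are placeholders, and it assumes this model accepts `image_url` parts through the endpoint you call.

```typescript
// Content parts in the OpenAI-compatible chat format.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface UserMessage {
  role: "user";
  content: ContentPart[];
}

// Builds one user message: the question first, then one part per image.
function buildImageQuestion(question: string, imageUrls: string[]): UserMessage {
  return {
    role: "user",
    content: [
      { type: "text", text: question },
      ...imageUrls.map(
        (url): ContentPart => ({ type: "image_url", image_url: { url } }),
      ),
    ],
  };
}

// Placeholder frame URLs, for illustration only.
const message = buildImageQuestion("What changes between these frames?", [
  "https://example.com/frame-001.jpg",
  "https://example.com/frame-002.jpg",
]);
```

The resulting object can be passed as an entry in the `messages` array of a chat-completions request.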
Strengths & limitations
Strengths
- Supports extremely long contexts
- Coordinates multiple agents for workflows
- Handles text, images, and files natively
Limitations
- Multi-agent setups may add latency
- Coordination overhead on simple tasks
- No audio or video modalities
Pricing by provider
Live per-provider pricing & uptime, routed via OpenRouter. Prices are USD per 1M tokens.
| Provider | Input /1M | Output /1M | Context | Uptime |
|---|---|---|---|---|
| xAI | $2.00 | $6.00 | 2000K | — |
Cost calculator
Estimate what Grok 4.20 Multi-Agent would cost for your usage.
Based on Grok 4.20 Multi-Agent's $2.00/1M input · $6.00/1M output. Estimate only — actual cost varies by provider and caching.
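The estimate is simple arithmetic on the listed rates. As a rough sketch, using the $2.00 and $6.00 per-1M prices from this page and ignoring caching or provider differences (the function and constant names are illustrative):

```typescript
// Listed prices on this page, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 2.0;
const OUTPUT_USD_PER_MILLION = 6.0;

// Estimated cost in USD for a given number of input and output tokens.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens * INPUT_USD_PER_MILLION +
      outputTokens * OUTPUT_USD_PER_MILLION) /
    1_000_000
  );
}

// One request that fills the 2,000,000-token window and returns 2,000 tokens:
// (2,000,000 * $2.00 + 2,000 * $6.00) / 1,000,000 = $4.012
console.log(estimateCostUsd(2_000_000, 2_000).toFixed(3)); // prints "4.012"
```

At these rates, input dominates the bill for long-context requests: filling the window once costs about $4 before any output is generated.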
Quick start
OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "x-ai/grok-4.20-multi-agent",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);
```

Model slug: x-ai/grok-4.20-multi-agent
Editor's verdict
Grok 4.20 Multi-Agent is xAI's proprietary multimodal model with a 2,000K-token (2-million-token) context window.
At $6.00 per 1M output tokens, it sits in the pricier half of its class (#58 of 97 on output price) and is served by a single provider.
It is available through xAI's API and aggregators like OpenRouter.
It is best suited to workloads that need extremely long contexts and coordination of multiple agents.
Frequently asked questions
The model provides a context window of 2,000,000 tokens.