MiniMax-01
Processes over one million tokens of text and images in a single context.
About MiniMax-01
MiniMax-01 is built around a multimodal architecture that jointly processes text and images. Its context window of 1,000,192 tokens allows entire documents or image collections to remain in view during inference. The design prioritizes coherence across extended multimodal sequences without requiring external retrieval systems.
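To make the size of that window concrete, the sketch below estimates whether a batch of text and images is likely to fit. It is only a rough illustration: the 4-characters-per-token ratio, the per-image token cost, and the reserved output budget are all assumptions chosen for the example, not published figures for MiniMax-01, and the function names are hypothetical.

```javascript
// Context window stated on this page for MiniMax-01.
const CONTEXT_WINDOW_TOKENS = 1000192;

// ASSUMPTION: about 4 characters per token. Real tokenizers vary by language and content.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Hypothetical helper: rough token estimate for one string.
function estimateTextTokens(text) {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// Hypothetical helper: estimates total tokens and reports whether they fit in the window.
// ASSUMPTION: assumedTokensPerImage and reservedOutputTokens are placeholder values.
function fitsInContext(
  texts,
  { imageCount = 0, assumedTokensPerImage = 1000, reservedOutputTokens = 4096 } = {}
) {
  const textTokens = texts.reduce((sum, t) => sum + estimateTextTokens(t), 0);
  const imageTokens = imageCount * assumedTokensPerImage;
  const estimatedTokens = textTokens + imageTokens + reservedOutputTokens;
  return { estimatedTokens, fits: estimatedTokens <= CONTEXT_WINDOW_TOKENS };
}
```

For an accurate count, rely on the token usage reported by the provider rather than a character heuristic like this one.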
Because the weights are not publicly released, MiniMax retains full control over training data, safety filters, and deployment. This closed approach supports consistent performance on tasks that combine visual understanding with long-range textual reasoning. Typical uses include analyzing lengthy illustrated reports, generating stories from image sequences, and maintaining context across multi-turn visual conversations.
How MiniMax-01 compares
[Chart: MiniMax-01 compared with other multimodal models on intelligence, speed, and price.]
Price: USD per 1M output tokens (lower is better). MiniMax-01 ranks #42 of 155.
Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).
Best for
Extended multimodal document review
MiniMax-01 processes combined text and images across its full context length, supporting detailed analysis of lengthy reports or presentations.
Long-sequence visual reasoning
The model maintains image-text alignment while handling over one million tokens, enabling coherent interpretation of visual narratives spread across many pages.
Large-scale context retention projects
It performs long-context reasoning on massive multimodal inputs, making it suitable for tasks that require tracking details throughout extensive datasets or archives.
Strengths & limitations
Strengths
- Exceptional handling of lengthy inputs
- Seamless integration of vision and language
- Strong coherence across extended contexts
- Versatile for complex multimodal tasks
Limitations
- Limited to text and static image modalities
- High computational demands at maximum context
- Slower inference with very long inputs
Cost calculator
Estimate what MiniMax-01 would cost for your usage.
Estimates are based on MiniMax-01's pricing of $0.20 per 1M input tokens and $1.10 per 1M output tokens. Actual cost varies by provider and caching.
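The arithmetic behind such an estimate can be sketched as follows. The two rates are the ones quoted on this page; the function name and the example token counts are hypothetical, and the sketch ignores caching discounts and provider-specific pricing.

```javascript
// Rates quoted on this page (USD per 1M tokens).
const INPUT_USD_PER_MILLION = 0.20;
const OUTPUT_USD_PER_MILLION = 1.10;

// Hypothetical helper: estimated cost in USD for one request or a whole workload.
function estimateCostUsd(inputTokens, outputTokens) {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Example: a request that fills most of the context window and returns a short answer.
const exampleCost = estimateCostUsd(900_000, 2_000);
console.log(exampleCost.toFixed(4)); // prints "0.1822"
```

At these rates a near-full-context prompt costs roughly 18 cents in input tokens alone, so input volume rather than output length tends to dominate the bill for long-context work.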
Quick start
OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.
import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Send a single user message to MiniMax-01.
const completion = await client.chat.completions.create({
  model: "minimax/minimax-01",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);

Model slug: minimax/minimax-01
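Since the model accepts images as well as text, a request that pairs a prompt with one or more images might look like the sketch below. It assumes the provider accepts OpenAI-style `image_url` content parts for this model, which is worth confirming in the provider's documentation. The helper name and the example URL are hypothetical placeholders.

```javascript
// Hypothetical helper: builds an OpenAI-style chat request that pairs a text prompt with images.
// ASSUMPTION: the provider accepts "image_url" content parts for minimax/minimax-01.
function buildImageRequest(prompt, imageUrls) {
  return {
    model: "minimax/minimax-01",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          // One content part per image, in the order given.
          ...imageUrls.map((url) => ({ type: "image_url", image_url: { url } })),
        ],
      },
    ],
  };
}

// Placeholder URL for illustration only.
const imageRequest = buildImageRequest("Summarize the chart on this page.", [
  "https://example.com/report-page-1.png",
]);
```

The resulting object can be passed to `client.chat.completions.create(imageRequest)` using a client configured as in the quick start.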
Editor's verdict
MiniMax-01 is MiniMax's proprietary multimodal model with a context window of roughly 1M tokens (1,000,192).
At $1.10 per 1M output tokens, it is mid-priced for its class.
It is available through MiniMax's API and aggregators like OpenRouter.
It is best suited to workloads that involve very long inputs and need vision and language handled together.
Frequently asked questions
MiniMax-01 provides a context window of 1,000,192 tokens.