Mistral Medium 3
Proprietary multimodal model for text, image, and file tasks.
About Mistral Medium 3
Mistral Medium 3 uses a multimodal design that integrates text, visual, and file processing in a single model. Its 131,072-token context window accommodates extended inputs such as lengthy documents paired with images. The model is proprietary and is served through Mistral's API and aggregators such as OpenRouter.
Its key strength is handling mixed-modality queries within one model. It maintains coherence across long conversations or multi-page files that include visual elements. This combination makes it suitable for tasks requiring both language understanding and image interpretation.
Common applications include document summarization with charts, visual question answering, and automated analysis of mixed media collections. Teams deploy it in research prototypes or production pipelines where transparency and customization matter. Its architecture supports fine-tuning on domain-specific multimodal datasets.
How Mistral Medium 3 compares
[Chart: Mistral Medium 3 compared with other multimodal models on intelligence, speed, and price. Price is shown in USD per 1M output tokens (lower is better); Mistral Medium 3 ranks #61 of 139.]
Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).
Best for
Long document analysis with embedded images
The model processes files and documents of up to 131,072 tokens that contain both text and visuals, supporting detailed reasoning across the combined inputs.
Vision-language reasoning tasks
It performs multimodal understanding by interpreting images alongside text for tasks such as scene description and visual question answering within extended contexts.
Complex instruction following over large inputs
Users can apply the model for general text generation and instruction following that incorporates long textual sequences with image references.
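For long inputs, it can help to check whether a document is likely to fit in the 131,072-token window before sending it. The sketch below is a rough pre-flight check, not an exact count. It assumes about 4 characters per token for English text and that the reserved output tokens share the same window as the prompt. Both are assumptions, and the helper names `estimateTokens` and `fitsContext` are hypothetical. Image inputs are not counted here.

```typescript
// Context window stated on this page: 131,072 tokens (128 x 1,024).
const CONTEXT_WINDOW_TOKENS = 131_072;

// Assumption: roughly 4 characters per token for English text.
// Real counts depend on the tokenizer; images are counted differently.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Hypothetical helper: crude token estimate from character length.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// Returns true when the estimated prompt plus the reserved output budget
// stays within the context window.
function fitsContext(promptText: string, reservedOutputTokens: number): boolean {
  return estimateTokens(promptText) + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS;
}

const shortDoc = "x".repeat(40_000); // about 10,000 estimated tokens
const longDoc = "x".repeat(600_000); // about 150,000 estimated tokens

console.log(fitsContext(shortDoc, 2_000)); // true
console.log(fitsContext(longDoc, 2_000)); // false
```

For exact budgeting, a real tokenizer count should replace the character heuristic.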
Strengths & limitations
Strengths
- Native multimodal integration
- Large 128k context window
- Flexible input modalities including files
Limitations
- Medium-tier model may trail larger flagships on complex reasoning
- Performance depends on prompt quality for edge cases
Cost calculator
Estimate what Mistral Medium 3 would cost for your usage.
Based on Mistral Medium 3's pricing of $0.40 per 1M input tokens and $2.00 per 1M output tokens. This is an estimate only; actual cost varies by provider and caching.
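The same estimate can be reproduced by hand. The sketch below applies the two listed prices linearly to token counts; it ignores caching and provider-specific pricing, as noted above. The helper name `estimateCostUsd` and the example workload are illustrative assumptions.

```typescript
// Prices quoted on this page, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.4;
const OUTPUT_USD_PER_MILLION = 2.0;

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical helper: linear cost, with no caching discounts or markup.
function estimateCostUsd(usage: Usage): number {
  const inputCost = (usage.inputTokens / 1_000_000) * INPUT_USD_PER_MILLION;
  const outputCost = (usage.outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Made-up workload: 1,000 requests, each with 10,000 input and 500 output tokens.
const workloadCost = estimateCostUsd({
  inputTokens: 1_000 * 10_000,
  outputTokens: 1_000 * 500,
});

console.log(workloadCost.toFixed(2)); // "5.00" (4.00 input + 1.00 output)
```

In this example, input dominates the bill even though output tokens cost five times more each, because the prompt volume is twenty times the output volume.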
Quick start
OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.
import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Top-level await requires an ES module context.
const completion = await client.chat.completions.create({
  model: "mistralai/mistral-medium-3",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);

Model slug: mistralai/mistral-medium-3
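The quick start above sends text only. For an image-plus-text query, OpenAI-compatible chat APIs accept a `content` array that mixes `text` and `image_url` parts. The sketch below only builds such a message; the helper name `buildVisionMessage` and the example URL are hypothetical, and whether a given provider route accepts image input for this model is an assumption worth checking.

```typescript
// Content-part shapes used by OpenAI-compatible chat completion APIs.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface UserMessage {
  role: "user";
  content: ContentPart[];
}

// Hypothetical helper: pairs one question with one image URL.
function buildVisionMessage(question: string, imageUrl: string): UserMessage {
  return {
    role: "user",
    content: [
      { type: "text", text: question },
      { type: "image_url", image_url: { url: imageUrl } },
    ],
  };
}

// Placeholder URL for illustration only.
const visionMessage = buildVisionMessage(
  "What trend does this chart show?",
  "https://example.com/chart.png",
);

console.log(JSON.stringify(visionMessage, null, 2));
```

To send it, pass `messages: [visionMessage]` to the same `client.chat.completions.create` call shown in the quick start, keeping the model slug unchanged.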
Editor's verdict
Mistral Medium 3 is Mistral's proprietary multimodal model with a 131K-token context window.
At $2.00 per 1M output tokens, it is mid-priced for its class.
It is available through Mistral's API and aggregators like OpenRouter.
Best suited to workloads that need native multimodal input and a large 128K (131,072-token) context window.
Frequently asked questions
What is the context window of Mistral Medium 3?
The model provides a context window of 131,072 tokens.