Does Mistral Small 4 support image inputs?

Yes, it provides vision-language understanding, image analysis, and multimodal input processing.

How do I access Mistral Small 4?

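Mistral offers the model through its official API and platform. It is also routed via OpenRouter under the slug mistralai/mistral-small-2603, and because the weights are open it can be self-hosted.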

What is the pricing for Mistral Small 4?

Routed via OpenRouter, Mistral charges $0.15 per 1M input tokens and $0.60 per 1M output tokens. Current pricing details and usage tiers are listed on the Mistral AI website.

What tasks is Mistral Small 4 best suited for?

It excels at long-context reasoning, extended text generation, and cross-modal instruction following involving both text and images.

Mistral Small 4

Verified

Open-weight multimodal model for long-context text and image tasks.

Mistral · Multimodal · Open

Function calling · JSON mode · Structured outputs · Reasoning · Vision


Updated 2026-06-14

About Mistral Small 4

Mistral Small 4 uses a multimodal architecture that integrates text and vision capabilities. Its 262144-token context window enables processing of lengthy documents paired with images. The model is released as open weights, allowing broad access and customization.

Its primary strengths lie in handling extended multimodal sequences without truncation. This design supports coherent reasoning across large amounts of combined textual and visual data. Users benefit from its open-weight availability for local deployment and fine-tuning.

Typical usage includes analyzing long reports with embedded visuals, processing image-rich conversations, and building applications that require sustained context across modalities. Developers often integrate it into systems needing both vision and language understanding over extended inputs.

Capabilities

Long-context reasoning

Vision-language understanding

Multimodal input processing

Extended context text generation

Image analysis and description

Cross-modal instruction following

How Mistral Small 4 compares

Mistral Small 4 vs. other multimodal models on intelligence, speed and price.

Price

USD per 1M output tokens · Lower is better · Mistral Small 4 ranks #10 of 63

Model	Output /1M
MiMo-V2.5	$0.28
Seed 1.6 Flash	$0.30
Voxtral Small 24B 2507	$0.30
Gemini 2.5 Flash Lite Preview 09-2025	$0.40
Seed-2.0-Mini	$0.40
Qwen3 VL 32B Instruct	$0.42
Mistral Small 4	$0.60
Qwen3 VL 235B A22B Instruct	$0.88
GLM 4.6V	$0.90
Qwen3.6 35B A3B	$0.97
Qwen3.6 Flash	$1.1
Step 3.7 Flash	$1.1
MiniMax M3	$1.2

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Long-document image analysis

The model processes images embedded within extremely long texts, delivering detailed descriptions and insights while maintaining coherence across 262144 tokens.

Extended multimodal reasoning

It handles complex cross-modal instructions that combine vision and language inputs over large contexts, supporting tasks like summarizing illustrated reports or technical manuals.

Vision-language content generation

Users can provide images and lengthy textual prompts to generate extended, contextually accurate responses that integrate visual understanding with detailed text output.

Strengths & limitations

Strengths

+Very large 256k token context window
+Native text and image support
+Efficient multimodal architecture

Limitations

–Small model size may limit depth on complex tasks
–Supports only text and image modalities
–No audio or video capabilities

Pricing by provider

Live per-provider pricing & uptime, routed via OpenRouter. Prices are USD per 1M tokens.

Provider	Input /1M	Output /1M	Context	Uptime
Mistral	$0.15	$0.60	262K	100.0%
Venice(fp8)	$0.19	$0.75	256K	100.0%

Cost calculator

Estimate what Mistral Small 4 would cost for your usage.

Inputs: input tokens / request · output tokens / request · requests / month

$0.00045 per request

$4.5 estimated / month

Based on Mistral Small 4's $0.15/1M input · $0.60/1M output. Estimate only — actual cost varies by provider and caching.
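The arithmetic behind an estimate like this can be sketched in a few lines. The prices below come from the provider table on this page; the function name `estimateCost` and the workload (1,000 input tokens, 500 output tokens, 10,000 requests per month) are assumptions chosen for illustration. That workload happens to produce the same figures as the estimate shown above, but the page does not state which inputs it used.

```javascript
// Per-1M-token prices in USD, copied from the provider table on this page.
const PRICES = {
  "Mistral": { input: 0.15, output: 0.60 },
  "Venice(fp8)": { input: 0.19, output: 0.75 },
};

// Hypothetical helper: estimated cost per request and per month, in USD.
function estimateCost({ inputTokens, outputTokens, requestsPerMonth, provider = "Mistral" }) {
  const price = PRICES[provider];
  if (!price) throw new Error(`Unknown provider: ${provider}`);
  const perRequest =
    (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
  return { perRequest, perMonth: perRequest * requestsPerMonth };
}

// Assumed workload, not taken from the page.
const estimate = estimateCost({
  inputTokens: 1000,
  outputTokens: 500,
  requestsPerMonth: 10000,
});
console.log(estimate.perRequest.toFixed(5), estimate.perMonth.toFixed(2));
```

Passing `provider: "Venice(fp8)"` gives the same calculation at that provider's rates. As the page notes, actual cost also varies with caching, which this sketch ignores.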

Quick start

OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.

JavaScript · openai

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "mistralai/mistral-small-2603",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);

Model slug: mistralai/mistral-small-2603
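Since the model accepts image inputs, a request can pair text with an image. The following is a minimal sketch, assuming the OpenAI-style content-parts message format (`type: "text"` and `type: "image_url"`) is accepted for this model on OpenRouter. The helper names `buildVisionMessages` and `describeImage` are hypothetical, and the image URL is supplied by the caller.

```javascript
// Model slug as listed on this page.
const MODEL = "mistralai/mistral-small-2603";

// Hypothetical helper: build one user message that combines a text prompt
// with a single image, using OpenAI-style content parts.
function buildVisionMessages(prompt, imageUrl) {
  return [
    {
      role: "user",
      content: [
        { type: "text", text: prompt },
        { type: "image_url", image_url: { url: imageUrl } },
      ],
    },
  ];
}

// Hypothetical helper: `client` is expected to be an OpenAI SDK client
// configured with the OpenRouter base URL, as in the quick start.
async function describeImage(client, prompt, imageUrl) {
  const completion = await client.chat.completions.create({
    model: MODEL,
    messages: buildVisionMessages(prompt, imageUrl),
  });
  return completion.choices[0].message.content;
}
```

With the `client` from the quick start, a call would look like `await describeImage(client, "What does this chart show?", imageUrl)`, where `imageUrl` points at a publicly reachable image.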

Editor's verdict

Our take on Mistral Small 4

Mistral Small 4 is Mistral's open-weight multimodal model with a 262K-token context window.

At $0.60 per 1M output tokens, it ranks #10 of 63 multimodal models on output price and is served by 2 providers.

As an open-weight model, you can self-host it or call it through a hosted API.

It is best suited to workloads that need its very large 256K-token context window and native text and image support.


Frequently asked questions

What is the context window of Mistral Small 4?

The model supports a context window of 262,144 tokens for handling extended inputs and outputs.
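A simple budget check can show how much room a long prompt leaves for the response. This is a sketch under two assumptions: that the prompt and the generated output share the same 262,144-token window, and that the caller already knows the prompt's token count (this page does not say how images are counted toward it). The function name `remainingOutputBudget` is hypothetical.

```javascript
// Context window in tokens, as stated on this page.
const CONTEXT_WINDOW = 262144;

// Hypothetical helper: tokens left for the response once the prompt
// (and any tokens the caller wants to hold back) are accounted for.
function remainingOutputBudget(promptTokens, reservedTokens = 0) {
  const remaining = CONTEXT_WINDOW - promptTokens - reservedTokens;
  if (remaining <= 0) {
    throw new RangeError(
      `Prompt of ${promptTokens} tokens leaves no room in a ${CONTEXT_WINDOW}-token window`
    );
  }
  return remaining;
}

console.log(remainingOutputBudget(200000)); // 62144
```

The result is a natural upper bound for a request's maximum output length, provided the assumptions above hold for the provider being used.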


Other Mistral models

Sibling models in the Mistral family.

Mistral Medium 3.5

Mistral · Multimodal

Verified

Mistral's closed multimodal model for long-context text, image, and file tasks.

Closed · II 39.2 · 262K ctx · $7.50/1M out
