ERNIE 4.5 VL 424B A47B

Verified

Baidu's multimodal model for integrated image and text processing.

Baidu · Multimodal · Closed
Vision
Updated 2026-06-15

About ERNIE 4.5 VL 424B A47B

This model belongs to Baidu's ERNIE series and combines vision and language modalities. It accepts both images and text as inputs and offers a 131,072-token context window. The architecture remains proprietary with no open weights available.

Its design emphasizes unified processing of visual and textual data for coherent outputs. The large context window enables handling of extended documents paired with images. Users apply it in scenarios requiring joint analysis of visual content and surrounding text.

Typical usage includes content generation that references both images and documents. It suits enterprise workflows where multimodal understanding adds value without public model access.

Capabilities

Vision-language understanding
Long-context reasoning
Image analysis and description
Multimodal instruction following
Cross-modal reasoning
Text generation

How ERNIE 4.5 VL 424B A47B compares

ERNIE 4.5 VL 424B A47B compared with other multimodal models on intelligence, speed and price.

Price

USD per 1M output tokens · Lower is better · ERNIE 4.5 VL 424B A47B ranks #37 of 124

Qwen3.5-35B-A3B · $1.0
Qwen3.6 35B A3B · $1.0
Qwen3.6 Flash · $1.1
Step 3.7 Flash · $1.1
MiniMax M3 · $1.2
GPT-5.4 Nano · $1.3
ERNIE 4.5 VL 424B A47B · $1.3
Qwen3.7 Plus · $1.3
Qwen3 VL 8B Thinking · $1.4
Gemini 3.1 Flash Lite · $1.5
Gemini 3.1 Flash Lite Preview · $1.5
Mistral Large 3 2512 · $1.5
Perceptron Mk1 · $1.5

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Long Visual Document Analysis

Processes inputs of up to 131K tokens (131,072) that combine text and images, such as detailed reports, charts, and diagrams, using cross-modal reasoning and long-context capabilities.

Image-Guided Instruction Tasks

Follows complex multimodal instructions to generate text descriptions or analyses from visual inputs in scenarios like product reviews or scene understanding.

Vision-Language Research Support

Handles cross-modal queries on extended contexts for scientific or technical materials that mix diagrams, equations, and explanatory text.

Strengths & limitations

Strengths

  • Strong native Chinese language support
  • Seamless image-text integration
  • Handles contexts of up to 131,072 tokens (128K)
  • Large-scale multimodal architecture

Limitations

  • Subject to Chinese content regulations
  • Limited transparency on training data
  • Primarily optimized for Chinese and English

Cost calculator

Estimate what ERNIE 4.5 VL 424B A47B would cost for your usage.

$0.00104
per request
$10.45
estimated / month

Based on ERNIE 4.5 VL 424B A47B's $0.42/1M input · $1.25/1M output. Estimate only — actual cost varies by provider and caching.
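The arithmetic behind an estimate like this can be sketched as follows. The two prices come from the line above. The usage figures (1,000 input tokens and 500 output tokens per request, 10,000 requests per month) are an assumption: the page does not state the calculator's inputs, but these values reproduce the displayed numbers, with the per-request figure apparently truncated to five decimals. The function names are hypothetical.

```javascript
// Prices quoted on this page, in USD per 1M tokens.
const PRICE_PER_M_INPUT = 0.42;
const PRICE_PER_M_OUTPUT = 1.25;

// Cost of one request, given its input and output token counts.
function costPerRequest(inputTokens, outputTokens) {
  return (inputTokens * PRICE_PER_M_INPUT + outputTokens * PRICE_PER_M_OUTPUT) / 1_000_000;
}

// Monthly cost for a fixed request shape and request volume.
function monthlyCost(inputTokens, outputTokens, requestsPerMonth) {
  return costPerRequest(inputTokens, outputTokens) * requestsPerMonth;
}

// Assumed usage (not stated on the page): 1,000 in, 500 out, 10,000 requests/month.
const perRequest = costPerRequest(1000, 500);
const perMonth = monthlyCost(1000, 500, 10000);

console.log(`per request: $${perRequest.toFixed(6)}`);
console.log(`per month:   $${perMonth.toFixed(2)}`);
```

As the page notes, this is an estimate only: provider-specific pricing and caching are not modeled here.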

Quick start

OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.

JavaScript · openai
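import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter instead of the default OpenAI endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY, // read the key from the environment
});

// Send a single text-only user message to the model.
const completion = await client.chat.completions.create({
  model: "baidu/ernie-4.5-vl-424b-a47b",
  messages: [{ role: "user", content: "Hello!" }],
});

// Print the text of the first returned choice.
console.log(completion.choices[0].message.content);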

Model slug: baidu/ernie-4.5-vl-424b-a47b
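Since this is a vision model, a request will often carry an image alongside the text. The sketch below builds such a request body using the OpenAI-style content-parts format (a `text` part plus an `image_url` part). It assumes the OpenRouter endpoint accepts that format for this model slug, which the page does not confirm. `buildVisionRequest` is a hypothetical helper and the image URL is a placeholder.

```javascript
// Hypothetical helper: builds a chat-completions request body that pairs
// a text prompt with one image, using OpenAI-style content parts.
function buildVisionRequest(imageUrl, prompt) {
  return {
    model: "baidu/ernie-4.5-vl-424b-a47b",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}

// Placeholder URL, for illustration only.
const request = buildVisionRequest(
  "https://example.com/chart.png",
  "Describe this chart."
);

// With a client configured as in the quick start, the body could be sent as:
//   const completion = await client.chat.completions.create(request);
console.log(JSON.stringify(request, null, 2));
```

Keeping the request construction in a plain function makes it easy to inspect or log the payload before any network call is made.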

Editor's verdict

Our take on ERNIE 4.5 VL 424B A47B

ERNIE 4.5 VL 424B A47B is Baidu's proprietary multimodal model with a 131K-token context window.

At $1.25 per 1M output tokens, it is mid-priced for its class.

It is available through Baidu's API and aggregators like OpenRouter.

Best suited to workloads that need strong native Chinese language support and seamless image-text integration.


Frequently asked questions

The model supports a context length of 131072 tokens for handling extended multimodal inputs.
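The 131,072-token figure equals 128 × 1,024, which is why the same limit is sometimes written as 128K and sometimes as 131K. A minimal budget check could look like the sketch below. It assumes the prompt and the reply share one window, which is common but provider-specific, and it takes the prompt's token count as given, since how text and images are tokenized is not described on this page. `remainingOutputBudget` is a hypothetical name.

```javascript
// Context length stated on this page: 131,072 tokens (128 * 1024).
const CONTEXT_WINDOW = 131072;

// Tokens left for the reply after the prompt, assuming prompt and reply
// share the same window. Returns 0 if the prompt already exceeds it.
function remainingOutputBudget(promptTokens, contextWindow = CONTEXT_WINDOW) {
  if (!Number.isInteger(promptTokens) || promptTokens < 0) {
    throw new RangeError("promptTokens must be a non-negative integer");
  }
  return Math.max(0, contextWindow - promptTokens);
}

console.log(remainingOutputBudget(100000)); // 31072
```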
