What is the pricing for GLM 4.5V?

OpenRouter lists GLM 4.5V at $0.60 per 1M input tokens and $1.80 per 1M output tokens. Actual cost varies by provider and caching.

How can users access GLM 4.5V?

GLM 4.5V is available through Z.AI's API and through aggregators such as OpenRouter, where its model slug is z-ai/glm-4.5v.

What are the main use cases for GLM 4.5V?

Primary uses include multimodal text and image understanding, visual question answering, image analysis, and document comprehension with visuals.

GLM 4.5V by Z.AI — Specs, Pricing, Benchmarks (2026)

About GLM 4.5V

GLM 4.5V features a context window of 65536 tokens. This supports extended multimodal inputs including detailed image analysis alongside text. The model is closed-weight with no public parameter count disclosed.

Its design emphasizes joint handling of textual and visual modalities. Its strength is coherent responses across combined data types without requiring separate models. Access is through Z.AI's API and aggregators such as OpenRouter.

Typical usage covers image captioning, visual question answering, and mixed-media document processing. Developers deploy it in systems needing unified text-image understanding. Workflows often include professional analysis and content generation tasks.

Capabilities

Multimodal text and image understanding

Long-context reasoning

Visual question answering

Image analysis and interpretation

Cross-modal reasoning

Document comprehension with visuals

How GLM 4.5V compares

GLM 4.5V versus other multimodal models on intelligence, speed, and price.

Price

USD per 1M output tokens (lower is better). GLM 4.5V ranks #32 of 97.

Gemini 3.1 Flash Lite Preview: $1.5
Gemini 3.1 Flash Lite: $1.5
Perceptron Mk1: $1.5
Qwen3.5-27B: $1.6
Qwen3 VL 30B A3B Thinking: $1.6
Qwen3.5 Plus 2026-04-20: $1.8
GLM 4.5V: $1.8
Qwen3.6 Plus: $1.9
GPT-5 Mini: $2.0
GPT-5.1-Codex-Mini: $2.0
Devstral 2 2512: $2.0
Grok Build 0.1: $2.0
Seed 1.6: $2.0

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Visual Question Answering

The model is well suited to answering text questions about image inputs, drawing on its combined text and image understanding.

Document Analysis with Visuals

It performs well on tasks involving document comprehension that include charts, diagrams, and other visual elements alongside text.

Extended Multimodal Reasoning

The model supports long-context reasoning across text and images within its 65536-token context window for cross-modal tasks.
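A minimal sketch of a context budget check against the 65536-token window stated above. The function names are hypothetical, and the sketch assumes that prompt tokens (text and images) and reply tokens share the same window. It does not count tokens itself; the prompt token count has to come from a tokenizer or from the provider's usage data.

```javascript
// Context window stated on this page for GLM 4.5V.
const CONTEXT_WINDOW = 65536;

// Returns how many tokens remain for the reply once the prompt is counted.
// promptTokens must be measured elsewhere; this sketch does not estimate it.
function remainingTokens(promptTokens, contextWindow = CONTEXT_WINDOW) {
  if (!Number.isInteger(promptTokens) || promptTokens < 0) {
    throw new RangeError("promptTokens must be a non-negative integer");
  }
  return Math.max(0, contextWindow - promptTokens);
}

// True when the prompt plus the reserved reply budget fits in the window.
// Assumption: input and output tokens draw from one shared window.
function fitsContext(promptTokens, reservedOutputTokens, contextWindow = CONTEXT_WINDOW) {
  return promptTokens + reservedOutputTokens <= contextWindow;
}

console.log(remainingTokens(60000));   // 5536
console.log(fitsContext(60000, 4096)); // true
console.log(fitsContext(60000, 8192)); // false
```

A check like this is most useful before sending long documents with many images, where the prompt alone can approach the limit.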

Strengths & limitations

Strengths

+ Strong vision-language integration
+ Handles extended 64K-token contexts
+ Effective for real-world image+text tasks
+ Flexible multimodal input processing

Limitations

- Limited to text and image modalities
- Can struggle with highly complex or ambiguous visuals
- Vision performance depends on image quality and clarity

Cost calculator

Estimate what GLM 4.5V would cost for your usage.

Inputs: input tokens per request, output tokens per request, requests per month.

Estimate: $0.00150 per request · $15 per month

Based on GLM 4.5V's $0.60/1M input · $1.80/1M output. Estimate only — actual cost varies by provider and caching.
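The arithmetic behind such an estimate can be sketched as follows, using the listed $0.60 and $1.80 per 1M token prices. The function name and the sample inputs (1,000 input tokens, 500 output tokens, 10,000 requests per month) are hypothetical; they were chosen because they happen to reproduce the $0.00150 and $15 figures, not because the page states them. Caching and provider differences are ignored.

```javascript
// Listed prices in USD per 1M tokens (from this page).
const PRICE_PER_MILLION = { input: 0.60, output: 1.80 };

// Estimates cost per request and per month from token counts and volume.
function estimateCost({ inputTokens, outputTokens, requestsPerMonth }, price = PRICE_PER_MILLION) {
  const perRequest =
    (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
  return { perRequest, perMonth: perRequest * requestsPerMonth };
}

// Hypothetical usage profile, not taken from the page.
const example = estimateCost({
  inputTokens: 1000,
  outputTokens: 500,
  requestsPerMonth: 10000,
});

console.log(example.perRequest.toFixed(5)); // "0.00150"
console.log(example.perMonth.toFixed(2));   // "15.00"
```

Because output tokens cost three times as much as input tokens here, long replies dominate the bill sooner than long prompts do.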

Quick start

OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.

JavaScript · openai

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "z-ai/glm-4.5v",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);

Model slug: z-ai/glm-4.5v
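The quick start sends text only. For visual question answering, a request also needs an image. The sketch below builds a message in the OpenAI chat format, where `content` is an array of `text` and `image_url` parts. It assumes OpenRouter accepts that format for this model; the helper name `buildImageQuestion` and the example URL are placeholders.

```javascript
// Builds OpenAI-style chat messages that pair one question with one image.
// Assumption: the provider accepts `image_url` content parts for this model.
function buildImageQuestion(question, imageUrl) {
  return [
    {
      role: "user",
      content: [
        { type: "text", text: question },
        { type: "image_url", image_url: { url: imageUrl } },
      ],
    },
  ];
}

const messages = buildImageQuestion(
  "What does this chart show?",
  "https://example.com/chart.png" // placeholder URL
);

// Pass the result to the client from the quick start, for example:
// const completion = await client.chat.completions.create({
//   model: "z-ai/glm-4.5v",
//   messages,
// });
console.log(JSON.stringify(messages, null, 2));
```

Keeping message construction in a small pure function makes it easy to check the payload shape before spending tokens on a real request.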

Editor's verdict

Our take on GLM 4.5V

GLM 4.5V is Z.AI's proprietary multimodal model with a 65,536-token (64K) context window.

At $1.80 per 1M output tokens, it is mid-priced for its class.

It is available through Z.AI's API and aggregators like OpenRouter.

It is best suited to tasks that need strong vision-language integration and contexts of up to 64K tokens.

Frequently asked questions

What is the context length supported by GLM 4.5V?

The model provides a context window of 65536 tokens.

Other multimodal models worth comparing

Gemini 2.5 Flash Lite Preview 09-2025

Google · Multimodal

Verified

Fast multimodal model for efficient text, image, audio, and video tasks.

Closed · 1049K ctx · $0.40/1M out

GPT-5.5 Pro

OpenAI · Multimodal

Verified

Multimodal model handling over a million tokens of context.

Closed · 1050K ctx · $180.00/1M out

Claude Sonnet 4.5

Anthropic · Multimodal

Verified

Anthropic's Claude Sonnet 4.5 excels at large-scale multimodal reasoning and analysis.

Closed · 1000K ctx · $15.00/1M out

Other GLM models

GLM 5 Turbo

GLM 5

GLM 4.7

GLM 4.6

GLM 4.7 Flash

GLM 4.5
