GLM 5.2 processes million-token contexts for demanding text tasks.
GLM 5.2 is a closed-source LLM from Z.AI. It offers a context window of 1,048,576 tokens and is limited to text input and output. The parameter count is not disclosed.
The model is suited for workloads that involve lengthy inputs such as full books, extended codebases, or multi-turn dialogues. Because it is not open-weight, access is through hosted APIs (Z.AI's own API or aggregators such as OpenRouter) rather than local deployment.
Typical usage includes summarization of large documents, retrieval-augmented generation over long corpora, and complex reasoning chains that benefit from broad context retention.
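Before sending a very large document, it can help to check roughly whether it fits in the window. The sketch below is illustrative only: the 4-characters-per-token ratio is an assumed rule of thumb for English text, not GLM 5.2's actual tokenizer, and `estimateTokens` and `fitsInContext` are hypothetical helper names.

```typescript
// Context window stated for GLM 5.2 on this page.
const CONTEXT_WINDOW_TOKENS = 1048576;

// ASSUMPTION: about 4 characters per token. This is a rough heuristic,
// not the model's real tokenizer, so treat the result as an estimate.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Rough token estimate for a piece of text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// True if the estimated prompt plus the tokens reserved for the reply
// stays within the context window.
function fitsInContext(text: string, reservedOutputTokens: number): boolean {
  return estimateTokens(text) + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS;
}
```

For a real decision, count tokens with the provider's own tooling where available, and leave headroom for the system prompt and the response.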
Independent evaluation scores and measured speed, comparing GLM 5.2 with other language models on intelligence, speed and price:
Artificial Analysis Intelligence Index · Higher is better · GLM 5.2 ranks #1 of 69
Output tokens per second · Higher is better · GLM 5.2 ranks #21 of 46
USD per 1M output tokens · Lower is better · GLM 5.2 ranks #125 of 147
Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).
Processes and reasons over full-length books, research papers, or extensive datasets within its 1M token context for accurate insights.
Generates, debugs, and refactors complex codebases while following detailed multi-step instructions across programming languages.
Performs multilingual text processing, summarization, and analysis for translating and adapting materials across languages with logical consistency.
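Live per-provider pricing and uptime, routed via OpenRouter. Prices are USD per 1M tokens. A tag in parentheses, such as fp4 or fp8, is the quantization precision listed for that provider.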
| Provider | Input /1M | Output /1M | Context | Uptime |
|---|---|---|---|---|
| Wafer(fp4) | $1.20 | $3.20 | 203K | 78.9% |
| DeepInfra(fp4) | $1.20 | $4.20 | 1049K | 75.4% |
| Phala | $1.40 | $4.40 | 524K | 91.8% |
| Cloudflare | $1.40 | $4.40 | 262K | 99.6% |
| Fireworks | $1.40 | $4.40 | 1049K | 99.3% |
| Z.AI(fp8) | $1.40 | $4.40 | 1049K | 99.7% |
| Friendli | $1.40 | $4.40 | 1049K | 99.8% |
| Parasail(fp8) | $1.40 | $4.40 | 1049K | 88.9% |
| Novita(fp8) | $1.40 | $4.40 | 1049K | 99.8% |
| AtlasCloud(fp8) | $1.40 | $4.40 | 203K | 96.6% |
| StreamLake | $1.40 | $4.40 | 1024K | 96.9% |
| Io Net(fp8) | $1.68 | $5.28 | 262K | 90.1% |
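Choosing a provider from the table usually means filtering on context size and uptime, then comparing price. The following is a minimal sketch of that selection using the table's values as a static snapshot; the `Provider` shape and `pickProviders` are hypothetical names for illustration, and live figures will differ.

```typescript
interface Provider {
  name: string;
  inputUsdPerM: number;   // USD per 1M input tokens
  outputUsdPerM: number;  // USD per 1M output tokens
  contextK: number;       // context window in thousands of tokens
  uptimePct: number;      // uptime percentage
}

// Snapshot of the table above; not live data.
const providers: Provider[] = [
  { name: "Wafer", inputUsdPerM: 1.2, outputUsdPerM: 3.2, contextK: 203, uptimePct: 78.9 },
  { name: "DeepInfra", inputUsdPerM: 1.2, outputUsdPerM: 4.2, contextK: 1049, uptimePct: 75.4 },
  { name: "Phala", inputUsdPerM: 1.4, outputUsdPerM: 4.4, contextK: 524, uptimePct: 91.8 },
  { name: "Cloudflare", inputUsdPerM: 1.4, outputUsdPerM: 4.4, contextK: 262, uptimePct: 99.6 },
  { name: "Fireworks", inputUsdPerM: 1.4, outputUsdPerM: 4.4, contextK: 1049, uptimePct: 99.3 },
  { name: "Z.AI", inputUsdPerM: 1.4, outputUsdPerM: 4.4, contextK: 1049, uptimePct: 99.7 },
  { name: "Friendli", inputUsdPerM: 1.4, outputUsdPerM: 4.4, contextK: 1049, uptimePct: 99.8 },
  { name: "Parasail", inputUsdPerM: 1.4, outputUsdPerM: 4.4, contextK: 1049, uptimePct: 88.9 },
  { name: "Novita", inputUsdPerM: 1.4, outputUsdPerM: 4.4, contextK: 1049, uptimePct: 99.8 },
  { name: "AtlasCloud", inputUsdPerM: 1.4, outputUsdPerM: 4.4, contextK: 203, uptimePct: 96.6 },
  { name: "StreamLake", inputUsdPerM: 1.4, outputUsdPerM: 4.4, contextK: 1024, uptimePct: 96.9 },
  { name: "Io Net", inputUsdPerM: 1.68, outputUsdPerM: 5.28, contextK: 262, uptimePct: 90.1 },
];

// Keep providers that meet both minimums, cheapest output price first,
// breaking price ties by higher uptime.
function pickProviders(minContextK: number, minUptimePct: number): Provider[] {
  return providers
    .filter((p) => p.contextK >= minContextK && p.uptimePct >= minUptimePct)
    .sort((a, b) => a.outputUsdPerM - b.outputUsdPerM || b.uptimePct - a.uptimePct);
}
```

With this snapshot, asking for the full 1049K context and at least 99% uptime leaves four providers, all at the same $4.40 output price, so uptime decides the order.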
Estimate what GLM 5.2 would cost for your usage.
Based on GLM 5.2's lowest listed rates, $1.20/1M input and $3.20/1M output (Wafer in the table above). Estimate only; actual cost varies by provider and caching.
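The arithmetic behind such an estimate is simple enough to sketch. The example below assumes flat per-token pricing with no caching discounts; `estimateCostUsd` and the `Pricing` shape are hypothetical names, and the default rates are the $1.20 and $3.20 figures quoted above.

```typescript
interface Pricing {
  inputUsdPerM: number;   // USD per 1M input tokens
  outputUsdPerM: number;  // USD per 1M output tokens
}

// Rates quoted on this page for the estimate; other providers charge more.
const QUOTED_PRICING: Pricing = { inputUsdPerM: 1.2, outputUsdPerM: 3.2 };

// ASSUMPTION: cost is linear in tokens, with no caching or volume discounts.
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  pricing: Pricing = QUOTED_PRICING,
): number {
  const inputCost = (inputTokens / 1_000_000) * pricing.inputUsdPerM;
  const outputCost = (outputTokens / 1_000_000) * pricing.outputUsdPerM;
  return inputCost + outputCost;
}

// Example: a 500,000-token document with a 2,000-token summary
// costs about 0.60 + 0.0064 = 0.6064 USD at the quoted rates.
const exampleCost = estimateCostUsd(500_000, 2_000);
```

Passing a different `Pricing` object, for instance $1.40 and $4.40, gives the estimate for most of the other providers in the table.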
OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "z-ai/glm-5.2",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);
```

Model slug: z-ai/glm-5.2
GLM 5.2 is Z.AI's proprietary language model with a 1049K-token context window.
On independent testing it scores 51.1 on the Artificial Analysis Intelligence Index, running at roughly 101 tokens per second with about 2.19s to first token.
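Those two speed figures combine into a rough response-time estimate: time to first token plus output length divided by throughput. The sketch below assumes a constant decode rate and uses the measured values quoted above; `estimateResponseSeconds` is a hypothetical helper, and real latency varies with provider, load, and prompt length.

```typescript
// Measured figures quoted on this page (approximate).
const TIME_TO_FIRST_TOKEN_SECONDS = 2.19;
const OUTPUT_TOKENS_PER_SECOND = 101;

// ASSUMPTION: tokens stream at a constant rate after the first token.
function estimateResponseSeconds(outputTokens: number): number {
  return TIME_TO_FIRST_TOKEN_SECONDS + outputTokens / OUTPUT_TOKENS_PER_SECOND;
}

// Example: a 1,010-token answer takes about 2.19 + 10 = 12.19 seconds.
const exampleSeconds = estimateResponseSeconds(1010);
```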
With output pricing starting at $3.20 per 1M tokens, it is mid-priced for its class and is served by 12 providers.
It is available through Z.AI's API and aggregators like OpenRouter.
It is best suited to extended document tasks that need a very large context window.
GLM 5.2 provides a context window of 1,048,576 tokens.