Qwen Plus 0728 (thinking)
Handles complex reasoning across one million tokens of context.
About Qwen Plus 0728 (thinking)
Built on the Qwen architecture, this release provides publicly available weights for local or cloud deployment. Its one-million-token context window allows an entire book, a large codebase, or a lengthy document to be processed in a single pass. The 'thinking' designation indicates optimization for structured, multi-step reasoning chains.
Strengths include retention of information over very long sequences and coherent generation across extended outputs. Because the weights are open, developers can fine-tune the model for domain-specific tasks without vendor lock-in. Text-only modality keeps inference efficient compared with multimodal variants.
Typical usage covers research synthesis, legal document review, software engineering with large repositories, and any workflow that benefits from maintaining context across tens or hundreds of thousands of tokens.
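Before sending a very large document, it can help to check roughly whether it fits in the window. The sketch below is only a heuristic: the figure of 4 characters per token is an assumption that holds loosely for English text and not for Chinese or code, and `estimateTokens` and `fitsInContext` are hypothetical helper names. An exact count requires the model's own tokenizer.

```javascript
// Context window size taken from the listing (1,000,000 tokens).
const CONTEXT_WINDOW_TOKENS = 1_000_000;

// ASSUMPTION: about 4 characters per token. This is a rough heuristic
// for English prose, not a property of the Qwen tokenizer.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Rough token estimate for a string.
function estimateTokens(text) {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// True if the text plus a reserved output budget fits in the window.
// The default reserve of 8,000 tokens is an arbitrary illustrative value.
function fitsInContext(text, reservedForOutput = 8_000) {
  return estimateTokens(text) + reservedForOutput <= CONTEXT_WINDOW_TOKENS;
}

const sample = "word ".repeat(100_000); // 500,000 characters
console.log(estimateTokens(sample), fitsInContext(sample));
```

If the check fails, the input has to be split or summarized before it is sent.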
How Qwen Plus 0728 (thinking) compares
Chart: Qwen Plus 0728 (thinking) compared with other language models on intelligence, speed and price.
Price (USD per 1M output tokens, lower is better): Qwen Plus 0728 (thinking) ranks #29 of 63.
Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).
Best for
Long-document analysis
The 1M-token context window enables processing and reasoning over entire books, code repositories, or lengthy reports in a single pass.
Complex multi-step reasoning
The 'thinking' variant supports extended chain-of-thought processes for tasks like advanced math, logic puzzles, or strategic planning.
Multilingual enterprise workflows
Qwen models excel at Chinese-English bilingual tasks such as translating technical documentation or handling cross-language customer support at scale.
Strengths & limitations
Strengths
- Strong Chinese-English bilingual performance
- Effective handling of very long inputs
- Solid technical and coding assistance
- Clear step-by-step reasoning style
Limitations
- Text-only modality
- May still hallucinate on niche facts
- Performance varies across domains
Cost calculator
Estimate what Qwen Plus 0728 (thinking) would cost for your usage.
Based on Qwen Plus 0728 (thinking)'s $0.26/1M input · $0.78/1M output. Estimate only — actual cost varies by provider and caching.
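The arithmetic behind such an estimate is simple enough to sketch. The example below uses the two listed prices; the function name `estimateCostUsd` and the example token counts are illustrative assumptions, and the result ignores caching and provider-specific pricing.

```javascript
// Prices from the listing above, in USD per 1M tokens.
const INPUT_PRICE_PER_MILLION = 0.26;
const OUTPUT_PRICE_PER_MILLION = 0.78;

// Estimated cost in USD for a given number of input and output tokens.
function estimateCostUsd(inputTokens, outputTokens) {
  const inputCost = (inputTokens / 1_000_000) * INPUT_PRICE_PER_MILLION;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_PRICE_PER_MILLION;
  return inputCost + outputCost;
}

// Illustrative request: a 200,000-token document and a 4,000-token answer.
// 0.2 * 0.26 + 0.004 * 0.78 = 0.052 + 0.00312 = 0.05512
console.log(estimateCostUsd(200_000, 4_000).toFixed(4)); // prints "0.0551"
```

At these prices, input volume dominates the bill for long-document work, since a single large prompt can outweigh thousands of output tokens.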
Quick start
OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.
import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Top-level await requires an ES module (for example, a .mjs file).
const completion = await client.chat.completions.create({
  model: "qwen/qwen-plus-2025-07-28:thinking",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);

Model slug: qwen/qwen-plus-2025-07-28:thinking
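For environments without the SDK, the same request can be sketched as a plain HTTP call. This assumes the standard OpenAI-style `/chat/completions` path under the base URL shown above and bearer-token authentication. `buildChatRequest` is a hypothetical helper, and the API key in the example is a placeholder.

```javascript
// ASSUMPTION: OpenAI-style chat completions path under the listed base URL.
const OPENROUTER_CHAT_URL = "https://openrouter.ai/api/v1/chat/completions";

// Hypothetical helper: builds the arguments for fetch() without sending anything.
function buildChatRequest(modelSlug, userMessage, apiKey) {
  return {
    url: OPENROUTER_CHAT_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: modelSlug, // only this value changes between models
        messages: [{ role: "user", content: userMessage }],
      }),
    },
  };
}

const request = buildChatRequest(
  "qwen/qwen-plus-2025-07-28:thinking",
  "Hello!",
  "sk-placeholder" // placeholder, not a real key
);
console.log(request.url);
console.log(request.options.body);
```

To send it, pass the pieces to `fetch(request.url, request.options)` and read the JSON response. Switching to another model means changing only the first argument.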
Editor's verdict
Qwen Plus 0728 (thinking) is an open-weight language model from Alibaba Qwen with a 1M-token context window.
At $0.78 per 1M output tokens, it ranks #29 of 63 models on output price.
As an open-weight model, it can be self-hosted or called through a hosted API.
It is best suited to Chinese-English bilingual work and to tasks with very long inputs.
Frequently asked questions
Qwen Plus 0728 (thinking) supports up to 1,000,000 tokens of context.