How do I access Qwen3 32B?

It is available via Alibaba's ModelScope platform and associated API endpoints.

What are the pricing details for Qwen3 32B?

Usage-based pricing is published on the official Alibaba Cloud site and varies by volume and deployment type.

Is Qwen3 32B suitable for instruction-following tasks?

Yes, its listed capabilities include strong instruction following for structured outputs and multi-step workflows.

Can Qwen3 32B be used for document summarization?

Its long-context and summarization strengths make it appropriate for condensing large documents while preserving key details.

Qwen3 32B

Verified

Open-weight LLM built for long-context text understanding and generation.

Alibaba Qwen · Language Models · Open

Updated 2026-06-15

About Qwen3 32B

Qwen3 32B is an open-weight large language model from Alibaba's Qwen team. It operates exclusively in the text modality and supports sequences up to 131072 tokens long. The design emphasizes accessibility for local fine-tuning and inference.

Its open-weight release enables broad experimentation and adaptation across different hardware setups. The extended context window helps maintain coherence over lengthy inputs such as documents or multi-turn dialogues. This combination supports tasks that require sustained attention to detail.

Users commonly deploy the model for content generation, summarization, and analytical workflows. Researchers integrate it into pipelines where transparency and customization are priorities. The weights can be run on-premises or through compatible inference frameworks.

Capabilities

Long-context reasoning

Code generation and debugging

Multilingual text understanding

Mathematical problem solving

Instruction following

Document summarization

How Qwen3 32B compares

Qwen3 32B vs other language models on intelligence, speed and price.

Price

USD per 1M output tokens · Lower is better · Qwen3 32B ranks #19 of 98

Qwen3 30B A3B Instruct 2507 · $0.19
Nemotron 3 Nano 30B A3B · $0.20
Reka Flash 3 · $0.20
Hy3 preview · $0.21
Qwen3 14B · $0.24
Qwen3 Coder 30B A3B Instruct · $0.27
Qwen3 32B · $0.28
MiMo-V2-Flash · $0.30
Step 3.5 Flash · $0.30
gpt-oss-safeguard-20b · $0.30
DeepSeek V3.2 · $0.34
Phi 4 Mini Instruct · $0.35
GLM 4.7 Flash · $0.40

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Long-Context Document Analysis

Processes and summarizes documents spanning up to 131072 tokens, enabling thorough review of lengthy reports, legal texts, or research collections in a single pass.

Code Generation and Debugging

Generates, refines, and debugs code across languages while following detailed instructions, supporting full software development workflows from prototype to troubleshooting.

Multilingual Technical Problem Solving

Solves mathematical problems and handles technical queries in multiple languages, making it effective for cross-border academic work or engineering documentation.
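For the long-context use case above, a quick pre-flight check can tell you whether a document is likely to fit in one pass. The sketch below is illustrative only: the 131072-token window comes from this page, but the 4-characters-per-token ratio is a rough assumption rather than the Qwen tokenizer, the 2048-token output reserve is an arbitrary default, and `estimateTokens`, `fitsInContext`, and `splitToFit` are hypothetical helper names. It also assumes the prompt and the generated output share the same window.

```javascript
// Context window stated on this page.
const CONTEXT_WINDOW_TOKENS = 131072;

// ASSUMPTION: ~4 characters per token is a rough heuristic for English text,
// not the Qwen tokenizer. Use the real tokenizer when exact counts matter.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Rough token estimate based only on character count.
function estimateTokens(text) {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// Reports whether the text plus a reserved output budget fits in one pass.
function fitsInContext(text, reservedOutputTokens = 2048) {
  const promptTokens = estimateTokens(text);
  return {
    promptTokens,
    fits: promptTokens + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS,
  };
}

// Splits an oversized text into character chunks that each pass fitsInContext.
function splitToFit(text, reservedOutputTokens = 2048) {
  const maxChars =
    (CONTEXT_WINDOW_TOKENS - reservedOutputTokens) * ASSUMED_CHARS_PER_TOKEN;
  const chunks = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}
```

Splitting on raw character offsets can cut sentences in half, so a real pipeline would more likely split on paragraph or section boundaries.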

Strengths & limitations

Strengths

+ Strong reasoning for model size
+ Effective long-context handling
+ Solid coding and math performance
+ Good multilingual coverage

Limitations

– Text-only modality
– May hallucinate on niche topics
– Requires significant compute for inference

Cost calculator

Estimate what Qwen3 32B would cost for your usage.

Inputs: input tokens per request · output tokens per request · requests per month

Example result: $0.00022 per request · $2.20 estimated per month

Based on Qwen3 32B's $0.08/1M input · $0.28/1M output. Estimate only — actual cost varies by provider and caching.
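The arithmetic behind an estimate like this can be sketched in a few lines. The two prices come from the line above. The function name `estimateCost` and the sample inputs (1,000 input tokens, 500 output tokens, 10,000 requests per month) are assumptions chosen for illustration; they happen to give roughly $0.00022 per request and $2.20 per month, but the page does not state which inputs the calculator used.

```javascript
// Prices quoted on this page, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.08;
const OUTPUT_USD_PER_MILLION = 0.28;

// Hypothetical helper: estimates per-request and monthly cost in USD.
function estimateCost({ inputTokens, outputTokens, requestsPerMonth }) {
  const perRequest =
    (inputTokens * INPUT_USD_PER_MILLION +
      outputTokens * OUTPUT_USD_PER_MILLION) /
    1e6;
  return { perRequest, perMonth: perRequest * requestsPerMonth };
}

// ASSUMED sample inputs; not taken from the page.
const estimate = estimateCost({
  inputTokens: 1000,
  outputTokens: 500,
  requestsPerMonth: 10000,
});

console.log(estimate.perRequest.toFixed(5), estimate.perMonth.toFixed(2));
```

As the page notes, this is an estimate only: provider pricing and prompt caching can move the real figure in either direction.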

Quick start

OpenRouter's API is OpenAI-compatible, so most SDKs work once you swap the base URL. Between models, only the model slug changes.

JavaScript · openai

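import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter instead of the default OpenAI endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Top-level await requires an ES module (.mjs or "type": "module").
const completion = await client.chat.completions.create({
  model: "qwen/qwen3-32b",
  messages: [{ role: "user", content: "Hello!" }],
});

// Print the text of the first reply.
console.log(completion.choices[0].message.content);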

Model slug: qwen/qwen3-32b
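If you would rather not add an SDK dependency, the same request can be sketched with plain `fetch`. This assumes the standard OpenAI-style `/chat/completions` path under the base URL shown above and a runtime with a global `fetch` (for example Node 18 or later). `buildChatRequest` and `askModel` are hypothetical helper names, not part of any library.

```javascript
const OPENROUTER_CHAT_URL = "https://openrouter.ai/api/v1/chat/completions";

// Hypothetical helper: builds the fetch arguments for a single-turn chat request.
function buildChatRequest(modelSlug, userMessage, apiKey) {
  return {
    url: OPENROUTER_CHAT_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: modelSlug,
        messages: [{ role: "user", content: userMessage }],
      }),
    },
  };
}

// Hypothetical helper: sends the request and returns the first reply text.
async function askModel(modelSlug, userMessage, apiKey) {
  const { url, options } = buildChatRequest(modelSlug, userMessage, apiKey);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

A call would look like `askModel("qwen/qwen3-32b", "Hello!", process.env.OPENROUTER_API_KEY)`. Keeping the slug as a parameter reflects the point above that only the slug changes between models.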

Editor's verdict

Our take on Qwen3 32B

Qwen3 32B is Alibaba Qwen's open-weight language model with a 131K-token context window.

At $0.28 per 1M output tokens, it ranks #19 of 98 language models on output price.

Because it is an open-weight model, you can self-host it or call it through a hosted API.

It is best suited to workloads that need strong reasoning for the model size and effective long-context handling.

Frequently asked questions

What is the context window of Qwen3 32B?

The model provides a context window of 131072 tokens for handling extended inputs.

Other Qwen models

Sibling versions in the Qwen family from Alibaba Qwen.

Qwen3.7 Max

Alibaba Qwen · Language Models

Verified

Qwen3.7 Max processes up to one million tokens in a single pass.

Open · II 56.6 · 1000K ctx · $3.75/1M out

Qwen3.7 Plus

Alibaba Qwen · Multimodal

Verified

Open-weight multimodal model for million-token text and image tasks.

Open · II 53.3 · 1000K ctx · $1.28/1M out

Qwen3.6 Max Preview

Alibaba Qwen · Language Models

Verified

Open-weight LLM optimized for long-context text reasoning and analysis.

Open · II 51.8 · 262K ctx · $6.24/1M out

Qwen3.6 27B

Alibaba Qwen · Multimodal

Verified

Multimodal model for long-context text, image, and video processing.

Open · II 45.8 · 262K ctx · $2.00/1M out

Qwen3.6 35B A3B

Alibaba Qwen · Multimodal

Verified

Multimodal model for long-context text, image, and video analysis.

Open · II 43.5 · 262K ctx · $0.97/1M out

Qwen3.5-Flash

Alibaba Qwen · Multimodal

Verified

Fast open-weight multimodal model for million-token text, image, and video tasks.

Open · 1000K ctx · $0.26/1M out
