Qwen3.5 397B A17B
Open multimodal model for long-context text, image, and video tasks.
About Qwen3.5 397B A17B
The architecture integrates multiple modalities into a single framework for unified understanding. It handles extended sequences across text, images, and video streams without requiring separate pipelines. This unified approach simplifies workflows for complex media analysis.
Open-weight availability lets developers experiment with and fine-tune the model themselves. The 262144-token context window supports detailed examination of lengthy content such as long videos or multi-page documents. The multimodal design is intended to keep reasoning coherent across different data types.
Typical uses include video summarization, image-based question answering, and long-form document processing. Researchers apply it to content analysis, accessibility tools, and educational platforms. Developers integrate it into production systems where open customization is required.
How Qwen3.5 397B A17B compares
[Chart: Qwen3.5 397B A17B (striped bar) vs. other multimodal models on intelligence, speed, and price.]
Price: USD per 1M output tokens (lower is better). Qwen3.5 397B A17B ranks #58 of 122.
Sources: Artificial Analysis (intelligence, speed); OpenRouter (price).
Best for
Large-scale code repository refactoring
The model can take in a codebase that fits within its 262144-token context window, which lets it identify issues and suggest improvements that span multiple files.
Video and image analysis pipelines
It performs visual question answering and multimodal understanding on video streams or image collections for content summarization and insight extraction.
Multilingual long-document reasoning
The model handles extended texts in various languages, enabling cross-lingual analysis and complex reasoning tasks without truncation.
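Whether a repository or document set really fits "without truncation" depends on its token count. The sketch below shows one way to check this before sending a request. It is a rough illustration, not the model's tokenizer: the 4-characters-per-token ratio, the `SourceFile` shape, and the `selectFilesWithinBudget` helper are all assumptions introduced here, and it assumes that output tokens share the same 262144-token window.

```typescript
// Context length stated on this page.
const CONTEXT_WINDOW_TOKENS = 262_144;
// Rough assumption for illustration only; real tokenizers vary by language and content.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Hypothetical shape for a file loaded from disk.
interface SourceFile {
  path: string;
  content: string;
}

interface PackResult {
  included: SourceFile[];
  skipped: SourceFile[];
  usedTokens: number;
}

// Estimates a token count from character length (heuristic, not exact).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// Walks the files in order and keeps each one that still fits in the
// remaining budget; files that do not fit are reported as skipped.
function selectFilesWithinBudget(
  files: SourceFile[],
  reservedOutputTokens: number,
): PackResult {
  const budget = CONTEXT_WINDOW_TOKENS - reservedOutputTokens;
  const result: PackResult = { included: [], skipped: [], usedTokens: 0 };
  for (const file of files) {
    const cost = estimateTokens(file.content);
    if (result.usedTokens + cost <= budget) {
      result.included.push(file);
      result.usedTokens += cost;
    } else {
      result.skipped.push(file);
    }
  }
  return result;
}

// Three synthetic files of roughly 100,000 estimated tokens each.
const sampleFiles: SourceFile[] = [
  { path: "src/a.ts", content: "a".repeat(400_000) },
  { path: "src/b.ts", content: "b".repeat(400_000) },
  { path: "src/c.ts", content: "c".repeat(400_000) },
];
const packed = selectFilesWithinBudget(sampleFiles, 8_192);
console.log(packed.included.length, packed.skipped.length, packed.usedTokens);
// prints: 2 1 200000
```

For real workloads, a count from the provider's own tokenizer or usage report would be more reliable than the character heuristic used here.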
Strengths & limitations
Strengths
- Mixture-of-experts efficiency with large total parameters
- Native support for text, image, and video inputs
- Extended context window handling
- Strong multilingual capabilities
Limitations
- High inference cost for full model activation
- Video context limited by practical token usage
- Potential modality-specific performance variation
Cost calculator
Estimate what Qwen3.5 397B A17B would cost for your usage.
Based on Qwen3.5 397B A17B's $0.39/1M input · $2.34/1M output. Estimate only — actual cost varies by provider and caching.
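The arithmetic behind such an estimate can be sketched in a few lines. This is a minimal illustration using the two prices quoted above; the `estimateCostUsd` helper is hypothetical, and it ignores caching discounts and provider-specific pricing.

```typescript
// Prices quoted on this page, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.39;
const OUTPUT_USD_PER_MILLION = 2.34;

// Hypothetical helper: estimates the cost of a workload from its token counts.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Example: 2M input tokens and 500K output tokens.
// 2 * 0.39 + 0.5 * 2.34 = 0.78 + 1.17 = 1.95
const estimate = estimateCostUsd(2_000_000, 500_000);
console.log(estimate.toFixed(2)); // prints "1.95"
```

Because output tokens cost six times as much as input tokens at these rates, workloads that generate long responses are dominated by the output term.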
Quick start
OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.OPENROUTER_API_KEY,
});
const completion = await client.chat.completions.create({
model: "qwen/qwen3.5-397b-a17b",
messages: [{ role: "user", content: "Hello!" }],
});
console.log(completion.choices[0].message.content);
Model slug: qwen/qwen3.5-397b-a17b
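For image inputs, OpenAI-compatible chat APIs generally accept message content as a list of parts rather than a single string. The sketch below builds such a request body as plain data so it can be inspected before sending. The `buildImageQuestion` helper, the local types, and the example image URL are illustrative assumptions, and whether a given provider accepts image input for this model should be confirmed in its documentation.

```typescript
// Local types that mirror the OpenAI-style content-part format (simplified).
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string | ContentPart[];
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
}

// Hypothetical helper: pairs a question with one image in a single user turn.
function buildImageQuestion(imageUrl: string, question: string): ChatRequest {
  return {
    model: "qwen/qwen3.5-397b-a17b",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: question },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}

// Placeholder URL; replace with a publicly reachable image.
const imageRequest = buildImageQuestion(
  "https://example.com/chart.png",
  "What does this chart show?",
);
console.log(JSON.stringify(imageRequest, null, 2));
```

The resulting object has the same `model` and `messages` fields as the text-only call above, so it could be passed to `client.chat.completions.create` in its place.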
Editor's verdict
Qwen3.5 397B A17B is Alibaba Qwen's open-weight multimodal model with a 262K-token context window.
At $2.34 per 1M output tokens, it is mid-priced for its class.
Because it is open-weight, you can self-host it or call it through a hosted API.
Its main strengths are mixture-of-experts efficiency with a large total parameter count and native support for text, image, and video inputs.
Frequently asked questions
What is the context length of Qwen3.5 397B A17B?
The model supports a context length of 262144 tokens.