Is Qwen3.5 397B A17B multimodal?

Yes, it is a multimodal model capable of video and image analysis plus visual question answering.

Who developed the Qwen3.5 397B A17B model?

It was developed by Alibaba Qwen.

What are the primary capabilities of this model?

Key capabilities include long-context reasoning, code generation, multilingual processing, and multimodal understanding.

How can users access Qwen3.5 397B A17B?

Because the model is open-weight, you can self-host it or call it through a hosted API such as OpenRouter, where its model slug is qwen/qwen3.5-397b-a17b.

Qwen3.5 397B A17B

Verified

Open multimodal model for long-context text, image, and video tasks.

Alibaba Qwen · Multimodal · Open

Vision

Updated 2026-06-14

About Qwen3.5 397B A17B

The architecture integrates multiple modalities into a single framework for unified understanding. It handles extended sequences across text, images, and video streams without requiring separate pipelines. This unified approach simplifies workflows for complex media analysis.

Open-weight availability enables broad experimentation and fine-tuning by developers worldwide. The large context capacity supports detailed examination of lengthy content such as full videos or multi-page documents. Multimodal design improves coherence when reasoning across different data types.

Typical uses include video summarization, image-based question answering, and long-form document processing. Researchers apply it to content analysis, accessibility tools, and educational platforms. Developers integrate it into production systems where open customization is required.

Capabilities

Long-context reasoning

Multimodal understanding

Code generation

Multilingual processing

Video and image analysis

Visual question answering

How Qwen3.5 397B A17B compares

Qwen3.5 397B A17B (striped bar) vs. other multimodal models on intelligence, speed, and price.

Price

USD per 1M output tokens · Lower is better · Qwen3.5 397B A17B ranks #58 of 122

Grok Build 0.1 · $2.0
Seed 1.6 · $2.0
Seed-2.0-Lite · $2.0
Mistral Medium 3.1 · $2.0
Kimi K2.5 · $2.0
Qwen3.5-122B-A10B · $2.1
Qwen3.5 397B A17B (this model) · $2.3
Grok 4.20 · $2.5
Grok 4.3 · $2.5
Nova 2 Lite · $2.5
Qwen3 VL 235B A22B Thinking · $2.6
Gemini 3 Flash Preview · $3.0
Qwen3.6 27B · $3.2

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Large-scale code repository refactoring

The model processes entire codebases within its 262144-token context to identify issues and suggest improvements across multiple files.
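Before sending a whole repository, it helps to check whether it is likely to fit. The sketch below is a rough pre-flight check, not the model's real tokenizer: the 4-characters-per-token ratio, the `fitsInContext` helper, and the 8,192 tokens reserved for the reply are all assumptions made for illustration. Only the 262144-token limit comes from this page.

```javascript
const CONTEXT_WINDOW_TOKENS = 262144; // context length stated on this page
const CHARS_PER_TOKEN = 4;            // assumption: rough heuristic, not the real tokenizer

// Hypothetical helper: approximate the token count of a string.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// files: array of { path, content }.
// reservedForOutput: tokens kept free for the model's reply (assumed default).
function fitsInContext(files, reservedForOutput = 8192) {
  const promptTokens = files.reduce((sum, f) => sum + estimateTokens(f.content), 0);
  const budget = CONTEXT_WINDOW_TOKENS - reservedForOutput;
  return { promptTokens, budget, fits: promptTokens <= budget };
}

// Example with two synthetic 400,000-character files.
const report = fitsInContext([
  { path: "src/a.js", content: "x".repeat(400000) },
  { path: "src/b.js", content: "y".repeat(400000) },
]);
console.log(report.promptTokens, report.budget, report.fits);
```

If `fits` is false, the repository needs to be split or filtered before it is sent. For an exact count, use the tokenizer shipped with the model instead of the character heuristic.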

Video and image analysis pipelines

It performs visual question answering and multimodal understanding on video streams or image collections for content summarization and insight extraction.
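A visual question can be expressed as a chat message that mixes a text part and an image part. The sketch below builds such a request body in the OpenAI-style chat completions format. The `buildImageQuestion` helper and the example URL are hypothetical, and whether a given provider accepts `image_url` parts for this model slug is an assumption to verify against that provider's documentation.

```javascript
// Hypothetical helper: builds a chat request body with one text part and one image part.
function buildImageQuestion(question, imageUrl) {
  return {
    model: "qwen/qwen3.5-397b-a17b",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: question },
          // OpenAI-style image part; provider support for this model is assumed.
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}

// Placeholder URL for illustration only.
const body = buildImageQuestion(
  "What is shown in this image?",
  "https://example.com/photo.jpg"
);
console.log(JSON.stringify(body, null, 2));
```

The resulting object can be passed to an OpenAI-compatible client's `chat.completions.create` call in place of a text-only request.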

Multilingual long-document reasoning

The model handles extended texts in various languages, enabling cross-lingual analysis and complex reasoning tasks without truncation.

Strengths & limitations

Strengths

+ Mixture-of-experts efficiency with large total parameters
+ Native support for text, image, and video inputs
+ Extended context window handling
+ Strong multilingual capabilities

Limitations

– High inference cost for full model activation
– Video context limited by practical token usage
– Potential modality-specific performance variation

Cost calculator

Estimate what Qwen3.5 397B A17B would cost for your usage.

Input tokens / request · Output tokens / request · Requests / month

$0.00156 per request

$15.6 estimated / month

Based on Qwen3.5 397B A17B's $0.39/1M input · $2.34/1M output. This is an estimate only; actual cost varies by provider and caching.
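The arithmetic behind the estimate can be reproduced in a few lines. The sketch below uses the two prices quoted above. The `estimateCost` helper is hypothetical, and the workload of 1,000 input tokens, 500 output tokens, and 10,000 requests per month is an assumption chosen because it reproduces the figures shown by the calculator.

```javascript
// Prices quoted on this page (USD per 1M tokens); providers may differ.
const INPUT_USD_PER_MILLION = 0.39;
const OUTPUT_USD_PER_MILLION = 2.34;

// Hypothetical helper: estimated cost of one request and of a month of requests.
function estimateCost(inputTokens, outputTokens, requestsPerMonth) {
  const perRequest =
    (inputTokens * INPUT_USD_PER_MILLION + outputTokens * OUTPUT_USD_PER_MILLION) / 1e6;
  return { perRequest, perMonth: perRequest * requestsPerMonth };
}

// Assumed workload: 1,000 input tokens, 500 output tokens, 10,000 requests per month.
const estimate = estimateCost(1000, 500, 10000);
console.log(estimate.perRequest.toFixed(5)); // 0.00156
console.log(estimate.perMonth.toFixed(2));   // 15.60
```

Output tokens cost six times as much as input tokens at these prices, so long generated responses dominate the bill more quickly than long prompts do.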

Quick start

OpenRouter's API is OpenAI-compatible, so most SDKs work once you swap the base URL. Only the model slug changes between models.

JavaScript · openai

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "qwen/qwen3.5-397b-a17b",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);

Model slug: qwen/qwen3.5-397b-a17b
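The same call can be made without an SDK. The sketch below assembles a plain HTTP request for the chat completions endpoint under the base URL used above. The `buildChatRequest` helper and the `YOUR_API_KEY` placeholder are hypothetical; the request is only constructed here and never sent, so the sketch runs without network access.

```javascript
// Hypothetical helper: builds the URL and fetch options for one chat completion.
function buildChatRequest(modelSlug, prompt, apiKey) {
  return {
    url: "https://openrouter.ai/api/v1/chat/completions",
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: modelSlug, // only this value changes between models
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

const { url, options } = buildChatRequest("qwen/qwen3.5-397b-a17b", "Hello!", "YOUR_API_KEY");
console.log(url);
console.log(options.body);
// To send it: const res = await fetch(url, options); const data = await res.json();
```

Keeping the slug as a parameter makes it easy to compare this model against another one by changing a single argument.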

Editor's verdict

Our take on Qwen3.5 397B A17B

Qwen3.5 397B A17B is Alibaba Qwen's open-weight multimodal model with a 262K-token context window.

At $2.34 per 1M output tokens, it is mid-priced for its class.

Because it is open-weight, you can self-host it or call it through a hosted API.

Its main strengths are mixture-of-experts efficiency with a large total parameter count and native support for text, image, and video inputs.

Frequently asked questions

What is the context length of Qwen3.5 397B A17B?

The model supports a context length of 262144 tokens.

Other Qwen models

Sibling versions in the Qwen family from Alibaba Qwen.

Qwen3.7 Max

Alibaba Qwen · Language Models

Verified

Qwen3.7 Max processes up to one million tokens in a single pass.

Open · II 56.6 · 1000K ctx · $3.75/1M out

Qwen3.7 Plus

Alibaba Qwen · Multimodal

Verified

Open-weight multimodal model for million-token text and image tasks.

Open · II 53.3 · 1000K ctx · $1.28/1M out

Qwen3.6 Max Preview

Alibaba Qwen · Language Models

Verified

Open-weight LLM optimized for long-context text reasoning and analysis.

Open · II 51.8 · 262K ctx · $6.24/1M out

Qwen3.6 27B

Alibaba Qwen · Multimodal

Verified

Multimodal model for long-context text, image, and video processing.

Open · II 45.8 · 262K ctx · $3.17/1M out

Qwen3.6 35B A3B

Alibaba Qwen · Multimodal

Verified

Multimodal model for long-context text, image, and video analysis.

Open · II 43.5 · 262K ctx · $1.00/1M out

Qwen3.5 Plus 2026-04-20

Alibaba Qwen · Multimodal

Verified

Open-weight multimodal model for long-context text, image, and video tasks.

Open · 1000K ctx · $1.80/1M out
