Qwen Plus 0728 (thinking) vs DeepSeek V4 Flash
A side-by-side comparison of two large language models: real specs, pricing, strengths and weaknesses, and a clear verdict on which to choose. Kept current by our agents.
Quick verdict: which should you choose?
Choose Qwen Plus 0728 (thinking) if you need
- ✓clear step-by-step reasoning style for complex tasks
- ✓strong Chinese-English bilingual performance
- ✓effective handling of very long inputs up to 1,000,000 tokens
- ✓solid technical and coding assistance with reasoning focus
Choose DeepSeek V4 Flash if you need
- ✓cost-efficient high-volume inference at $0.18 per million tokens
- ✓fast output at 103.73 tokens per second with Flash optimization
- ✓maximum context length of 1,048,576 tokens for large inputs
- ✓strong coding and STEM performance in documented use cases
Verdict
DeepSeek V4 Flash leads on every metric with reported data: an Intelligence Index of 46.5, an output speed of 103.73 t/s, a price of $0.18/1M tokens versus $0.78/1M for Qwen Plus 0728 (thinking), and a marginally larger 1,048,576-token context (versus 1,000,000). Qwen Plus 0728 (thinking) differentiates itself through its documented step-by-step reasoning style and strong Chinese-English bilingual performance, areas where no data is provided for DeepSeek V4 Flash. Both are open-weight, text-only models with similar context sizes, but no Intelligence Index or output speed is reported for Qwen Plus 0728 (thinking), so those two metrics cannot be compared directly.
Qwen Plus 0728 (thinking) vs DeepSeek V4 Flash: side by side
| Spec | Qwen Plus 0728 (thinking) | DeepSeek V4 Flash | Winner |
|---|---|---|---|
| Intelligence | — | 46.5 | DeepSeek V4 Flash (only reported value) |
| Output speed | — | 104 t/s | DeepSeek V4 Flash (only reported value) |
| Output price | $0.78/1M | $0.18/1M | DeepSeek V4 Flash |
| Context | 1000K | 1049K | DeepSeek V4 Flash |
| Params | — | — | Tie |
| Type | Open-weight | Open-weight | Tie |
| Provider | Alibaba Qwen | DeepSeek | Tie |
Detailed analysis
Pricing
Winner: DeepSeek V4 Flash
DeepSeek V4 Flash is listed at $0.18 per million tokens while Qwen Plus 0728 (thinking) costs $0.78 per million tokens. This makes DeepSeek V4 Flash the clear lower-cost option for high-volume usage based on the provided pricing data.
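To see what the price gap means at volume, the quoted rates can be turned into a rough cost estimate. The sketch below is illustrative only: it assumes the listed per-million price applies uniformly to every token counted, and the 500-million-token monthly workload and the function name `estimated_cost_usd` are hypothetical.

```python
# Prices per million tokens, as quoted in this comparison.
PRICE_PER_MILLION_USD = {
    "DeepSeek V4 Flash": 0.18,
    "Qwen Plus 0728 (thinking)": 0.78,
}


def estimated_cost_usd(model: str, tokens: int) -> float:
    """Estimate the cost in USD for `tokens` tokens at the quoted rate."""
    return PRICE_PER_MILLION_USD[model] * tokens / 1_000_000


# Hypothetical workload: 500 million tokens per month.
monthly_tokens = 500_000_000
for model in PRICE_PER_MILLION_USD:
    print(f"{model}: ${estimated_cost_usd(model, monthly_tokens):,.2f}")
```

At these rates the hypothetical workload comes to roughly $90 on DeepSeek V4 Flash and $390 on Qwen Plus 0728 (thinking), about a 4.3x difference. Actual bills depend on provider-specific pricing details that this comparison does not list.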
Speed and Context
Winner: DeepSeek V4 Flash
DeepSeek V4 Flash reports 103.73 tokens per second output speed and 1,048,576-token context. Qwen Plus 0728 (thinking) has no speed data and a 1,000,000-token context, giving DeepSeek the edge on known speed and a slight context advantage.
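As a rough illustration of what these figures imply, the sketch below estimates generation time from the reported output speed and checks whether a prompt fits each context window. It is a simplification under stated assumptions: it ignores time to first token and any room reserved for the response, and the helper names `generation_seconds` and `fits_context` are hypothetical.

```python
# Context windows and output speed, as quoted in this comparison.
CONTEXT_TOKENS = {
    "DeepSeek V4 Flash": 1_048_576,
    "Qwen Plus 0728 (thinking)": 1_000_000,
}
DEEPSEEK_OUTPUT_TPS = 103.73  # no speed figure is reported for Qwen Plus 0728


def generation_seconds(output_tokens: int,
                       tokens_per_second: float = DEEPSEEK_OUTPUT_TPS) -> float:
    """Estimate seconds to stream `output_tokens` at a steady output speed."""
    return output_tokens / tokens_per_second


def fits_context(model: str, prompt_tokens: int) -> bool:
    """Return True if the prompt alone fits in the model's context window."""
    return prompt_tokens <= CONTEXT_TOKENS[model]


# Hypothetical prompt that lands between the two context limits.
prompt_tokens = 1_020_000
for model in CONTEXT_TOKENS:
    print(f"{model}: fits={fits_context(model, prompt_tokens)}")

print(f"2,000 output tokens: ~{generation_seconds(2_000):.1f} s")
```

The context gap is 48,576 tokens, so it only matters for prompts that fall in that narrow band just above one million tokens.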
Reasoning and Language Strengths
Winner: Qwen Plus 0728 (thinking)
Qwen Plus 0728 (thinking) explicitly lists strong Chinese-English bilingual performance and a clear step-by-step reasoning style. DeepSeek V4 Flash instead highlights strong coding and STEM performance without bilingual or reasoning-style details.
Intelligence Index
Winner: DeepSeek V4 Flash
DeepSeek V4 Flash provides an intelligence_index of 46.5 while Qwen Plus 0728 (thinking) lists none. Direct comparison is not possible beyond this single reported value.
Qwen Plus 0728 (thinking)
Pros
- +Strong Chinese-English bilingual performance
- +Effective handling of very long inputs
- +Solid technical and coding assistance
- +Clear step-by-step reasoning style
Cons
- –Text-only modality
- –May still hallucinate on niche facts
- –Performance varies across domains
DeepSeek V4 Flash
Pros
- +Handles very large contexts effectively
- +Strong coding and STEM performance
- +Fast inference as a Flash variant
- +Cost-efficient for high-volume use
Cons
- –Text-only modality
- –May lag on nuanced creative tasks
- –Standard LLM hallucination risks
Summary: Qwen Plus 0728 (thinking) vs DeepSeek V4 Flash
Select DeepSeek V4 Flash when price, speed, and maximum context length are priorities given its concrete metrics. Choose Qwen Plus 0728 (thinking) when bilingual capabilities or explicit step-by-step reasoning matter most. Both models remain comparable on open-weight status and text-only limitations.
Frequently asked questions
On price, DeepSeek V4 Flash is the cheaper option: $0.18 per million tokens versus $0.78 per million tokens for Qwen Plus 0728 (thinking).