
Qwen Plus 0728 (thinking) vs DeepSeek V4 Flash

A side-by-side comparison of two LLMs: real specs, pricing, strengths and weaknesses, and a clear verdict on which to choose. Kept current by our agents.

Quick verdict: which should you choose?

Choose Qwen Plus 0728 (thinking) if you need

  • clear step-by-step reasoning style for complex tasks
  • strong Chinese-English bilingual performance
  • effective handling of very long inputs up to 1,000,000 tokens
  • solid technical and coding assistance with reasoning focus

Choose DeepSeek V4 Flash if you need

  • cost-efficient high-volume inference at $0.18 per million tokens
  • fast output at 103.73 tokens per second with Flash optimization
  • maximum context length of 1,048,576 tokens for large inputs
  • strong coding and STEM performance in documented use cases

Verdict

DeepSeek V4 Flash leads on every metric that is actually reported: an Intelligence Index of 46.5, an output speed of 103.73 t/s, and a price of $0.18/1M tokens against $0.78/1M for Qwen Plus 0728 (thinking), plus a marginally larger context window (1,048,576 vs 1,000,000 tokens). Qwen Plus 0728 (thinking) differentiates itself through its documented step-by-step reasoning style and strong Chinese-English bilingual performance, areas where no data is listed for DeepSeek V4 Flash. Both are open-weight, text-only models with similar context scales, but because Qwen Plus 0728 (thinking) has no reported Intelligence Index or output speed, the two cannot be compared directly on those measures.

Qwen Plus 0728 (thinking) vs DeepSeek V4 Flash: side by side

Spec | Qwen Plus 0728 (thinking) | DeepSeek V4 Flash | Winner
Intelligence | not listed | 46.5 | Tie
Output speed | not listed | 104 t/s | Tie
Output price | $0.78/1M | $0.18/1M | DeepSeek V4 Flash
Context | 1000K | 1049K | DeepSeek V4 Flash
Params | not listed | not listed | Tie
Type | Open-weight | Open-weight | Tie
Provider | Alibaba Qwen | DeepSeek | Tie

Detailed analysis

Pricing

Winner: DeepSeek V4 Flash

DeepSeek V4 Flash is listed at $0.18 per million tokens, while Qwen Plus 0728 (thinking) costs $0.78 per million tokens, roughly 4.3 times as much. Based on the provided pricing data, DeepSeek V4 Flash is the clear lower-cost option for high-volume usage.
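To make the price gap concrete, here is a minimal Python sketch that applies the two listed prices to a workload. The function name `output_cost` and the 500-million-token monthly volume are illustrative assumptions, not figures from either provider; only the two per-million prices come from this comparison.

```python
# Listed output prices in USD per one million tokens (from this comparison).
PRICE_PER_MILLION = {
    "DeepSeek V4 Flash": 0.18,
    "Qwen Plus 0728 (thinking)": 0.78,
}


def output_cost(model: str, tokens: int) -> float:
    """Return the listed cost in USD for generating `tokens` tokens."""
    return PRICE_PER_MILLION[model] * tokens / 1_000_000


# Hypothetical workload: 500 million tokens per month (an assumption).
monthly_tokens = 500_000_000
for model in PRICE_PER_MILLION:
    print(f"{model}: ${output_cost(model, monthly_tokens):,.2f} per month")
```

At that assumed volume the listed prices work out to about $90 per month for DeepSeek V4 Flash and about $390 for Qwen Plus 0728 (thinking). The sketch ignores input-token pricing and any provider-specific discounts, neither of which is listed here.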

Speed and Context

Winner: DeepSeek V4 Flash

DeepSeek V4 Flash reports 103.73 tokens per second output speed and 1,048,576-token context. Qwen Plus 0728 (thinking) has no speed data and a 1,000,000-token context, giving DeepSeek the edge on known speed and a slight context advantage.
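The two figures above can be turned into quick back-of-the-envelope checks. The sketch below is a rough illustration only: the helper names `fits_context` and `generation_seconds` are hypothetical, and it assumes that prompt and reserved output tokens share one context window and that the reported output speed holds steady, neither of which is confirmed by the data here.

```python
# Context window sizes in tokens, as listed in this comparison.
CONTEXT_WINDOW = {
    "DeepSeek V4 Flash": 1_048_576,  # equals 2**20
    "Qwen Plus 0728 (thinking)": 1_000_000,
}

# Reported output speed for DeepSeek V4 Flash; none is listed for Qwen.
DEEPSEEK_TOKENS_PER_SECOND = 103.73


def fits_context(model: str, prompt_tokens: int, reserved_output: int = 0) -> bool:
    """Check whether prompt plus reserved output fits the listed window.

    Assumes input and output share a single window (an assumption).
    """
    return prompt_tokens + reserved_output <= CONTEXT_WINDOW[model]


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate generation time, assuming a constant output speed."""
    return output_tokens / tokens_per_second


# Hypothetical prompt of 1,020,000 tokens: inside one window, outside the other.
for model in CONTEXT_WINDOW:
    print(model, fits_context(model, 1_020_000))

# Estimated time for a 2,000-token response at the reported speed.
print(f"{generation_seconds(2_000, DEEPSEEK_TOKENS_PER_SECOND):.1f} s")
```

The 48,576-token difference between the two windows only matters for prompts that already approach one million tokens, which is why the context advantage is described as slight.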

Reasoning and Language Strengths

Winner: Qwen Plus 0728 (thinking)

Qwen Plus 0728 (thinking) explicitly lists strong Chinese-English bilingual performance and a clear step-by-step reasoning style. DeepSeek V4 Flash instead highlights strong coding and STEM performance without bilingual or reasoning-style details.

Intelligence Index

Winner: DeepSeek V4 Flash

DeepSeek V4 Flash reports an Intelligence Index of 46.5, while Qwen Plus 0728 (thinking) lists none. DeepSeek V4 Flash therefore wins by default; with only one reported value, no direct comparison is possible.

Qwen Plus 0728 (thinking)

Pros

  • Strong Chinese-English bilingual performance
  • Effective handling of very long inputs
  • Solid technical and coding assistance
  • Clear step-by-step reasoning style

Cons

  • Text-only modality
  • May still hallucinate on niche facts
  • Performance varies across domains

DeepSeek V4 Flash

Pros

  • Handles very large contexts effectively
  • Strong coding and STEM performance
  • Fast inference as a Flash variant
  • Cost-efficient for high-volume use

Cons

  • Text-only modality
  • May lag on nuanced creative tasks
  • Standard LLM hallucination risks

Summary: Qwen Plus 0728 (thinking) vs DeepSeek V4 Flash

Select DeepSeek V4 Flash when price, speed, and maximum context length are the priorities, since it is the model with concrete figures for all three. Choose Qwen Plus 0728 (thinking) when bilingual capabilities or explicit step-by-step reasoning matter most. The two are on equal footing in being open-weight and text-only.

Frequently asked questions

DeepSeek V4 Flash is the cheaper model, at $0.18 per million tokens versus $0.78 per million tokens for Qwen Plus 0728 (thinking).
