What is the pricing for Seed-2.0-Mini?

Pricing listed via OpenRouter is $0.10 per 1M input tokens and $0.40 per 1M output tokens. Actual cost varies by provider and caching.

How can users access Seed-2.0-Mini?

It is available through Bytedance-seed's API and through aggregators such as OpenRouter, where the model slug is bytedance-seed/seed-2.0-mini.

What multimodal capabilities does Seed-2.0-Mini offer?

It supports long-context text processing, image understanding, video comprehension, multimodal reasoning, cross-modal generation, and mixed-media instruction following.

Seed-2.0-Mini

Verified

Efficient multimodal model for long-context text, image, and video tasks.

Bytedance-seed · Multimodal · Closed · Vision

Updated 2026-06-14

About Seed-2.0-Mini

Seed-2.0-Mini employs a unified architecture that ingests and reasons over mixed text, image, and video inputs. Its 262,144-token context window supports extended sequences such as full-length videos paired with detailed instructions or documents. The compact design balances capability with deployment efficiency, and the model remains proprietary.

Typical applications include video analysis, image captioning with narrative context, and multi-turn multimodal conversations. Developers integrate it into content platforms, creative tools, and enterprise systems requiring synchronized understanding of visual and textual data.

Capabilities

Long-context text processing

Image understanding and analysis

Video comprehension and summarization

Multimodal reasoning across inputs

Cross-modal content generation

Instruction following with mixed media

How Seed-2.0-Mini compares

Seed-2.0-Mini compared with other multimodal models on price.

Price

USD per 1M output tokens · Lower is better · Seed-2.0-Mini ranks #8 of 63

Ministral 3 8B 2512: $0.15
Ministral 3 14B 2512: $0.20
MiMo-V2.5: $0.28
Seed 1.6 Flash: $0.30
Voxtral Small 24B 2507: $0.30
Gemini 2.5 Flash Lite Preview 09-2025: $0.40
Seed-2.0-Mini: $0.40
Qwen3 VL 32B Instruct: $0.42
Mistral Small 4: $0.60
Qwen3 VL 235B A22B Instruct: $0.88
GLM 4.6V: $0.90
Qwen3.6 35B A3B: $0.97
Qwen3.6 Flash: $1.10

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Long-form Video Summarization

Seed-2.0-Mini excels at processing extended video inputs with accompanying text for detailed comprehension and summarization tasks.

Mixed-Media Instruction Following

The model handles complex instructions that combine text, images, and video, enabling accurate multimodal reasoning and cross-modal content generation.

Extended Document Analysis with Visuals

It supports long-context text processing alongside image understanding for in-depth analysis of documents containing both textual and visual elements.

Strengths & limitations

Strengths

+ Very large context window supports lengthy documents and conversations
+ Native handling of video alongside images and text
+ Unified processing of multiple modalities in one model

Limitations

– Mini size may limit depth on highly complex reasoning tasks
– Performance can vary when inputs approach the maximum context length
– No audio modality support

Cost calculator

Estimate what Seed-2.0-Mini would cost for your usage.

Inputs: input tokens per request, output tokens per request, and requests per month. Outputs: estimated cost per request and estimated cost per month.

Based on Seed-2.0-Mini's $0.10/1M input · $0.40/1M output. Estimate only — actual cost varies by provider and caching.
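The arithmetic behind the estimate can be sketched in a few lines. This is a minimal illustration that uses only the two rates quoted above; the helper name estimateCost and the sample usage numbers are assumptions made for the example, and it ignores caching and provider-specific differences.

```javascript
// Rates quoted on this page, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.10;
const OUTPUT_USD_PER_MILLION = 0.40;

// Hypothetical helper (not part of any SDK): estimates cost from token counts.
function estimateCost(inputTokens, outputTokens, requestsPerMonth) {
  const perRequest =
    (inputTokens * INPUT_USD_PER_MILLION +
      outputTokens * OUTPUT_USD_PER_MILLION) / 1e6;
  return { perRequest, perMonth: perRequest * requestsPerMonth };
}

// Sample usage (assumed numbers): 1,000 input tokens, 500 output tokens,
// 10,000 requests per month.
const example = estimateCost(1000, 500, 10000);
console.log(example.perRequest.toFixed(5)); // 0.00030
console.log(example.perMonth.toFixed(2));   // 3.00
```

Because output tokens cost four times as much as input tokens at these rates, response length tends to dominate the bill for generation-heavy workloads.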

Quick start

OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.

JavaScript · openai

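// Requires the "openai" npm package and an ES module context (top-level await).
import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY, // read the key from the environment
});

// Send a single-turn chat request to Seed-2.0-Mini.
const completion = await client.chat.completions.create({
  model: "bytedance-seed/seed-2.0-mini",
  messages: [{ role: "user", content: "Hello!" }],
});

// Print the text of the first returned choice.
console.log(completion.choices[0].message.content);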

Model slug: bytedance-seed/seed-2.0-mini
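Since the model is listed for image understanding, a request would typically pair text with an image. The sketch below only builds the message payload, using the OpenAI-style content-part format (a "text" part plus an "image_url" part). It assumes that this model accepts that format through OpenRouter, which this page does not confirm. The helper name buildImageMessages and the example URL are placeholders.

```javascript
// Hypothetical helper: builds an OpenAI-style chat message that pairs
// a text prompt with an image URL. No network call is made here.
function buildImageMessages(prompt, imageUrl) {
  return [
    {
      role: "user",
      content: [
        { type: "text", text: prompt },
        { type: "image_url", image_url: { url: imageUrl } },
      ],
    },
  ];
}

const messages = buildImageMessages(
  "Describe this image in one sentence.",
  "https://example.com/photo.jpg" // placeholder URL
);
console.log(JSON.stringify(messages, null, 2));

// To send it, pass the array to an OpenAI-compatible client:
// const completion = await client.chat.completions.create({
//   model: "bytedance-seed/seed-2.0-mini",
//   messages,
// });
```

Keeping payload construction separate from the API call makes the message shape easy to inspect before any tokens are spent.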

Editor's verdict

Our take on Seed-2.0-Mini

Seed-2.0-Mini is Bytedance-seed's proprietary multimodal model with a 262K-token context window.

At $0.40 per 1M output tokens, it ranks #8 of 63 multimodal models on output price, which makes it cost-efficient for its class.

It is available through Bytedance-seed's API and aggregators like OpenRouter.

It is best suited to workloads that need a very large context window for lengthy documents and conversations, along with native handling of video alongside images and text.

Frequently asked questions

What is the context window of Seed-2.0-Mini?

The model provides a context window of 262,144 tokens.
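A quick budget check can show whether a prompt is likely to fit in that window. The sketch below is a rough illustration under two assumptions that this page does not state: that text averages about 4 characters per token (a common heuristic, not the model's actual tokenizer), and that reserved output tokens count against the same window. The helper names are hypothetical.

```javascript
// Context window stated on this page.
const CONTEXT_WINDOW_TOKENS = 262144;

// Assumption: roughly 4 characters per token. Real counts depend on the
// model's tokenizer and on the language of the text.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Hypothetical helper: checks whether a prompt plus a reserved output
// budget fits inside the context window.
function fitsContext(promptText, maxOutputTokens) {
  const promptTokens = estimateTokens(promptText);
  const remaining = CONTEXT_WINDOW_TOKENS - promptTokens - maxOutputTokens;
  return { promptTokens, remaining, fits: remaining >= 0 };
}

// A 400,000-character prompt with 4,096 tokens reserved for the reply.
console.log(fitsContext("x".repeat(400000), 4096));
// { promptTokens: 100000, remaining: 158048, fits: true }
```

Image and video inputs also consume tokens, so a text-only estimate like this one is a lower bound for multimodal requests.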

Other Seed models

Sibling versions in the Seed family from Bytedance-seed.

Seed 1.6 Flash

Bytedance-seed · Multimodal

Verified

Fast multimodal model for text, image, and video inputs.

Closed · 262K ctx · $0.30/1M out

Seed 1.6

Bytedance-seed · Multimodal

Verified

Seed 1.6 processes image, text, and video with a 262k-token context.

Closed · 262K ctx · $2.00/1M out
