Gemini 2.5 Flash Lite
Google's fast, lightweight multimodal model for text, image, audio, and video tasks.
About Gemini 2.5 Flash Lite
Built as a streamlined member of the Gemini family, the model prioritizes speed and resource efficiency while accepting multiple input types. It can handle text, images, audio, and video together within one large context window. It is a proprietary, hosted model, so it suits scenarios that need rapid multimodal integration and do not require open-weight access.
Strengths center on balanced performance for real-time applications and large-scale input processing. Typical usage includes content analysis, media summarization, and interactive tools that combine visual, auditory, and textual data streams in a single query.
How Gemini 2.5 Flash Lite compares
Gemini 2.5 Flash Lite compared with other multimodal models on intelligence, speed, and price.
Price
USD per 1M output tokens · Lower is better · Gemini 2.5 Flash Lite ranks #16 of 124
Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).
Best for
Long-context document reasoning
Processes entire code repositories, research papers, or legal contracts up to 1M tokens to deliver coherent summaries and answers without chunking.
Video and vision analysis
Understands video clips and images to extract events, objects, and narratives while combining them with text or audio inputs.
Multimodal file workflows
Handles mixed files containing text, images, and audio to perform transcription, reasoning, and rapid text output in a single pass.
Strengths & limitations
Strengths
- Optimized for speed and efficiency
- Strong multimodal integration
- Handles very large contexts effectively
- Suitable for high-volume, lightweight tasks
Limitations
- Reduced depth on highly complex reasoning
- May underperform full-scale models on nuanced tasks
- Lite design trades capability for cost and latency
Cost calculator
Estimate what Gemini 2.5 Flash Lite would cost for your usage.
Based on Gemini 2.5 Flash Lite's $0.10/1M input · $0.40/1M output. Estimate only — actual cost varies by provider and caching.
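The arithmetic behind such an estimate can be sketched in a few lines. This is a minimal illustration that uses only the two prices quoted above; the function name `estimateCostUsd` and the sample token counts are hypothetical, and the sketch ignores caching and provider-specific pricing.

```typescript
// Prices quoted on this page, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.10;
const OUTPUT_USD_PER_MILLION = 0.40;
const TOKENS_PER_MILLION = 1_000_000;

// Hypothetical helper: linear cost estimate, no caching or discounts.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  const inputCost = (inputTokens / TOKENS_PER_MILLION) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / TOKENS_PER_MILLION) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Hypothetical workload: 2M input tokens and 500K output tokens.
const exampleCost = estimateCostUsd(2_000_000, 500_000);
console.log(`Estimated cost: $${exampleCost.toFixed(2)}`); // prints "Estimated cost: $0.40"
```

In this sample workload the input and output sides each contribute $0.20, which shows how quickly output tokens dominate the bill at four times the input price.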
Quick start
OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "google/gemini-2.5-flash-lite",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);
```

Model slug: google/gemini-2.5-flash-lite
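Because the model accepts images alongside text, the same request shape can carry a mixed message. The sketch below builds such a message using the content-part layout of the OpenAI chat completions format (`text` and `image_url` parts). The helper name `buildImageQuestion` and the example URL are hypothetical, and whether a given provider accepts a particular image URL or file type is an assumption to verify against its documentation.

```typescript
// Content-part shapes from the OpenAI-compatible chat completions format.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type UserMessage = { role: "user"; content: Array<TextPart | ImagePart> };

// Hypothetical helper: pairs a text question with one image reference.
function buildImageQuestion(question: string, imageUrl: string): UserMessage {
  return {
    role: "user",
    content: [
      { type: "text", text: question },
      { type: "image_url", image_url: { url: imageUrl } },
    ],
  };
}

// Placeholder URL for illustration only.
const imageMessage = buildImageQuestion(
  "What is shown in this image?",
  "https://example.com/photo.jpg",
);
console.log(JSON.stringify(imageMessage, null, 2));
```

To send it, pass `messages: [imageMessage]` to `client.chat.completions.create` with the model slug `google/gemini-2.5-flash-lite`.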
Editor's verdict
Gemini 2.5 Flash Lite is Google's proprietary multimodal model with a 1,048,576-token context window.
At $0.40 per 1M output tokens, it is very cost-efficient for its class.
It is available through Google's API and aggregators like OpenRouter.
It is best suited to workloads that need speed, efficiency, and strong multimodal integration.
Frequently asked questions
The model supports a context length of 1,048,576 tokens.
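A simple pre-flight check against that limit might look like the following sketch. It assumes the caller already has a token count from a real tokenizer; the function name `fitsInContext` and the optional headroom parameter are hypothetical, and how a given provider accounts for output tokens against the window is not stated here.

```typescript
// Context length stated on this page.
const CONTEXT_WINDOW_TOKENS = 1_048_576;

// Hypothetical helper: true if the prompt, plus any headroom the caller
// wants to keep free, stays within the stated context length.
function fitsInContext(promptTokens: number, headroomTokens: number = 0): boolean {
  return promptTokens + headroomTokens <= CONTEXT_WINDOW_TOKENS;
}

console.log(fitsInContext(900_000, 65_536));   // prints true
console.log(fitsInContext(1_000_000, 65_536)); // prints false
```

A failed check is the signal to trim or split the input before sending the request.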