Gemma 4 26B A4B
Google's open multimodal model for text, image, and video with a 262K-token context window.
About Gemma 4 26B A4B
The model combines a 26B parameter architecture with native support for image, text, and video modalities. Its 262144-token context window allows handling of long multimodal sequences in a single pass. As an open-weight release from Google, it provides direct access to weights for customization and local inference.
Typical usage includes multimodal content analysis, video understanding, and cross-modal generation tasks. Developers leverage its open nature for fine-tuning on domain-specific image-text-video datasets. The design emphasizes broad accessibility while maintaining strong performance on mixed-modality inputs.
How Gemma 4 26B A4B compares
Comparison chart: Gemma 4 26B A4B vs. other multimodal models on intelligence, speed, and price.
Price (USD per 1M output tokens, lower is better): Gemma 4 26B A4B ranks #14 of 124.
Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).
Best for
Extended Video Content Review
Processes hours of video input alongside transcripts to identify patterns, summarize events, and answer queries spanning the full duration.
Large-Scale Multimodal Reports
Integrates text, images, and charts from lengthy documents to generate reasoned summaries and extract cross-referenced insights.
Complex Instruction Execution
Follows detailed multi-step prompts that combine visual analysis with long-context text generation for tasks like research synthesis.
Strengths & limitations
Strengths
- Large 262K-token context window
- Native support for image, text, and video inputs
- Efficient 26B-scale architecture
Limitations
- No audio modality support
- May trail larger models on complex reasoning tasks
- Higher inference cost for video processing
Pricing by provider
Live per-provider pricing & uptime, routed via OpenRouter. Prices are USD per 1M tokens.
| Provider | Input /1M | Output /1M | Context | Uptime |
|---|---|---|---|---|
| DekaLLM (bf16) | $0.06 | $0.33 | 262K | 93.1% |
| DeepInfra (fp8) | $0.07 | $0.34 | 262K | 99.2% |
| Cloudflare | $0.10 | $0.30 | 256K | 99.8% |
| Ambient | $0.10 | $0.30 | 262K | — |
| SiliconFlow (fp8) | $0.12 | $0.40 | 262K | 99.9% |
| Parasail (bf16) | $0.13 | $0.40 | 262K | 99.4% |
| Novita (bf16) | $0.13 | $0.40 | 262K | 99.8% |
| NextBit (bf16) | $0.13 | $0.40 | 262K | 99.3% |
| — | $0.15 | $0.60 | 262K | 100.0% |
| Venice (bf16) | $0.16 | $0.50 | 256K | 99.6% |
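One way to read the table is to set an uptime floor and then rank the remaining providers by output price. The sketch below does that for the first five rows only; the `Provider` type and the `cheapestReliable` helper are illustrative names, not part of any OpenRouter API, and the figures are a snapshot copied from the table rather than live data.

```typescript
interface Provider {
  name: string;
  inputUsd: number; // USD per 1M input tokens
  outputUsd: number; // USD per 1M output tokens
  uptimePct: number | null; // null when the table lists no uptime
}

// Snapshot of the first five table rows (not live pricing).
const providers: Provider[] = [
  { name: "DekaLLM", inputUsd: 0.06, outputUsd: 0.33, uptimePct: 93.1 },
  { name: "DeepInfra", inputUsd: 0.07, outputUsd: 0.34, uptimePct: 99.2 },
  { name: "Cloudflare", inputUsd: 0.1, outputUsd: 0.3, uptimePct: 99.8 },
  { name: "Ambient", inputUsd: 0.1, outputUsd: 0.3, uptimePct: null },
  { name: "SiliconFlow", inputUsd: 0.12, outputUsd: 0.4, uptimePct: 99.9 },
];

// Hypothetical helper: keep providers at or above the uptime floor,
// then sort by output price, breaking ties on input price.
// Providers with no listed uptime are excluded.
function cheapestReliable(list: Provider[], minUptimePct: number): Provider[] {
  return list
    .filter((p) => p.uptimePct !== null && p.uptimePct >= minUptimePct)
    .sort((a, b) => a.outputUsd - b.outputUsd || a.inputUsd - b.inputUsd);
}

const ranked = cheapestReliable(providers, 99);
console.log(ranked.map((p) => p.name).join(", "));
```

Excluding rows with no listed uptime is a conservative choice; a caller who prefers to include them could treat `null` as passing the filter instead.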
Cost calculator
Estimate what Gemma 4 26B A4B would cost for your usage.
Based on Gemma 4 26B A4B's $0.06/1M input · $0.33/1M output. Estimate only — actual cost varies by provider and caching.
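The estimate is simple arithmetic: tokens divided by one million, multiplied by the per-million rate, summed over input and output. A minimal sketch, assuming the lowest listed rates quoted above and ignoring caching; `Rates` and `estimateCostUsd` are illustrative names, not part of any API.

```typescript
// Per-million-token rates in USD.
interface Rates {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Lowest listed rates for Gemma 4 26B A4B; other providers charge more.
const DEFAULT_RATES: Rates = { inputPerMillion: 0.06, outputPerMillion: 0.33 };

// Hypothetical helper: estimated cost in USD for a given token volume.
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  rates: Rates = DEFAULT_RATES,
): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  return (
    (inputTokens / 1_000_000) * rates.inputPerMillion +
    (outputTokens / 1_000_000) * rates.outputPerMillion
  );
}

// Example workload: 50M input tokens and 10M output tokens per month.
const monthly = estimateCostUsd(50_000_000, 10_000_000);
console.log(`Estimated monthly cost: $${monthly.toFixed(2)}`);
```

For that example workload the estimate is about $6.30 per month (50 × $0.06 plus 10 × $0.33). Passing a different `Rates` object gives the figure for another provider.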
Quick start
OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.
```typescript
import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// The model slug selects Gemma 4 26B A4B.
const completion = await client.chat.completions.create({
  model: "google/gemma-4-26b-a4b-it",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);
```

Model slug: google/gemma-4-26b-a4b-it
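The quick-start request sends text only. Since the model accepts image input, a request can also carry an image. The sketch below builds a user message in the OpenAI-style content-parts format (`text` plus `image_url`). That this model accepts the format on every provider is an assumption; `buildImageMessage` is a hypothetical helper and the image URL is a placeholder.

```typescript
// OpenAI-style content parts for a mixed text and image message.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface UserMessage {
  role: "user";
  content: ContentPart[];
}

// Hypothetical helper: pair a text prompt with one image URL.
function buildImageMessage(prompt: string, imageUrl: string): UserMessage {
  return {
    role: "user",
    content: [
      { type: "text", text: prompt },
      { type: "image_url", image_url: { url: imageUrl } },
    ],
  };
}

// Placeholder URL; replace it with a reachable image.
const message = buildImageMessage(
  "Describe this chart.",
  "https://example.com/chart.png",
);
console.log(JSON.stringify(message, null, 2));

// To send it, pass the message to the same client call as in the quick start:
//   client.chat.completions.create({
//     model: "google/gemma-4-26b-a4b-it",
//     messages: [message],
//   });
```

Building the message separately from the network call keeps the payload easy to inspect before any tokens are billed.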
Editor's verdict
Gemma 4 26B A4B is Google's open-weight multimodal model with a 262K-token context window.
At $0.33 per 1M output tokens, it is very cost-efficient for its class and is served by 10 providers.
Because it is open-weight, you can self-host it or call it through a hosted API.
Best suited to workloads that need its 262K-token context window and native support for image, text, and video inputs.
Frequently asked questions
The model supports a context length of 262144 tokens.
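In practice the limit matters as a budget. The sketch below assumes, as is typical for chat APIs, that prompt tokens and generated tokens share the same 262144-token window, and that token counts come from elsewhere (no tokenizer is included). `maxPromptTokens` and `fitsInContext` are illustrative names.

```typescript
// Context length stated for Gemma 4 26B A4B.
const CONTEXT_WINDOW_TOKENS = 262_144;

// Hypothetical helper: tokens left for the prompt after reserving room
// for the model's output. Assumes prompt and output share one window.
function maxPromptTokens(reservedOutputTokens: number): number {
  if (reservedOutputTokens < 0 || reservedOutputTokens > CONTEXT_WINDOW_TOKENS) {
    throw new RangeError("reserved output must be between 0 and the context window");
  }
  return CONTEXT_WINDOW_TOKENS - reservedOutputTokens;
}

// True when a prompt of the given size leaves the reserved output room.
function fitsInContext(promptTokens: number, reservedOutputTokens: number): boolean {
  return promptTokens <= maxPromptTokens(reservedOutputTokens);
}

console.log(maxPromptTokens(4_096)); // 258048
console.log(fitsInContext(300_000, 4_096)); // false
```

Note that 262144 = 256 × 1024, so "256K" (binary) and "262K" (decimal) can describe the same limit.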