How can I access Llama 4 Maverick?

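The open weights are available through Meta's official release channels and platforms that host Llama model weights. The model can also be called through hosted APIs such as OpenRouter, using the model slug meta-llama/llama-4-maverick.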

Is Llama 4 Maverick free to use?

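The weights are free to download. As with prior Llama releases from Meta, they are provided under an open license that permits research and commercial use subject to its terms. Hosted API access is billed per token: $0.15 per 1M input tokens and $0.60 per 1M output tokens at the OpenRouter price listed on this page.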

What modalities does Llama 4 Maverick support?

It is a multimodal model that accepts text and images as input and generates text. Other modalities, such as raw audio or video, are not supported.

Can Llama 4 Maverick be fine-tuned for custom tasks?

Yes, the open weights allow fine-tuning on domain-specific multimodal datasets using standard frameworks.

Llama 4 Maverick by Meta — Specs, Pricing, Benchmarks (2026)

About Llama 4 Maverick

Llama 4 Maverick features a multimodal architecture that integrates text and image processing. Its context window reaches 1,048,576 tokens, enabling analysis of lengthy inputs. The model is distributed with open weights by Meta.

Strengths include flexible handling of combined visual and textual data over extended sequences. Developers can access and modify the weights for specialized uses. This design promotes experimentation in multimodal scenarios.

Typical usage covers document understanding, visual question answering, and extended conversational agents. Researchers leverage it for projects needing both image interpretation and large-scale text context. The open-weight release facilitates community-driven improvements.

Capabilities

Long-context reasoning

Multimodal text and image understanding

Image analysis and description

Text generation and reasoning

Instruction following

Code generation

How Llama 4 Maverick compares

Llama 4 Maverick vs. other multimodal models on intelligence, speed, and price.

Price

USD per 1M output tokens · Lower is better · Llama 4 Maverick ranks #27 of 132

Gemini 2.5 Flash Lite Preview 09-2025: $0.40
Seed-2.0-Mini: $0.40
Qwen3 VL 32B Instruct: $0.42
Qwen3 VL 8B Instruct: $0.50
Qwen3 VL 30B A3B Instruct: $0.52
Mistral Small 3.1 24B: $0.55
Llama 4 Maverick: $0.60
Mistral Small 4: $0.60
Saba: $0.60
Qwen3 VL 235B A22B Instruct: $0.88
Codestral 2508: $0.90
GLM 4.6V: $0.90
Qwen3.6 35B A3B: $1.00

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Long-context multimodal research

Handles entire research papers or datasets exceeding 500k tokens while interpreting accompanying charts and diagrams in a single pass.

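Summarizing long recordings from transcripts and frames

The model takes text and images, not raw audio or video, so an hour-long recording must first be converted to a transcript and sampled frames. The long context then lets it summarize or index the whole recording in one pass.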

Complex visual reasoning over documents

Analyzes lengthy reports containing mixed text, tables, and images to extract insights without chunking or external tools.
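Whether a set of documents really fits "in a single pass" depends on its token count. The sketch below is a rough pre-flight check against the 1,048,576-token window quoted on this page. The 4-characters-per-token ratio, the 4,096-token output reserve, and the function names are all assumptions for illustration; an actual tokenizer will give different counts, images consume tokens that this estimate ignores, and a hosting provider may enforce a lower limit than the model's maximum.

```javascript
const CONTEXT_WINDOW_TOKENS = 1_048_576; // context length stated on this page
const CHARS_PER_TOKEN = 4;               // ASSUMPTION: rough average for English prose

// Crude token estimate based on character count only.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Check whether a list of text documents fits, leaving room for the reply.
// reservedOutputTokens is a hypothetical default, not a documented value.
function fitsInContext(documents, reservedOutputTokens = 4096) {
  const estimatedInputTokens = documents.reduce(
    (sum, doc) => sum + estimateTokens(doc),
    0
  );
  const budget = CONTEXT_WINDOW_TOKENS - reservedOutputTokens;
  return { estimatedInputTokens, budget, fits: estimatedInputTokens <= budget };
}

// Two placeholder documents of 2,000,000 and 400,000 characters.
const check = fitsInContext(["a".repeat(2_000_000), "b".repeat(400_000)]);
console.log(check);
```

If the estimate lands close to the budget, it is safer to count tokens with the real tokenizer before sending the request.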

Strengths & limitations

Strengths

+ Very large 1M token context window
+ Native multimodal support for text and images
+ Open weights from Meta
+ Strong general reasoning performance

Limitations

– High compute requirements for full context
– Limited to text and image modalities
– Potential for hallucinations on complex tasks

Cost calculator

Estimate what Llama 4 Maverick would cost for your usage.

Inputs: input tokens per request · output tokens per request · requests per month

$0.00045 per request

$4.50 estimated per month

Based on Llama 4 Maverick's $0.15/1M input · $0.60/1M output. Estimate only — actual cost varies by provider and caching.
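The arithmetic behind the estimate is simple enough to reproduce. The sketch below uses the two prices quoted above; the function names and the sample workload (1,000 input tokens, 500 output tokens, 10,000 requests per month) are assumptions chosen because they are consistent with the figures shown, not values taken from the calculator itself.

```javascript
// Prices in USD per 1M tokens, taken from the figures quoted above.
const PRICE_PER_MILLION = { input: 0.15, output: 0.60 };

// Cost of one request, in USD.
function costPerRequest(inputTokens, outputTokens, price = PRICE_PER_MILLION) {
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// Cost of a month of identical requests, in USD.
function costPerMonth(inputTokens, outputTokens, requestsPerMonth, price = PRICE_PER_MILLION) {
  return costPerRequest(inputTokens, outputTokens, price) * requestsPerMonth;
}

// Hypothetical workload: 1,000 input tokens, 500 output tokens, 10,000 requests per month.
const perRequest = costPerRequest(1000, 500);
const perMonth = costPerMonth(1000, 500, 10000);
console.log(`$${perRequest.toFixed(5)} per request, $${perMonth.toFixed(2)} per month`);
```

For this workload the result comes to about $0.00045 per request and $4.50 per month. Output tokens account for two thirds of the cost, so shortening responses saves more than trimming prompts of the same length.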

Quick start

OpenRouter's API is OpenAI-compatible, so most OpenAI SDKs work once the base URL is swapped. Only the model slug changes between models.

JavaScript (openai package)

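// Requires the "openai" npm package and an ES module context (top-level await).
import OpenAI from "openai";

// Point the OpenAI client at OpenRouter instead of the default OpenAI endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY, // read the key from the environment
});

// Send a single-turn chat request to Llama 4 Maverick.
const completion = await client.chat.completions.create({
  model: "meta-llama/llama-4-maverick",
  messages: [{ role: "user", content: "Hello!" }],
});

// Print the text of the first returned choice.
console.log(completion.choices[0].message.content);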

Model slug: meta-llama/llama-4-maverick
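Since the model accepts images as well as text, a request can pair a question with an image. The sketch below assumes the OpenAI-style content-parts message format ("text" and "image_url" parts) is accepted for this model on OpenRouter; the helper names and the example image URL are hypothetical. It uses the built-in fetch of Node 18 or later, so no SDK is needed.

```javascript
// Build an OpenAI-style chat message that pairs a question with one image.
function buildImageQuestion(question, imageUrl) {
  return [
    {
      role: "user",
      content: [
        { type: "text", text: question },
        { type: "image_url", image_url: { url: imageUrl } },
      ],
    },
  ];
}

// Send the request to the chat completions endpoint and return the reply text.
async function askAboutImage(question, imageUrl, apiKey) {
  const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "meta-llama/llama-4-maverick",
      messages: buildImageQuestion(question, imageUrl),
    }),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}

// Example payload only; the URL is a placeholder and no request is sent here.
const messages = buildImageQuestion(
  "What trend does this chart show?",
  "https://example.com/chart.png"
);
console.log(JSON.stringify(messages, null, 2));
```

Calling askAboutImage with a real image URL and the OPENROUTER_API_KEY value would send the request. Keeping the payload builder separate makes the message shape easy to inspect before any tokens are spent.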

Editor's verdict

Our take on Llama 4 Maverick

Llama 4 Maverick is Meta's open-weight multimodal model with a 1,048,576-token (roughly 1M) context window.

At $0.60 per 1M output tokens, it ranks #27 of 132 multimodal models on price, placing it in roughly the cheapest fifth.

As an open-weight model, it can be self-hosted or called through a hosted API.

It is best suited to workloads that need both the very large context window and native support for text and images.


Frequently asked questions

What is the context window size of Llama 4 Maverick?

The model supports a context length of 1,048,576 tokens.


Other Llama models

Sibling versions in the Llama family from Meta.

Llama 4 Scout

Meta · Multimodal

Verified

Meta's open multimodal model for long text and image sequences.

Open · 10000K ctx · $0.30/1M out

Similar models

Other multimodal models worth comparing.

Grok 4.20

xAI · Multimodal

Verified

Multimodal model with a 2 million token context window.

Closed · 2000K ctx · $2.50/1M out

GPT-5 Nano

OpenAI · Multimodal

Verified

Multimodal model handling long text, image, and file inputs.

Closed · 400K ctx · $0.40/1M out

Gemini 2.5 Flash Lite

Google · Multimodal

Verified

Google's fast, lightweight multimodal model for text, image, audio, and video tasks.

Closed · 1049K ctx · $0.40/1M out
