How do I access Llama 4 Scout?

Access is available via Meta's official AI platforms and approved partner APIs.

Is Llama 4 Scout free to use?

Meta offers Llama models under different licensing and usage plans; check the Meta AI site for current options.

What types of input does the multimodal model accept?

As a multimodal model, it accepts both text and image inputs, up to the supported context length.

Can Llama 4 Scout be fine-tuned for specific use cases?

Meta provides guidance on fine-tuning and deployment for enterprise applications on its developer resources.

Llama 4 Scout by Meta — Specs, Pricing, Benchmarks (2026)

About Llama 4 Scout

Llama 4 Scout uses a multimodal architecture designed by Meta to accept both text and image inputs. Its 10 million token context window allows the model to handle very long combined sequences without truncation. The weights are openly available for inspection and modification.

The design emphasizes flexibility for tasks that require sustained attention across large volumes of mixed media. Open-weight release lowers barriers for academic and commercial experimentation. Users can fine-tune or deploy the model in environments where data privacy or customization is important.

Typical applications include document analysis that incorporates diagrams, long-form visual storytelling, and research involving extensive image-text corpora. Developers often integrate it into pipelines that need to maintain coherence over thousands of pages or image collections.

Capabilities

Long-context reasoning

Multimodal text and image understanding

Vision-language analysis

Large-scale document processing

Complex instruction following

How Llama 4 Scout compares

Llama 4 Scout compared with other multimodal models on intelligence, speed, and price.

Price

USD per 1M output tokens · Lower is better · Llama 4 Scout ranks #11 of 124

Qwen3.5-9B: $0.15
Gemma 3 12B: $0.15
Ministral 3 14B 2512: $0.20
Mistral Small 3.2 24B: $0.20
Qwen3.5-Flash: $0.26
MiMo-V2.5: $0.28
Llama 4 Scout: $0.30
Seed 1.6 Flash: $0.30
Voxtral Small 24B 2507: $0.30
Gemma 4 26B A4B: $0.33
Gemma 4 31B: $0.35
Gemini 2.5 Flash Lite: $0.40
GPT-4.1 Nano: $0.40

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Long-Document Multimodal Analysis

Llama 4 Scout excels at ingesting and reasoning over entire books or research archives that combine text with images, thanks to its 10 million token context window.

Extended Multimodal Reasoning Tasks

The model handles complex queries that span thousands of pages of mixed text and visual data, such as reviewing technical manuals with diagrams in one pass.

Large-Scale Knowledge Integration

It supports synthesizing insights across massive multimodal collections like corporate archives containing reports, charts, and photographs.
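Before sending a large archive in one pass, it can help to check roughly whether it fits the 10 million token window. The sketch below is only a back-of-the-envelope estimate: the 4-characters-per-token ratio, the per-image token cost, and the output reservation are all assumptions made for illustration, not figures from Meta or any provider, and a hosted endpoint may expose a smaller window than the model's maximum.

```javascript
// Context window stated in this listing.
const CONTEXT_WINDOW_TOKENS = 10_000_000;

// ASSUMPTIONS (hypothetical values, not from the listing):
const ASSUMED_CHARS_PER_TOKEN = 4;     // rough heuristic, not a real tokenizer
const ASSUMED_TOKENS_PER_IMAGE = 1000; // placeholder; actual cost depends on the provider

// Estimate the prompt size of a mixed text-and-image corpus.
function estimateTokens({ textChars, imageCount }) {
  const textTokens = Math.ceil(textChars / ASSUMED_CHARS_PER_TOKEN);
  return textTokens + imageCount * ASSUMED_TOKENS_PER_IMAGE;
}

// Check the estimate against the window, leaving room for the reply.
function fitsInContext(corpus, reservedForOutput = 4096) {
  const needed = estimateTokens(corpus) + reservedForOutput;
  return { needed, fits: needed <= CONTEXT_WINDOW_TOKENS };
}

// Example: about 6 million characters of text plus 200 images.
const check = fitsInContext({ textChars: 6_000_000, imageCount: 200 });
console.log(check); // { needed: 1704096, fits: true }
```

For a real pipeline, replace the heuristic with token counts reported by the provider you call.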

Strengths & limitations

Strengths

+ Extremely large context window
+ Native multimodal input support
+ Strong reasoning over long inputs

Limitations

– High compute cost at maximum context
– Limited to text and image inputs
– Possible latency on very long sequences

Cost calculator

Estimate what Llama 4 Scout would cost for your usage.

Inputs: input tokens per request, output tokens per request, and requests per month.

Sample estimate: $0.00025 per request, $2.50 per month.

Based on Llama 4 Scout's $0.10/1M input · $0.30/1M output. Estimate only — actual cost varies by provider and caching.
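The arithmetic behind this estimate is simple enough to reproduce. The sketch below uses the listed prices ($0.10 per 1M input tokens, $0.30 per 1M output tokens); the function name and the sample workload of 1,000 input tokens, 500 output tokens, and 10,000 requests per month are assumptions chosen for illustration. That workload is one combination that yields $0.00025 per request and $2.50 per month.

```javascript
// Prices in USD per 1M tokens, as listed on this page.
const PRICE_PER_MILLION = { input: 0.10, output: 0.30 };

// Hypothetical helper: estimate per-request and monthly cost.
function estimateCost({ inputTokens, outputTokens, requestsPerMonth }) {
  const perRequest =
    (inputTokens * PRICE_PER_MILLION.input +
      outputTokens * PRICE_PER_MILLION.output) / 1_000_000;
  return { perRequest, perMonth: perRequest * requestsPerMonth };
}

// Assumed sample workload (not stated on the page).
const estimate = estimateCost({
  inputTokens: 1000,
  outputTokens: 500,
  requestsPerMonth: 10000,
});

console.log(estimate.perRequest.toFixed(5)); // 0.00025
console.log(estimate.perMonth.toFixed(2));   // 2.50
```

As the note above says, actual cost varies by provider and caching, so treat the result as an upper-level estimate only.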

Quick start

OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.

JavaScript · openai

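// Top-level await requires an ES module (a .mjs file or "type": "module").
import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY, // key is read from the environment
});

// Send a single-turn chat request to Llama 4 Scout.
const completion = await client.chat.completions.create({
  model: "meta-llama/llama-4-scout",
  messages: [{ role: "user", content: "Hello!" }],
});

// Print the text of the first returned choice.
console.log(completion.choices[0].message.content);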

Model slug: meta-llama/llama-4-scout
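Since the model accepts images as well as text, a request can mix both in a single message. The sketch below builds such a request body using the OpenAI-style content-parts format (`type: "text"` and `type: "image_url"`). It assumes the provider serving this slug accepts that format; the helper name `buildVisionRequest` and the example URL are hypothetical.

```javascript
// Hypothetical helper: build a chat request that pairs a question with an image.
// Assumes the endpoint accepts OpenAI-style content parts for this model.
function buildVisionRequest(question, imageUrl) {
  return {
    model: "meta-llama/llama-4-scout",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: question },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}

// Placeholder URL for illustration only.
const request = buildVisionRequest(
  "What does this diagram show?",
  "https://example.com/diagram.png"
);

console.log(JSON.stringify(request, null, 2));
```

With a client configured as in the quick start, the object can be passed directly: `await client.chat.completions.create(request)`.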

Editor's verdict

Our take on Llama 4 Scout

Llama 4 Scout is Meta's open-weight multimodal model with a 10 million token context window.

At $0.30 per 1M output tokens, it ranks #11 of 124 multimodal models on output price.

Because it is open-weight, you can self-host it or call it through a hosted API.

It is best suited to workloads that need an extremely large context window and native multimodal input.

Frequently asked questions

What context length does Llama 4 Scout support?

Llama 4 Scout provides a context window of 10,000,000 tokens.

Other Llama models

Sibling versions in the Llama family from Meta.

Llama 4 Maverick

Meta · Multimodal

Verified

Meta's open multimodal model for long-context text and image tasks.

Open · 1049K ctx · $0.60/1M out

Similar models

Other multimodal models worth comparing.

Gemini 2.5 Flash Lite

Google · Multimodal

Verified

Google's fast, lightweight multimodal model for text, image, audio, and video tasks.

Closed · 1049K ctx · $0.40/1M out

Gemini 2.5 Pro

Google · Multimodal

Verified

Google's multimodal model for long-context reasoning across media types.

Closed · 1049K ctx · $10.00/1M out

GPT-5.5 Pro

OpenAI · Multimodal

Verified

Multimodal model handling over a million tokens of context.

Closed · 1050K ctx · $180.00/1M out
