How is Llama 3.3 70B Instruct accessed?

It is available via Meta releases, Hugging Face, and compatible inference platforms that enable tool use.

What is the pricing for using Llama 3.3 70B Instruct?

The model weights are released openly by Meta, allowing free local or self-hosted use; third-party API access may involve separate provider fees.

Can Llama 3.3 70B Instruct perform function calling?

Yes, it supports tool use and function calling alongside its other capabilities.

Llama 3.3 70B Instruct

What types of tasks suit Llama 3.3 70B Instruct best?

It is designed for long-context reasoning, code generation, complex instruction following, and multilingual text generation.

Verified

Meta's open-weight LLM excels at instruction-following and versatile text tasks.

Meta · Language Models · Open


Updated 2026-06-15

About Llama 3.3 70B Instruct

The model follows the transformer-based Llama architecture, refined through Meta's iterative training process. It offers a 131,072-token context window for extended inputs and remains text-only. Instruction fine-tuning improves its ability to interpret and carry out user directives accurately.

Open weights enable full local deployment, customization, and fine-tuning without external dependencies. This design supports broad experimentation across hardware setups and promotes transparency in model behavior. Its strengths lie in reliable text generation and adaptability to diverse prompts.

Common uses include powering conversational interfaces, drafting documents, and supporting coding workflows. Researchers apply it to study scaling effects and alignment techniques in open models. The instruct version suits production chatbots and creative writing assistants alike.

Capabilities

Long-context reasoning

Code generation

Complex instruction following

Multilingual text generation

Logical problem-solving

Tool use and function calling

How Llama 3.3 70B Instruct compares

Llama 3.3 70B Instruct vs other language models on intelligence, speed and price.

Price

USD per 1M output tokens · Lower is better · Llama 3.3 70B Instruct ranks #34 of 141

Qwen3 Coder 30B A3B Instruct · $0.27
Qwen3 32B · $0.28
R1 Distill Qwen 32B · $0.29
MiMo-V2-Flash · $0.30
Step 3.5 Flash · $0.30
gpt-oss-safeguard-20b · $0.30
Llama 3.3 70B Instruct · $0.32
Llama 3.2 3B Instruct · $0.34
DeepSeek V3.2 · $0.34
Phi 4 Mini Instruct · $0.35
Llama 3.1 70B Instruct · $0.40
GLM 4.7 Flash · $0.40
Qwen3 30B A3B Thinking 2507 · $0.40

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Long-Document Reasoning

The model processes inputs up to 131072 tokens to perform detailed analysis, summarization, and logical inference across entire documents or code repositories.

Software Development Workflows

It generates, debugs, and refines code while handling complex instructions and integrating tool use or function calling in development pipelines.

Multilingual Instruction Tasks

The model follows nuanced prompts to produce accurate text in multiple languages, supporting logical problem-solving and content adaptation.
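For long-document work, it can help to check a prompt against the 131072-token window before sending it. The sketch below is only a rough budget check: the 4-characters-per-token ratio is an assumed heuristic for English text, not the model's actual tokenizer, and the function names and the default output reserve are hypothetical.

```javascript
// Context window size as stated on this page.
const CONTEXT_WINDOW = 131072;
// Assumption: roughly 4 characters per token. Real counts depend on the tokenizer.
const CHARS_PER_TOKEN = 4;

// Estimates the token count of a string from its length.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Reports whether the prompt plus a reserved output budget fits in the window.
function fitsInContext(text, reservedOutputTokens = 1024) {
  const promptTokens = estimateTokens(text);
  return {
    promptTokens,
    fits: promptTokens + reservedOutputTokens <= CONTEXT_WINDOW,
  };
}

// Splits text into pieces whose estimated size is at most maxTokens each.
function chunkText(text, maxTokens) {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  const chunks = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}
```

If a document does not fit, chunkText offers one simple fallback: split it and summarize the pieces separately. For exact counts, a real tokenizer for the model would be needed.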

Strengths & limitations

Strengths

+Strong reasoning and instruction adherence
+Effective long-context handling
+Solid coding and analysis performance
+Open-weight accessibility

Limitations

–Text-only modality
–Can produce hallucinations
–No native real-time knowledge

Cost calculator

Estimate what Llama 3.3 70B Instruct would cost for your usage.

Input tokens / request · Output tokens / request · Requests / month

$0.00026 per request

$2.60 estimated / month

Based on Llama 3.3 70B Instruct's $0.10/1M input · $0.32/1M output. Estimate only — actual cost varies by provider and caching.
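The arithmetic behind such an estimate can be sketched as follows. The two rates are the ones quoted above; the usage numbers (1,000 input tokens, 500 output tokens, 10,000 requests per month) are assumptions chosen for illustration, being one combination that happens to reproduce the displayed figures. The function name is hypothetical.

```javascript
// Rates quoted on this page, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.10;
const OUTPUT_USD_PER_MILLION = 0.32;

// Returns the estimated cost per request and per month, in USD.
function estimateCost({ inputTokens, outputTokens, requestsPerMonth }) {
  const perRequest =
    (inputTokens * INPUT_USD_PER_MILLION +
      outputTokens * OUTPUT_USD_PER_MILLION) /
    1_000_000;
  return { perRequest, perMonth: perRequest * requestsPerMonth };
}

// Assumed usage, for illustration only.
const estimate = estimateCost({
  inputTokens: 1000,
  outputTokens: 500,
  requestsPerMonth: 10000,
});

console.log(estimate.perRequest.toFixed(5)); // "0.00026"
console.log(estimate.perMonth.toFixed(2)); // "2.60"
```

As the note above says, actual cost varies by provider and caching, so this is a lower-fidelity planning figure rather than a bill.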

Quick start

OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.

JavaScript · openai

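import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter instead of the default OpenAI endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Send a single-turn chat request to Llama 3.3 70B Instruct.
const completion = await client.chat.completions.create({
  model: "meta-llama/llama-3.3-70b-instruct",
  messages: [{ role: "user", content: "Hello!" }],
});

// Print the text of the first (and here only) choice.
console.log(completion.choices[0].message.content);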

Model slug: meta-llama/llama-3.3-70b-instruct
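Since the page lists tool use and function calling among the model's capabilities, here is a minimal sketch of how a tool-calling request might be shaped for an OpenAI-compatible endpoint. The get_weather tool, its stub handler, and the helper names buildRequest and runToolCalls are all hypothetical. The message shapes follow the OpenAI chat-completions tool format; whether a particular provider passes tool calls through for this model is worth verifying.

```javascript
// Hypothetical tool definition in the OpenAI-compatible "tools" format.
const tools = [
  {
    type: "function",
    function: {
      name: "get_weather", // hypothetical tool name
      description: "Look up the current weather for a city.",
      parameters: {
        type: "object",
        properties: { city: { type: "string" } },
        required: ["city"],
      },
    },
  },
];

// Local implementations keyed by tool name. This one is a stub.
const toolHandlers = {
  get_weather: ({ city }) => ({ city, forecast: "unknown (stub)" }),
};

// Builds the request body to pass to client.chat.completions.create(...).
function buildRequest(userText) {
  return {
    model: "meta-llama/llama-3.3-70b-instruct",
    messages: [{ role: "user", content: userText }],
    tools,
  };
}

// Runs each tool call on an assistant message and returns "tool" role replies.
function runToolCalls(assistantMessage) {
  return (assistantMessage.tool_calls ?? []).map((call) => {
    const handler = toolHandlers[call.function.name];
    if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
    const args = JSON.parse(call.function.arguments); // arguments arrive as a JSON string
    return {
      role: "tool",
      tool_call_id: call.id,
      content: JSON.stringify(handler(args)),
    };
  });
}
```

In use, the assistant message and the replies from runToolCalls would be appended to messages and sent in a second request, so the model can write its final answer from the tool results.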

Editor's verdict

Our take on Llama 3.3 70B Instruct

Llama 3.3 70B Instruct is Meta's open-weight language model with a 131K-token context window.

At $0.32 per 1M output tokens, it ranks #34 of 141 models on output price, which makes it cost-efficient for its class.

Because it is open-weight, you can self-host it or call it through a hosted API.

It is best suited to tasks that need strong reasoning, instruction adherence, and effective long-context handling.


Frequently asked questions

What is the context window of Llama 3.3 70B Instruct?

The model provides a context window of 131,072 tokens.


Other Llama models

Sibling versions in the Llama family from Meta.

Llama 4 Maverick

Meta · Multimodal

Verified

Meta's open multimodal model for long-context text and image tasks.

Open · II 18.4 · 1049K ctx · $0.60/1M out

Llama 4 Scout

Meta · Multimodal

Verified

Meta's open multimodal model for long text and image sequences.

Open · II 13.5 · 10000K ctx · $0.30/1M out

Llama Guard 4 12B

Meta · Multimodal

Verified

Meta's open multimodal model for safety classification of text and images.

Open · 164K ctx · $0.18/1M out

Llama 3.2 3B Instruct

Meta · Language Models

Verified

Compact open-weight model for efficient instruction following and chat.

Open · 131K ctx · $0.34/1M out

Llama 3.1 8B Instruct

Meta · Language Models

Verified

Meta's efficient open model for instruction following and chat.

Open · 131K ctx · $0.03/1M out

Llama 3.2 1B Instruct

Meta · Language Models

Verified

Meta's compact 1B Llama model for fast, efficient instruction following.

Open · 131K ctx · $0.20/1M out
