R1 Distill Llama 70B

Verified

DeepSeek's distilled Llama 70B for advanced reasoning.

DeepSeek · Language Models · Open · Intelligence Index 16
Model page
Updated 2026-06-15

About R1 Distill Llama 70B

This model distills DeepSeek R1's reasoning capabilities into a Llama 70B model. The approach preserves R1's core strengths while making them broadly accessible through open weights. The model is text-only: it accepts and produces text and takes no other input types.

Its 128,000-token context window supports extended documents and multi-turn interactions without losing coherence. Open-weight availability facilitates local deployment, fine-tuning, and integration into custom pipelines. The design emphasizes efficiency for text generation workloads.

Common uses include analytical tasks, content creation, and technical documentation support. Researchers apply it in academic settings while developers integrate it into applications needing reliable language models. The 70B scale balances performance with practical resource demands.

Capabilities

Long-context reasoning
Code generation
Mathematical problem solving
Step-by-step reasoning
Instruction following
Technical text analysis
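
Step-by-step reasoning usually shows up as a block of intermediate thinking before the final answer. The sketch below assumes the reasoning arrives inline in the message content, wrapped in `<think>` tags; whether a given provider returns it that way, in a separate field, or not at all is an assumption to verify. `splitReasoning` is a hypothetical helper, not part of any SDK.

```javascript
// Hypothetical helper: separates an inline <think>...</think> block from the answer.
// Assumption: reasoning is embedded in the content string. If no <think> block
// is present, the whole string is treated as the answer.
function splitReasoning(content) {
  const match = content.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) {
    return { reasoning: "", answer: content.trim() };
  }
  const reasoning = match[1].trim();
  const answer = content.replace(match[0], "").trim();
  return { reasoning, answer };
}

const sample = "<think>2 + 2 is 4, times 3 is 12.</think>\nThe result is 12.";
const { reasoning, answer } = splitReasoning(sample);
console.log(reasoning); // 2 + 2 is 4, times 3 is 12.
console.log(answer);    // The result is 12.
```

Keeping the two parts separate lets an application log the reasoning for debugging while showing users only the answer.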

Benchmarks & performance

Independent evaluation scores and measured speed.

Intelligence Index: 16
Coding Index: 11.4
Tokens / sec: 49
Time to first token: 0.61s

Source: Artificial Analysis
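
The two speed figures combine into a rough end-to-end response time: time to first token plus output length divided by throughput. The sketch below assumes throughput stays constant after the first token and uses a hypothetical 500-token response; real latency varies by provider and load, and any reasoning tokens the model emits count toward the output length.

```javascript
// Measured figures from the section above.
const TOKENS_PER_SECOND = 49;
const TIME_TO_FIRST_TOKEN_S = 0.61;

// Rough estimate: wait for the first token, then stream the rest at a
// constant rate (an assumption; real throughput fluctuates).
function estimateResponseSeconds(outputTokens) {
  return TIME_TO_FIRST_TOKEN_S + outputTokens / TOKENS_PER_SECOND;
}

// Hypothetical 500-token response.
console.log(estimateResponseSeconds(500).toFixed(1)); // "10.8"
```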

How R1 Distill Llama 70B compares

R1 Distill Llama 70B vs other language models on intelligence, speed and price.

Intelligence

Artificial Analysis Intelligence Index · Higher is better · R1 Distill Llama 70B ranks #49 of 67

GLM 4.7 Flash: 22
Qwen3 Next 80B A3B Instruct: 20
Qwen3 Coder 30B A3B Instruct: 20
Qwen3 235B A22B: 20
R1 Distill Qwen 32B: 17
DeepSeek V3: 17
R1 Distill Llama 70B: 16
Qwen2.5 72B Instruct: 16
Qwen3 32B: 15
Command A: 14
Mistral Large 2407: 13
Qwen2.5 Coder 32B Instruct: 13
Qwen3 14B: 13

Speed

Output tokens per second · Higher is better · R1 Distill Llama 70B ranks #38 of 45

Qwen3 Max: 62
Qwen3 235B A22B Instruct 2507: 60
Qwen3 235B A22B: 60
GLM 4.6: 59
Qwen3 Max Thinking: 52
MiMo-V2.5-Pro: 52
R1 Distill Llama 70B: 49
GLM 4.5: 49
Qwen3.6 Max Preview: 45
MiniMax M2.7: 45
Qwen3 8B: 39
Phi 4: 33
Kimi K2 0905: 25

Price

USD per 1M output tokens · Lower is better · R1 Distill Llama 70B ranks #73 of 141

Qwen Plus 0728: $0.78
Qwen3 Next 80B A3B Thinking: $0.78
DeepSeek V3.1: $0.79
Qwen3 Coder Next: $0.80
DeepSeek V3: $0.80
R1 Distill Llama 70B: $0.80
Skyfall 36B V2: $0.80
Coder Large: $0.80
Trinity Large Thinking: $0.85
Llama 3.1 Euryale 70B v2.2: $0.85
GLM 4.5 Air: $0.85
DeepSeek V4 Pro: $0.87

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Long technical document analysis

The model processes and extracts insights from extensive technical texts within its 128,000-token context window while maintaining coherence across large inputs.
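
Before sending a long document, it can help to check whether it is likely to fit in the window. The sketch below uses a rough heuristic of about 4 characters per token for English text and reserves a hypothetical 4,000 tokens for the reply; both numbers are assumptions, and an exact count requires the model's own tokenizer.

```javascript
const CONTEXT_WINDOW_TOKENS = 128000; // from the model page
const CHARS_PER_TOKEN = 4;            // assumption: rough average for English text

// Approximate token count from character length.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// True if the estimated prompt plus the reserved reply budget fits the window.
// reservedOutputTokens is a hypothetical default, not a documented limit.
function fitsInContext(text, reservedOutputTokens = 4000) {
  return estimateTokens(text) + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS;
}

const doc = "x".repeat(400000);   // stand-in for a long document
console.log(estimateTokens(doc)); // 100000
console.log(fitsInContext(doc));  // true
```

Documents that fail the check would need to be split or summarized before being sent.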

Complex code generation projects

It generates, debugs, and refines substantial codebases by combining code generation with step-by-step reasoning and instruction following.

Advanced mathematical problem solving

Users apply it to multi-step math challenges that require detailed explanations and long-context reasoning for accurate solutions.

Strengths & limitations

Strengths

  • Efficient reasoning via distillation
  • Strong coding and math performance
  • Handles 128k token contexts
  • Text-only optimization

Limitations

  • Text modality only
  • Distilled model may lag full-scale versions
  • No built-in multimodal abilities

Cost calculator

Estimate what R1 Distill Llama 70B would cost for your usage.

$0.00120 per request
$12 estimated / month

Based on R1 Distill Llama 70B's pricing of $0.80 per 1M input tokens and $0.80 per 1M output tokens. Estimate only; actual cost varies by provider and caching.
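
The arithmetic behind an estimate like this is simple enough to reproduce. The sketch below uses the listed prices; the workload (1,000 input tokens, 500 output tokens, 10,000 requests per month) is a hypothetical example that happens to match the figures above, since the page does not state the calculator's inputs.

```javascript
// Listed prices in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.80;
const OUTPUT_USD_PER_MILLION = 0.80;

// Cost of one request in USD.
function costPerRequest(inputTokens, outputTokens) {
  return (
    (inputTokens * INPUT_USD_PER_MILLION +
      outputTokens * OUTPUT_USD_PER_MILLION) / 1e6
  );
}

// Cost of a month of identical requests in USD.
function monthlyCost(inputTokens, outputTokens, requestsPerMonth) {
  return costPerRequest(inputTokens, outputTokens) * requestsPerMonth;
}

// Hypothetical workload: 1,000 input + 500 output tokens, 10,000 requests/month.
console.log(costPerRequest(1000, 500).toFixed(5));     // "0.00120"
console.log(monthlyCost(1000, 500, 10000).toFixed(2)); // "12.00"
```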

Quick start

OpenRouter's API is OpenAI-compatible, so most OpenAI SDKs work once you swap the base URL. Only the model slug changes between models.

JavaScript · openai
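// Uses the official OpenAI SDK, pointed at OpenRouter's endpoint.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY, // read the key from the environment
});

// Top-level await requires an ES module (for example, a .mjs file).
const completion = await client.chat.completions.create({
  model: "deepseek/deepseek-r1-distill-llama-70b",
  messages: [{ role: "user", content: "Hello!" }],
});

// Print the assistant's reply.
console.log(completion.choices[0].message.content);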

Model slug: deepseek/deepseek-r1-distill-llama-70b

Editor's verdict

Our take on R1 Distill Llama 70B

R1 Distill Llama 70B is DeepSeek's open-weight language model with a 128K-token context window.

In independent testing it scores 16 on the Artificial Analysis Intelligence Index and runs at roughly 49 tokens per second, with about 0.61s to first token.

At $0.80 per 1M output tokens, it sits mid-range on price, ranking #73 of 141 models in the comparison above.

Because it is an open-weight model, you can self-host it or call it through a hosted API.

It is best suited to reasoning-heavy coding and math tasks, where the distilled R1 reasoning is its main strength.

Frequently asked questions

The model handles a context window of 128,000 tokens.
