Where can pricing details for this model be found?

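This page lists $0.80 per 1M input tokens and $0.80 per 1M output tokens, with prices sourced from OpenRouter. Actual cost varies by provider, so check your provider's pricing page for current rates.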

How can users access R1 Distill Llama 70B?

As an open-weight model, it can be self-hosted or called through a hosted API such as OpenRouter, using the model slug deepseek/deepseek-r1-distill-llama-70b.

Is the model suitable for instruction-following tasks?

Yes, it supports instruction following along with technical text analysis and related capabilities.

R1 Distill Llama 70B by DeepSeek — Specs, Pricing, Benchmarks (2026)

About R1 Distill Llama 70B

This model results from distilling DeepSeek R1 capabilities into the Llama 70B base architecture. The approach preserves core strengths while enabling broad accessibility through open weights. It operates exclusively in text modality without additional input types.

Its 128K-token context window accommodates long documents and extended multi-turn interactions. Open-weight availability allows local deployment, fine-tuning, and integration into custom pipelines. The design emphasizes efficiency for text generation workloads.

Common uses include analytical tasks, content creation, and technical documentation support. Researchers apply it in academic settings while developers integrate it into applications needing reliable language models. The 70B scale balances performance with practical resource demands.

Capabilities

Long-context reasoning

Code generation

Mathematical problem solving

Step-by-step reasoning

Instruction following

Technical text analysis
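As an illustration of the step-by-step reasoning capability: R1-style distilled models commonly emit their reasoning inside a <think>...</think> block ahead of the final answer. The sketch below assumes that output format; the helper name splitReasoning is hypothetical, and some hosted APIs may strip the block or return it in a separate field instead.

```javascript
// Hypothetical helper: separate the reasoning block from the final answer.
// Assumes the raw completion contains at most one <think>...</think> block.
function splitReasoning(rawOutput) {
  const match = rawOutput.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) {
    // No reasoning block found: treat the whole output as the answer.
    return { reasoning: "", answer: rawOutput.trim() };
  }
  return {
    reasoning: match[1].trim(),
    // Everything after the closing tag is treated as the answer.
    answer: rawOutput.slice(match.index + match[0].length).trim(),
  };
}

const parsed = splitReasoning("<think>2 + 2 is 4</think>\nThe answer is 4.");
console.log(parsed.answer); // The answer is 4.
```

Keeping the reasoning separate lets an application log it for debugging while showing only the answer to end users.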

Benchmarks & performance

Independent evaluation scores and measured speed.

Intelligence Index: 16
Coding Index: 11.4
Tokens / sec: 49
Time to first token: 0.61s

Source: Artificial Analysis
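To put the speed figures in practical terms, a rough response-time estimate can be derived from the time to first token and the output rate. This is a minimal sketch that assumes generation proceeds at a constant rate after the first token; the function name estimateResponseSeconds is illustrative, and real latency varies by provider and load.

```javascript
// Figures reported above (Artificial Analysis); treat them as approximate.
const TOKENS_PER_SECOND = 49;
const TIME_TO_FIRST_TOKEN_S = 0.61;

// Simplified model: wait for the first token, then stream at a constant rate.
function estimateResponseSeconds(outputTokens) {
  if (!Number.isFinite(outputTokens) || outputTokens < 0) {
    throw new RangeError("outputTokens must be a non-negative number");
  }
  return TIME_TO_FIRST_TOKEN_S + outputTokens / TOKENS_PER_SECOND;
}

console.log(estimateResponseSeconds(500).toFixed(1)); // 10.8
```

Under these assumptions, a 500-token reply takes a little under 11 seconds, which matters for reasoning-heavy prompts that produce long outputs.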

How R1 Distill Llama 70B compares

R1 Distill Llama 70B compared with other language models on intelligence, speed and price.

Intelligence

Artificial Analysis Intelligence Index · Higher is better · R1 Distill Llama 70B ranks #49 of 67

GLM 4.7 Flash: 22
Qwen3 Next 80B A3B Instruct: 20
Qwen3 Coder 30B A3B Instruct: 20
Qwen3 235B A22B: 20
R1 Distill Qwen 32B: 17
DeepSeek V3: 17
R1 Distill Llama 70B: 16
Qwen2.5 72B Instruct: 16
Qwen3 32B: 15
Command A: 14
Mistral Large 2407: 13
Qwen2.5 Coder 32B Instruct: 13
Qwen3 14B: 13

Speed

Output tokens per second · Higher is better · R1 Distill Llama 70B ranks #38 of 45

Qwen3 Max: 62
Qwen3 235B A22B Instruct 2507: 60
Qwen3 235B A22B: 60
GLM 4.6: 59
Qwen3 Max Thinking: 52
MiMo-V2.5-Pro: 52
R1 Distill Llama 70B: 49
GLM 4.5: 49
Qwen3.6 Max Preview: 45
MiniMax M2.7: 45
Qwen3 8B: 39
Phi 4: 33
Kimi K2 0905: 25

Price

USD per 1M output tokens · Lower is better · R1 Distill Llama 70B ranks #73 of 141

Qwen Plus 0728: $0.78
Qwen Plus 0728: $0.78
Qwen3 Next 80B A3B Thinking: $0.78
DeepSeek V3.1: $0.79
Qwen3 Coder Next: $0.80
DeepSeek V3: $0.80
R1 Distill Llama 70B: $0.80
Skyfall 36B V2: $0.80
Coder Large: $0.80
Trinity Large Thinking: $0.85
Llama 3.1 Euryale 70B v2.2: $0.85
GLM 4.5 Air: $0.85
DeepSeek V4 Pro: $0.87

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Long technical document analysis

The model processes and extracts insights from long technical texts that fit within its 128,000-token context window, keeping track of details across large inputs.

Complex code generation projects

It generates, debugs, and refines substantial codebases by combining code generation with step-by-step reasoning and instruction following.

Advanced mathematical problem solving

Users apply it to multi-step math challenges that require detailed explanations and long-context reasoning for accurate solutions.
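For the long-document use case, it helps to check up front whether a text is likely to fit in the 128,000-token window. The sketch below is a rough pre-check only: the ratio of about 4 characters per token is an assumed heuristic for English text, not the model's real tokenizer, and the names estimateTokens and fitsInContext, along with the 4,000-token output reserve, are illustrative.

```javascript
const CONTEXT_WINDOW_TOKENS = 128000; // context window stated in this listing
const ASSUMED_CHARS_PER_TOKEN = 4;    // assumption: rough average for English text

// Approximate token count from character length (not a real tokenizer).
function estimateTokens(text) {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// Check whether the prompt plus a reserved output budget fits in the window.
function fitsInContext(text, reservedOutputTokens = 4000) {
  const promptTokens = estimateTokens(text);
  return {
    promptTokens,
    fits: promptTokens + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS,
  };
}

console.log(fitsInContext("a".repeat(600000)).fits); // false
```

For borderline documents, an exact count from the model's own tokenizer is the safer check; the heuristic is only meant to catch inputs that are clearly too large.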

Strengths & limitations

Strengths

+ Efficient reasoning via distillation
+ Strong coding and math performance
+ Handles 128K-token contexts
+ Text-only optimization

Limitations

– Text modality only
– Distilled model may lag full-scale versions
– No built-in multimodal abilities

Cost calculator

Estimate what R1 Distill Llama 70B would cost for your usage.

Inputs: Input tokens / request · Output tokens / request · Requests / month

$0.00120 per request

$12 estimated / month

Based on R1 Distill Llama 70B's $0.80/1M input · $0.80/1M output. Estimate only — actual cost varies by provider and caching.
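The arithmetic behind the estimate can be sketched as follows. The per-million prices come from this listing; the usage figures (1,000 input tokens, 500 output tokens, 10,000 requests per month) are assumptions chosen because they reproduce the $0.00120 and $12 figures shown, since the calculator's actual inputs are not given. The function name estimateCost is illustrative.

```javascript
// Prices from this listing, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.80;
const OUTPUT_USD_PER_MILLION = 0.80;

// Cost per request is tokens times price per token; monthly cost scales by volume.
function estimateCost({ inputTokens, outputTokens, requestsPerMonth }) {
  const perRequest =
    (inputTokens * INPUT_USD_PER_MILLION + outputTokens * OUTPUT_USD_PER_MILLION) /
    1_000_000;
  return { perRequest, perMonth: perRequest * requestsPerMonth };
}

// Assumed usage, not taken from the listing.
const estimate = estimateCost({
  inputTokens: 1000,
  outputTokens: 500,
  requestsPerMonth: 10000,
});

console.log(estimate.perRequest.toFixed(5)); // 0.00120
console.log(estimate.perMonth.toFixed(2));   // 12.00
```

Because input and output are priced identically here, any split totalling 1,500 tokens per request gives the same result.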

Quick start

OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.

JavaScript · openai

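// Uses the openai npm package, pointed at OpenRouter instead of OpenAI.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY, // read the key from the environment
});

// Top-level await requires an ES module (for example, a .mjs file).
const completion = await client.chat.completions.create({
  model: "deepseek/deepseek-r1-distill-llama-70b",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);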

Model slug: deepseek/deepseek-r1-distill-llama-70b
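Because the API is OpenAI-compatible, the same call can also be made without an SDK. The following is a minimal sketch using the built-in fetch available in recent Node.js versions; the /chat/completions path follows the OpenAI-compatible convention, and the helper names buildChatRequest and ask are illustrative, not part of any library.

```javascript
const OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1";
const MODEL_SLUG = "deepseek/deepseek-r1-distill-llama-70b";

// Build the request separately so it can be inspected without a network call.
function buildChatRequest(apiKey, userMessage) {
  return {
    url: `${OPENROUTER_BASE_URL}/chat/completions`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: MODEL_SLUG,
        messages: [{ role: "user", content: userMessage }],
      }),
    },
  };
}

// Send the request and return the text of the first choice.
async function ask(apiKey, userMessage) {
  const { url, options } = buildChatRequest(apiKey, userMessage);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

A call such as ask(process.env.OPENROUTER_API_KEY, "Hello!") would then resolve to the model's reply, assuming a valid key is set.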

Editor's verdict

Our take on R1 Distill Llama 70B

R1 Distill Llama 70B is an open-weight language model from DeepSeek with a 128K-token context window.

On independent testing it scores 16 on the Artificial Analysis Intelligence Index and runs at roughly 49 tokens per second, with about 0.61s to first token.

At $0.80 per 1M output tokens, it ranks #73 of 141 models on output price.

Because the weights are open, you can self-host it or call it through a hosted API.

It is best suited to long technical document analysis, code generation, and multi-step math problems.

Frequently asked questions

What is the context length supported by R1 Distill Llama 70B?

The model handles a context window of 128,000 tokens.

Other R models

Sibling versions in the R family from DeepSeek.

R1 0528

DeepSeek · Language Models

Verified

DeepSeek's open LLM handles extensive text with a 163k-token context.

Open · II 27.1 · 164K ctx · $2.15/1M out

R1

DeepSeek · Language Models

Verified

DeepSeek R1 handles massive text contexts as an open LLM.

Open · II 27.1 · 164K ctx · $2.50/1M out

R1 Distill Qwen 32B

DeepSeek · Language Models

Verified

Distilled 32B reasoning model with extended context for efficient inference.

Open · II 17.2 · 128K ctx · $0.29/1M out

Similar models

Other language models worth comparing.

DeepSeek V4 Pro

DeepSeek · Language Models

Verified

Open-weight LLM built for million-token text contexts.

Open · II 51.5 · 1049K ctx · $0.87/1M out

DeepSeek V4 Flash

DeepSeek · Language Models

Verified

Open-weight LLM built for million-token text context handling.

Open · II 46.5 · 1049K ctx · $0.18/1M out

Qwen3 Coder Plus

Alibaba Qwen · Language Models

Verified

Open-weight coder built for million-token codebases and complex tasks.

Open · 1000K ctx · $3.25/1M out
