Llama 3.3 Euryale 70B
Extended-context LLM for detailed text generation tasks.
About Llama 3.3 Euryale 70B
Built on the Llama 3.3 architecture, this 70B-parameter model processes up to 131k tokens of text input and output. Its design emphasizes coherent handling of lengthy documents and multi-turn conversations without requiring local hardware.
Because the weights remain closed, users access the model through hosted inference endpoints. This setup suits applications that need reliable, high-volume text processing while avoiding the overhead of self-hosting large models.
Typical usage includes summarization, drafting, and analysis of long documents where maintaining context across many tokens is essential. The model delivers consistent performance for professional and creative writing workflows that rely on extended context.
How Llama 3.3 Euryale 70B compares
Llama 3.3 Euryale 70B vs other language models on intelligence, speed and price.
Price: USD per 1M output tokens (lower is better). Llama 3.3 Euryale 70B ranks #63 of 141.
Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).
Best for
Long-Form Narrative Projects
The 131072-token context supports sustained reasoning across extended documents, enabling coherent development of multi-chapter stories without losing track of earlier plot points.
Character-Driven Role-Play Sessions
Strong performance in role-playing and character simulation combined with uncensored responses allows users to maintain consistent personas over long interactive exchanges.
Creative Storytelling Assistance
The model follows detailed instructions while generating original narratives, making it effective for authors seeking help with plot outlines, dialogue, and world-building.
Strengths & limitations
Strengths
- High-quality creative writing and roleplay
- Strong coherence across long contexts
- Flexible and expressive output style
- Good at maintaining character consistency
Limitations
- Text-only modality
- May favor creative flair over strict factual accuracy
- As a fine-tune, it can behave less predictably on non-roleplay tasks
Cost calculator
Estimate what Llama 3.3 Euryale 70B would cost for your usage.
Based on Llama 3.3 Euryale 70B's listed rates of $0.65 per 1M input tokens and $0.75 per 1M output tokens. This is an estimate only; actual cost varies by provider and caching.
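The estimate comes down to multiplying token counts by the per-million rates. Below is a minimal sketch, assuming the rates listed on this page and ignoring provider differences and caching. The helper name `estimateCostUsd` and the sample token volumes are illustrative only and are not part of any OpenRouter API.

```javascript
// Rates listed on this page, in USD per 1M tokens (may differ by provider).
const INPUT_USD_PER_MILLION = 0.65;
const OUTPUT_USD_PER_MILLION = 0.75;

// Hypothetical helper: estimates spend in USD for a given token volume.
function estimateCostUsd(inputTokens, outputTokens) {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Example volume (an assumption): 2M input tokens and 1M output tokens.
const sampleEstimate = estimateCostUsd(2_000_000, 1_000_000);
console.log(sampleEstimate.toFixed(2)); // prints "2.05"
```

Because input and output are priced separately, workloads that send long documents but request short summaries are dominated by the input rate.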
Quick start
OpenRouter's API is OpenAI-compatible, so most SDKs work by swapping the base URL. Only the model slug changes between models.
import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Top-level await requires an ES module context.
const completion = await client.chat.completions.create({
  model: "sao10k/l3.3-euryale-70b",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);

Model slug: sao10k/l3.3-euryale-70b
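For the role-play and multi-turn uses described on this page, a request typically carries a system message for the persona plus the running conversation history. The following is a minimal sketch, assuming the OpenAI-style chat message format that the quick start already relies on. The helper `buildChatRequest`, the persona text, and the sample turns are hypothetical; only the model slug comes from this page.

```javascript
const MODEL_SLUG = "sao10k/l3.3-euryale-70b"; // slug from this page

// Hypothetical helper: assembles a chat completion payload from a persona,
// the prior turns, and the next user message. It does not send anything.
function buildChatRequest(persona, priorTurns, userMessage) {
  return {
    model: MODEL_SLUG,
    messages: [
      { role: "system", content: persona },
      ...priorTurns,
      { role: "user", content: userMessage },
    ],
  };
}

// Illustrative conversation state (made up for this example).
const priorTurns = [
  { role: "user", content: "Who are you?" },
  { role: "assistant", content: "Captain Mara of the Driftwood." },
];

const chatRequest = buildChatRequest(
  "You are Captain Mara, a wry ship captain.",
  priorTurns,
  "Where are we headed?"
);
console.log(chatRequest.messages.length); // prints 4
```

The resulting object has the same shape as the argument passed to `client.chat.completions.create(...)` in the quick start, so it could be handed to that call directly. Appending each reply to `priorTurns` keeps the persona and earlier exchanges in view on the next turn.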
Editor's verdict
Llama 3.3 Euryale 70B is a proprietary language model from Sao10k with a 131K-token context window.
At $0.75 per 1M output tokens, it ranks #63 of 141 language models on output price.
It is available through Sao10k's API and aggregators like OpenRouter.
It is best suited to creative writing and roleplay that depend on strong coherence across long contexts.
Frequently asked questions
The model provides a context window of 131072 tokens.
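A window of this size is easiest to treat as a budget. The sketch below assumes, as is typical for chat models, that prompt and completion tokens share the same window. The token count is a rough heuristic of about four characters per token and is not the model's real tokenizer, and the helper names are hypothetical.

```javascript
const CONTEXT_WINDOW_TOKENS = 131072; // figure stated on this page

// Rough heuristic, NOT the model's tokenizer: about 4 characters per token.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Hypothetical helper: checks whether a prompt plus reserved output
// tokens fits inside the context window.
function fitsInContext(promptText, maxOutputTokens) {
  const promptTokens = estimateTokens(promptText);
  const remaining = CONTEXT_WINDOW_TOKENS - promptTokens - maxOutputTokens;
  return { promptTokens, remaining, fits: remaining >= 0 };
}

// Stand-in for a long draft: 400,000 characters, about 100,000 tokens.
const manuscript = "x".repeat(400_000);
const budget = fitsInContext(manuscript, 4096);
console.log(budget.fits, budget.remaining); // prints: true 26976
```

For real workloads, an exact count from the provider's usage data is more reliable than a character-based guess, so it helps to leave some headroom.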