Nemotron 3 Super
NVIDIA's closed LLM for million-token text processing.
About Nemotron 3 Super
NVIDIA engineered Nemotron 3 Super as a proprietary LLM with an expansive one-million-token context limit. This design enables the model to ingest and reason over entire books, codebases, or multi-hour transcripts in a single pass while preserving factual consistency.
Because the weights remain closed, access is through hosted APIs: NVIDIA's own API and third-party providers reachable via aggregators such as OpenRouter. The text-only modality focuses computational resources on language understanding and generation without multimodal overhead.
Organizations typically apply the model to legal discovery, technical research synthesis, and enterprise knowledge retrieval. Its scale suits scenarios where retaining full context across hundreds of thousands of tokens improves answer accuracy and reduces fragmentation.
How Nemotron 3 Super compares
Nemotron 3 Super compared with other language models on intelligence, speed and price.
Price: USD per 1M output tokens (lower is better). Nemotron 3 Super ranks #25 of 72.
Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).
Best for
Long-Form Document Analysis
Processes and reasons over entire books, legal contracts, or research papers in a single pass thanks to its 1M-token context window.
Enterprise Codebase Understanding
Navigates and explains large multi-file code repositories while retaining full project context for refactoring or security reviews.
Extended Multi-Turn Research
Maintains coherent dialogue across dozens of iterative queries when exploring complex technical or scientific topics.
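In practice, multi-turn use means resending the accumulated message list with every request, so the history has to stay within the context window of whichever provider serves the call. The following is a minimal sketch of that bookkeeping, not part of any SDK. The helper names `estimateTokens` and `trimHistory` are hypothetical, and the four-characters-per-token figure is a rough assumption rather than the model's real tokenizer.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Assumption: roughly 4 characters per token. This is a crude heuristic,
// not the model's actual tokenizer; use it only for rough budgeting.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Drop the oldest non-system messages until the estimated total fits
// within maxTokens. System messages are always kept.
function trimHistory(history: ChatMessage[], maxTokens: number): ChatMessage[] {
  const kept = [...history];
  const total = (): number =>
    kept.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  while (total() > maxTokens) {
    const idx = kept.findIndex((m) => m.role !== "system");
    if (idx === -1) break; // only system messages remain
    kept.splice(idx, 1);
  }
  return kept;
}
```

The trimmed array can be passed as `messages` in a chat completion request. A production version would likely count tokens with a real tokenizer and summarize old turns rather than discard them.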
Strengths & limitations
Strengths
- Handles up to 1M-token contexts
- NVIDIA-optimized inference efficiency
- Strong performance on technical domains
- Suitable for enterprise-scale text tasks
Limitations
- Text-only modality
- No native multimodal support
- Large context increases compute cost
Pricing by provider
Live per-provider pricing & uptime, routed via OpenRouter. Prices are USD per 1M tokens.
| Provider | Input /1M | Output /1M | Context | Uptime |
|---|---|---|---|---|
| DekaLLM (fp8) | $0.09 | $0.45 | 262K | 98.2% |
| DeepInfra (bf16) | $0.10 | $0.50 | 262K | 96.7% |
| DigitalOcean | $0.30 | $0.65 | 1000K | — |
| Nebius (fp4) | $0.30 | $0.90 | 262K | — |
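Because the listed context window differs by provider, the cheapest row is not always usable for a long prompt. The sketch below picks the lowest-cost provider from the table above that can hold a given request. The `ProviderOffer` type and `cheapestOffer` function are hypothetical names, and reading "262K" and "1000K" as 262,000 and 1,000,000 tokens is an assumption about how the listing rounds.

```typescript
interface ProviderOffer {
  name: string;
  inputUsdPerM: number; // USD per 1M input tokens
  outputUsdPerM: number; // USD per 1M output tokens
  contextTokens: number; // advertised context window (assumed rounding)
}

// Values copied from the pricing table; prices change, so treat them as a snapshot.
const offers: ProviderOffer[] = [
  { name: "DekaLLM", inputUsdPerM: 0.09, outputUsdPerM: 0.45, contextTokens: 262_000 },
  { name: "DeepInfra", inputUsdPerM: 0.1, outputUsdPerM: 0.5, contextTokens: 262_000 },
  { name: "DigitalOcean", inputUsdPerM: 0.3, outputUsdPerM: 0.65, contextTokens: 1_000_000 },
  { name: "Nebius", inputUsdPerM: 0.3, outputUsdPerM: 0.9, contextTokens: 262_000 },
];

// Return the cheapest offer whose context window fits prompt + output,
// or undefined when no listed provider can hold the request.
function cheapestOffer(
  promptTokens: number,
  outputTokens: number,
): ProviderOffer | undefined {
  const needed = promptTokens + outputTokens;
  const cost = (o: ProviderOffer): number =>
    (promptTokens * o.inputUsdPerM + outputTokens * o.outputUsdPerM) / 1_000_000;
  return offers
    .filter((o) => o.contextTokens >= needed)
    .sort((a, b) => cost(a) - cost(b))[0];
}
```

For a 100,000-token prompt this selects DekaLLM, while a 500,000-token prompt leaves DigitalOcean as the only listed option. The sketch ignores uptime and quantization, which may matter as much as price.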
Cost calculator
Estimate what Nemotron 3 Super would cost for your usage.
Based on the lowest listed rates for Nemotron 3 Super (DekaLLM, fp8): $0.09/1M input · $0.45/1M output. Estimate only; actual cost varies by provider and caching.
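The estimate is simple per-token arithmetic and can be reproduced offline. A minimal sketch, assuming the $0.09 and $0.45 rates quoted above as defaults and ignoring caching discounts; `estimateCostUsd` is a hypothetical helper name.

```typescript
// Estimate request cost in USD from token counts and per-1M-token rates.
// Defaults are the rates quoted on this page; pass other rates for other providers.
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  inputUsdPerM: number = 0.09,
  outputUsdPerM: number = 0.45,
): number {
  return (
    (inputTokens / 1_000_000) * inputUsdPerM +
    (outputTokens / 1_000_000) * outputUsdPerM
  );
}

// Example: a 200,000-token document with a 2,000-token answer.
const exampleCost = estimateCostUsd(200_000, 2_000);
console.log(exampleCost.toFixed(4)); // prints "0.0189"
```

At these rates the input side dominates long-context workloads: the 200,000 input tokens account for $0.018 of the roughly $0.0189 total.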
Quick start
OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.
import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Top-level await requires an ES module context.
const completion = await client.chat.completions.create({
  model: "nvidia/nemotron-3-super-120b-a12b",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);

Model slug: nvidia/nemotron-3-super-120b-a12b
Editor's verdict
Nemotron 3 Super is NVIDIA's proprietary language model with a 1M-token context window.
At $0.45 per 1M output tokens from the cheapest listed provider, it is cost-efficient for its class and is served by 4 providers.
It is available through NVIDIA's API and aggregators like OpenRouter.
It is best suited to workloads that need contexts of up to 1M tokens and NVIDIA-optimized inference efficiency.
Frequently asked questions
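What is the context window of Nemotron 3 Super?
The model supports a context window of 1,000,000 tokens, although three of the four providers listed above serve it with a 262K window.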
Other Nemotron models
Sibling versions in the Nemotron family from NVIDIA.