Does Hermes 4 405B support function calling?

Yes, it includes built-in capabilities for tool use and function calling.

Where can I access Hermes 4 405B?

It is offered by Nous Research through their model release channels and compatible inference platforms.

What is the pricing for Hermes 4 405B?

Current pricing details are available directly from Nous Research or their hosting partners.

Is Hermes 4 405B suitable for code generation?

Yes, its capabilities explicitly include code generation and multi-step planning for programming tasks.

Hermes 4 405B

Verified

Open-weight LLM with 131k context for complex text tasks.

Nous ResearchLanguage ModelsOpen

Model page

Updated 2026-06-15

About Hermes 4 405B

The model follows a transformer-based design optimized for extended sequences. Nous Research released it as open weights to support broad community access and modification. This approach prioritizes flexibility for developers working with substantial text inputs.

Its primary strengths lie in managing lengthy documents and multi-turn conversations without truncation. The open-weight nature allows independent evaluation and adaptation across hardware setups. Text modality keeps the focus on pure language modeling performance.

Researchers commonly use it for experiments in summarization, reasoning, and retrieval-augmented generation. Fine-tuning workflows benefit from the large context when adapting to specialized domains. Production deployments often target on-premise environments where control over model weights is essential.

Capabilities

Long-context reasoning

Advanced instruction following

Tool use and function calling

Code generation

Multi-step planning

Role-playing and creative writing

How Hermes 4 405B compares

Hermes 4 405B (striped bar) vs other language models on intelligence, speed and price.

Price

USD per 1M output tokens · Lower is better · Hermes 4 405B ranks #66 of 78

$2.2

MiniMax M1

$2.2

GLM 4.5

$2.5

Nemotron 3 Ultra

$2.5

Kimi K2 Thinking

$2.5

Kimi K2 0905

$3.0

Relace Search

$3.0

Hermes 4 405B

$3.1

GLM 5.1

$3.3

Qwen3 Coder Plus

$3.8

Qwen3.7 Max

$3.9

Qwen3 Max Thinking

$3.9

Qwen3 Max

$4.0

GLM 5 Turbo

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Long Document Analysis

Processes and reasons over extensive inputs up to 131072 tokens, making it suitable for summarizing research papers or legal contracts with sustained accuracy.

Agentic Tool Integration

Leverages tool use and function calling alongside multi-step planning to build reliable autonomous workflows for data retrieval or automation tasks.

Creative Narrative Development

Supports role-playing and creative writing with advanced instruction following, enabling consistent character development across multi-turn storytelling sessions.

Strengths & limitations

Strengths

+Strong adherence to complex instructions
+Effective handling of 128k context
+Good at agentic and tool-augmented workflows
+Open weights enabling customization

Limitations

–Text-only modality
–High inference compute requirements
–May exhibit typical LLM hallucinations

Cost calculator

Estimate what Hermes 4 405B would cost for your usage.

Input tokens / requestOutput tokens / requestRequests / month

$0.00250

per request

$25

estimated / month

Based on Hermes 4 405B's $1.00/1M input · $3.00/1M output. Estimate only — actual cost varies by provider and caching.

Download & self-host Hermes 4 405B

This is an open-weight model. Download the weights from Hugging Face or load it directly with Transformers.

406B

Parameters (safetensors)

245

Monthly downloads

Hugging Face likes

Download · transformers

# Install the Hugging Face CLI
pip install -U "huggingface_hub[cli]"

# Download the model weights
hf download NousResearch/Hermes-4-405B

# Or load it directly in Python
from transformers import AutoModelForCausalLM, AutoTokenizer
tok = AutoTokenizer.from_pretrained("NousResearch/Hermes-4-405B")
model = AutoModelForCausalLM.from_pretrained("NousResearch/Hermes-4-405B", device_map="auto")

View NousResearch/Hermes-4-405B on Hugging Face

Quick start

OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.

JavaScript · openai

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "nousresearch/hermes-4-405b",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);

Model slug: nousresearch/hermes-4-405b

Editor's verdict

Our take on Hermes 4 405B

Hermes 4 405B is Nous Research's open-weight language models with a 131K-token context window.

At $3.00 per 1M output tokens, it is mid-priced for its class.

As an open-weight model you can self-host it (406B parameters) or call it through a hosted API.

Best suited to strong adherence to complex instructions and effective handling of 128k context.

Did you find this helpful?

Frequently asked questions

The model supports a context window of 131072 tokens.

User reviews

Real, verified reviews from the community shape this model's rating.

Loading reviews…

Other Hermes models

Sibling versions in the Hermes family from Nous Research.

Hermes 4 70B

Nous Research · Language Models

Verified

Open-weight 70B LLM built for long-context text tasks.

Open131K ctx$0.40/1M out

Promote Hermes 4 405B

Add this badge to your website, or share the tool.

DFeatured on DhanasviHermes 4 405B 1

Hermes 4 405B

About Hermes 4 405B

Capabilities