What is the largest context window in the list?

Ling-2.6-flash, Qwen3 235B A22B Thinking 2507, and Qwen3 235B A22B Instruct 2507 share the largest context window at 262144 tokens.

Which models support multimodal inputs?

Gemma 3 4B and Ministral 3 3B 2512 are the open-weight models noted for native text and image support.

Cheapest AI Models

This ranked list spotlights the cheapest AI models, all priced at or below $0.10 per million output tokens. Key factors to weigh include output price, context window size, parameter count, open-weight availability, and modality support, alongside noted strengths such as efficiency or long-context handling. Models are ordered by output price, lowest first, and each entry reflects the capabilities and limitations documented for it.
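To make the per-million pricing concrete, here is a minimal sketch of how an output price translates into spend. The function name and the 50-million-token monthly workload are assumptions chosen for illustration; the prices are the ones listed for each model.

```python
def output_cost_usd(output_tokens: int, price_per_million_usd: float) -> float:
    """Cost in USD of generating `output_tokens` at a per-1M-token output price."""
    return output_tokens / 1_000_000 * price_per_million_usd


# Hypothetical workload: 50 million output tokens per month.
monthly_tokens = 50_000_000

# Prices taken from the list entries (USD per 1M output tokens).
examples = [
    ("Llama 3.1 8B Instruct", 0.03),
    ("Mistral Small 3", 0.08),
    ("Granite 4.1 8B", 0.10),
]

for name, price in examples:
    print(f"{name}: ${output_cost_usd(monthly_tokens, price):.2f} per month")
```

This covers output tokens only; input-token pricing is not part of this list and would need to be added for a full estimate.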

Ling-2.6-flash

LLM · $0.03/1M

At $0.03 per 1M output tokens, it ties for the lowest price on the list, and its context window of 262144 tokens is tied for the largest, making it suitable for extended text tasks despite being proprietary.

Intelligence: 26.2 · Output price: $0.03/1M · Context: 262K · Type: Proprietary

Mistral Nemo

LLM · $0.03/1M

It ties for the lowest output price at $0.03 per 1M tokens and pairs a 131072-token context with a 12B open-weight design that delivers strong efficiency for its size along with solid multilingual performance.

Output price: $0.03/1M · Context: 131K · Params: 12B · Type: Open-weight

Llama 3.1 8B Instruct

LLM · $0.03/1M

It shares the lowest output price of $0.03 per 1M tokens and offers a 131072-token context and open weights that support efficient inference on consumer hardware plus strong instruction adherence.

Output price: $0.03/1M · Context: 131K · Params: 8B · Type: Open-weight

Llama 3 8B Lunaris

LLM · $0.05/1M

It places fourth at $0.05 per 1M output tokens, with 8B parameters and proprietary tuning that enable coherent multi-turn responses and efficient inference on consumer hardware.

Output price: $0.05/1M · Context: 8K · Params: 8B · Type: Proprietary

MythoMax 13B

LLM · $0.06/1M

Open-weight LLM for creative text and conversation tasks.

Output price: $0.06/1M · Context: 4K · Params: 13B · Type: Open-weight

Mistral Small 3

LLM · $0.08/1M

It takes sixth position at $0.08 per 1M output tokens, with an output speed of 163.08 t/s, a 32768-token context, and open weights that provide cost-effective performance on everyday language tasks.

Intelligence: 12.7 · Output speed: 163 t/s · Output price: $0.08/1M · Context: 33K

Qwen3 235B A22B Thinking 2507

LLM · $0.10/1M

It ranks seventh at $0.10 per 1M output tokens, with a 262144-token context and a 235B open-weight MoE design that supports strong reasoning, coding, and multilingual capabilities.

Output price: $0.10/1M · Context: 262K · Params: 235B · Type: Open-weight

Qwen3 235B A22B Instruct 2507

LLM · $0.10/1M

Open-weight instruct LLM built for long-context multilingual tasks.

Intelligence: 25 · Output speed: 60 t/s · Output price: $0.10/1M · Context: 262K

Gemma 3 4B

Multimodal · $0.10/1M

It places ninth at $0.10 per 1M output tokens, with a 131072-token context, 4B open weights, and multimodal text-image support in a size compact enough for local deployment.

Intelligence: 6.3 · Output price: $0.10/1M · Context: 131K · Params: 4B

Ministral 3 3B 2512

Multimodal · $0.10/1M

It ranks tenth at $0.10 per 1M output tokens, with a 131072-token context, 3B open weights, and native text-image support in a compact size for efficient deployment.

Output price: $0.10/1M · Context: 131K · Params: 3B · Type: Open-weight

Qwen2.5 7B Instruct

LLM · $0.10/1M

It takes eleventh place at $0.10 per 1M output tokens, with a 131072-token context, 7B open weights, and strong coding and math performance for its size, along with efficient inference on consumer hardware.

Output price: $0.10/1M · Context: 131K · Params: 7B · Type: Open-weight

Granite 4.1 8B

LLM · $0.10/1M

It closes the list at $0.10 per 1M output tokens, with a 131072-token context, an output speed of 118.95 t/s, and an 8B proprietary model tuned for enterprise use and strong long-context handling.

Intelligence: 12.4 · Output speed: 119 t/s · Output price: $0.10/1M · Context: 131K

How we ranked this list

Ranked by lowest output price per million tokens. Data is pulled from live sources and refreshed continuously by Dhanasvi's autonomous agents, so this ranking stays current as new options launch and prices change.
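As a rough illustration of the ranking rule, the sketch below sorts the listed models by output price. The data layout and function name are assumptions made for this example, and how ties are actually broken is not stated; here, models with equal prices simply keep their input order because Python's sort is stable.

```python
# Output prices in USD per 1M output tokens, copied from the entries above.
# The list-of-tuples layout is an assumption for this sketch.
MODELS = [
    ("Ling-2.6-flash", 0.03),
    ("Mistral Nemo", 0.03),
    ("Llama 3.1 8B Instruct", 0.03),
    ("Llama 3 8B Lunaris", 0.05),
    ("MythoMax 13B", 0.06),
    ("Mistral Small 3", 0.08),
    ("Qwen3 235B A22B Thinking 2507", 0.10),
    ("Qwen3 235B A22B Instruct 2507", 0.10),
    ("Gemma 3 4B", 0.10),
    ("Ministral 3 3B 2512", 0.10),
    ("Qwen2.5 7B Instruct", 0.10),
    ("Granite 4.1 8B", 0.10),
]


def rank_by_output_price(models):
    """Return (position, name, price) tuples ordered by ascending output price.

    sorted() is stable, so models with equal prices keep their input order.
    """
    ordered = sorted(models, key=lambda entry: entry[1])
    return [
        (position, name, price)
        for position, (name, price) in enumerate(ordered, start=1)
    ]


for position, name, price in rank_by_output_price(MODELS):
    print(f"{position:>2}. {name}: ${price:.2f}/1M")
```

A secondary sort key, such as context window size, could be added to the `key` function if a deterministic tie-break were needed.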

Frequently asked questions

Which models share the lowest output price?

Llama 3.1 8B Instruct, Ling-2.6-flash, and Mistral Nemo are all listed at $0.03 per 1M output tokens.