Cheapest AI Models
This ranked list covers the cheapest AI models, all priced at or below $0.10 per million output tokens. Factors to weigh include output price, context window size, parameter count, open-weight availability, and modality support, along with noted strengths such as efficiency or long-context handling. Rankings put the lowest output price first and reflect each model's documented capabilities and limitations.
LLM · $0.03/1M
It earns the top spot at $0.03 per 1M output tokens, a price shared by the next two entries, with a 131,072-token context window and open weights that support efficient inference on consumer hardware plus strong instruction adherence.
It takes second place at $0.03 per 1M output tokens with a 262,144-token context window, tied for the largest on this list, making it suitable for extended text tasks despite being proprietary.
It ranks third at $0.03 per 1M output tokens with a 131,072-token context window and a 12B open-weight design that delivers strong efficiency for its size along with solid multilingual performance.
LLM · $0.05/1M
It places fourth at $0.05 per 1M output tokens, with 8B parameters and proprietary tuning that enables coherent multi-turn responses and efficient inference on consumer hardware.
LLM · $0.06/1M
Open-weight LLM for creative text and conversation tasks.
It earns fifth position at $0.08 per 1M output tokens with an output speed of 163.08 tokens per second, a 32,768-token context window, and open weights that provide cost-effective performance on everyday language tasks.
It ranks sixth at $0.10 per 1M output tokens with a 262,144-token context window and a 235B open-weight MoE design that supports strong reasoning, coding, and multilingual capabilities.
Open-weight instruct LLM built for long-context multilingual tasks.
It places seventh at $0.10 per 1M output tokens with a 131,072-token context window, 4B open weights, and multimodal text-image support in a size compact enough for local deployment.
It ranks eighth at $0.10 per 1M output tokens with a 131,072-token context window, 3B open weights, and native text-image support in a compact size for efficient deployment.
It earns ninth place at $0.10 per 1M output tokens with a 131,072-token context window, 7B open weights, and strong coding and math performance for its size, along with efficient inference on consumer hardware.
It closes the list at $0.10 per 1M output tokens with a 131,072-token context window, an output speed of 118.95 tokens per second, and an 8B proprietary design tuned for enterprise use and strong long-context handling.
How we ranked this list
Models are ranked by lowest output price per million tokens. Data is pulled from live sources and refreshed continuously by Dhanasvi's autonomous agents, so this ranking stays current as new options launch and prices change.
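As a rough illustration of ranking by output price, the Python sketch below sorts a few placeholder entries by price per million output tokens and estimates what a given output volume would cost. The entry labels, the `Entry` class, and both helper functions are hypothetical and are not part of this site's pipeline. The prices are taken from the list above, and ties are left in input order because the list does not state a tie-break rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    label: str                  # hypothetical placeholder, not a real model name
    usd_per_million_out: float  # output price in USD per 1M tokens


def rank_by_output_price(entries):
    """Return entries sorted by ascending output price.

    sorted() is stable, so entries with equal prices keep their input order
    (an assumption; the list does not document how ties are broken).
    """
    return sorted(entries, key=lambda e: e.usd_per_million_out)


def output_cost_usd(entry, output_tokens):
    """Cost in USD of generating `output_tokens` tokens at the entry's price."""
    return entry.usd_per_million_out * output_tokens / 1_000_000


# Prices as they appear in the list; labels are placeholders.
entries = [
    Entry("entry-d", 0.10),
    Entry("entry-a", 0.03),
    Entry("entry-c", 0.05),
    Entry("entry-b", 0.03),
]

ranked = rank_by_output_price(entries)
for position, entry in enumerate(ranked, start=1):
    cost = output_cost_usd(entry, 5_000_000)
    print(f"{position}. {entry.label}: ${cost:.2f} per 5M output tokens")
```

At these prices the gap between the cheapest and most expensive entries is small in absolute terms for modest volumes, so context window size, open weights, or modality support may matter more than rank when entries are tied or close.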
Frequently asked questions
Llama 3.1 8B Instruct, Ling-2.6-flash, and Mistral Nemo are all listed at $0.03 per 1M output tokens.