Fastest AI Models
This ranked list presents the fastest AI models by measured output speed in tokens per second (t/s). The models also differ in intelligence index, context window size, price per million tokens, and supported modalities. When choosing one, weigh raw speed against context capacity and against whether you need text-only or multimodal input.
Mercury 2 earns the top position through its leading output speed of 839.31 t/s while supporting a 128000-token context window for text workflows.
Step 3.7 Flash ranks second with an output speed of 380.3 t/s, a 256000-token context, and native multimodal support for text, image, and video.
gpt-oss-120b places third due to its 344.97 t/s output speed and 131072-token context handling for long-form text at a low price of $0.18 per million tokens.
Gemini 3.1 Flash Lite Preview earns its ranking with a 310.24 t/s output speed, matching Gemini 3.1 Flash Lite, and a 1048576-token context with broad native multimodal capabilities.
Gemini 3.1 Flash Lite secures its spot with 310.24 t/s speed, a 1048576-token context, and efficient multimodal support for text, image, and video.
Gemini 2.5 Flash Lite ranks sixth with a 276.7 t/s output speed, a 1048576-token context, and multimodal handling of text, image, audio, and video at $0.40 per million tokens.
MiniMax M2.5 earns its position with 234.2 t/s speed and a 204800-token context window suited for extended text processing.
MiniMax M2.1 places eighth due to its 233.39 t/s output speed and 204800-token context for long text sequences.
o3 Mini secures its spot with 230.92 t/s speed, a 200000-token context, and efficient reasoning for text and file tasks.
o3 Mini High ranks tenth through its 226.72 t/s output speed and 200000-token context with strong STEM performance for text and file reasoning.
OpenAI's gpt-oss-20b handles long-context text tasks with precision.
Multimodal coding model with 400k-token context from OpenAI.
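The trade-off described in the introduction (speed versus context capacity versus modality) can be sketched as a simple filter. This is a minimal illustration, not how the list itself is built: the `Model` class and `fastest_matching` function are hypothetical names, the figures are copied from four entries above, and the text-only modality sets for Mercury 2 and gpt-oss-120b are assumptions based on their descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    speed_tps: float        # measured output speed, tokens per second
    context_tokens: int     # context window size
    modalities: frozenset   # supported input types


# Figures taken from the entries above; text-only sets are assumed.
MODELS = [
    Model("Mercury 2", 839.31, 128000, frozenset({"text"})),
    Model("Step 3.7 Flash", 380.3, 256000,
          frozenset({"text", "image", "video"})),
    Model("gpt-oss-120b", 344.97, 131072, frozenset({"text"})),
    Model("Gemini 2.5 Flash Lite", 276.7, 1048576,
          frozenset({"text", "image", "audio", "video"})),
]


def fastest_matching(models, min_context, required_modalities):
    """Return the fastest model meeting both constraints, or None."""
    candidates = [
        m for m in models
        if m.context_tokens >= min_context
        and set(required_modalities) <= m.modalities
    ]
    return max(candidates, key=lambda m: m.speed_tps, default=None)


# Text-only, no context requirement: raw speed decides.
print(fastest_matching(MODELS, 0, {"text"}).name)
# Image input with at least 200000 tokens of context.
print(fastest_matching(MODELS, 200000, {"text", "image"}).name)
```

With these four entries, a text-only request with no context requirement selects Mercury 2, while requiring image input and a 200000-token context rules it out and leaves Step 3.7 Flash as the fastest candidate.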
How we ranked this list
Models are ranked by measured output speed in tokens per second (t/s), fastest first. Data is pulled from live sources and refreshed continuously by Dhanasvi's autonomous agents, so the ranking stays current as new options launch and prices change.
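As a rough sketch of that ordering rule, the snippet below sorts a few entries by output speed in descending order. The tuple layout and the `rank_by_speed` name are assumptions for illustration; the speeds are the ones quoted in the list, and the actual data pipeline is not described in this page.

```python
# (name, output speed in t/s), deliberately out of order.
models = [
    ("gpt-oss-120b", 344.97),
    ("Mercury 2", 839.31),
    ("Gemini 2.5 Flash Lite", 276.7),
    ("Step 3.7 Flash", 380.3),
]


def rank_by_speed(entries):
    """Return entries sorted by output speed, fastest first."""
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


ranked = rank_by_speed(models)
for position, (name, speed) in enumerate(ranked, start=1):
    print(f"{position}. {name}: {speed} t/s")
```

Because Python's `sorted` is stable, models with identical speeds (such as the two Gemini 3.1 Flash Lite entries at 310.24 t/s) would keep their input order under this sketch.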
Frequently asked questions
Mercury 2 offers the highest output speed at 839.31 t/s.