GPT-4o (2024-08-06)

Verified

Multimodal model optimized for integrated text, image, and file tasks.

OpenAI · Multimodal · Closed · Intelligence Index 14.5
Vision
Model page
Updated 2026-06-15

About GPT-4o (2024-08-06)

Built as a proprietary system, GPT-4o combines vision and language processing in a single architecture. It supports file uploads alongside images and text for coherent multi-turn interactions. The design prioritizes low-latency responses while maintaining a large context capacity.

Key strengths include accurate visual analysis paired with textual reasoning and document handling. It performs well on tasks that require cross-referencing images with long-form content. Common uses range from API-driven applications to chat interfaces for research, creative work, and data extraction.

Capabilities

Multimodal text and image understanding
Long-context reasoning
Code generation and analysis
File and document interpretation
Vision-based reasoning
Natural language generation

Benchmarks & performance

Independent evaluation scores and measured speed.

Intelligence Index: 14.5
Coding Index: 24.2
Tokens / sec: 102
Time to first token: 0.89s

Source: Artificial Analysis
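
The two speed figures can be combined into a rough end-to-end latency estimate. The sketch below assumes a simple model (time to first token plus output tokens divided by throughput); `estimateResponseSeconds` is a hypothetical helper, and real latency will also depend on prompt size, provider load, and network conditions.

```javascript
// Rough latency model: time to first token plus streaming time.
// Defaults are the measured figures quoted above (102 tokens/sec, 0.89s).
function estimateResponseSeconds(outputTokens, tokensPerSec = 102, ttftSeconds = 0.89) {
  if (outputTokens < 0 || tokensPerSec <= 0) {
    throw new RangeError("outputTokens must be >= 0 and tokensPerSec > 0");
  }
  return ttftSeconds + outputTokens / tokensPerSec;
}

// Example: a 500-token reply takes about 0.89 + 500 / 102 seconds.
console.log(estimateResponseSeconds(500).toFixed(2)); // "5.79"
```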

How GPT-4o (2024-08-06) compares

GPT-4o (2024-08-06) compared with other multimodal models on intelligence, speed, and price.

Intelligence

Artificial Analysis Intelligence Index · Higher is better · GPT-4o (2024-08-06) ranks #76 of 88

Qwen3 VL 30B A3B Instruct: 16
Sonar: 16
Sonar Pro: 15
GLM 4.5V: 15
GPT-4o: 15
GPT-4o: 15
GPT-4o: 15
GPT-4o: 15
Qwen3 VL 8B Instruct: 14
GPT-4 Turbo: 14
Llama 4 Scout: 14
GPT-4.1 Nano: 13
GPT-4o-mini: 13

Speed

Output tokens per second · Higher is better · GPT-4o (2024-08-06) ranks #39 of 76

GPT-5.1: 116
Qwen3 VL 30B A3B Instruct: 112
o1: 105
Llama 4 Scout: 102
GPT-4o: 102
GPT-4o: 102
GPT-4o: 102
GPT-4o: 102
GPT-5.3-Codex: 101
GPT-4.1 Mini: 98
GPT-5 Mini: 97
Llama 4 Maverick: 96
Qwen3.5-27B: 92

Price

USD per 1M output tokens · Lower is better · GPT-4o (2024-08-06) ranks #112 of 155

Gemini 2.5 Pro Preview 06-05: $10.0
GPT-5.1: $10.0
GPT-5: $10.0
GPT-5.1-Codex-Max: $10.0
GPT-4o: $10.0
GPT-4o: $10.0
GPT-4o: $10.0
GPT-5 Chat: $10.0
GPT-5.1 Chat: $10.0
Gemini 3.1 Pro Preview: $12.0
Gemini 3.1 Pro Preview Custom Tools: $12.0
Google Gemini Pro Latest: $12.0
Nova Premier 1.0: $12.5

Sources: Artificial Analysis (intelligence, speed) · OpenRouter (price).

Best for

Multimodal Image and Text Analysis

The model excels at vision-based reasoning tasks that combine image inputs with textual queries, such as extracting insights from charts, diagrams, or photographs alongside explanatory text.
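
As an illustration of pairing an image with a text query, the sketch below builds a request body in the OpenAI-compatible Chat Completions format, where a user message carries both a `text` part and an `image_url` part. `buildImageQuestion` is a hypothetical helper and the image URL is a placeholder, so treat this as a starting point, not a definitive integration.

```javascript
// Build a Chat Completions request body that pairs a question with an image.
// `buildImageQuestion` is a hypothetical helper, not part of any SDK.
function buildImageQuestion(question, imageUrl) {
  return {
    model: "openai/gpt-4o-2024-08-06",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: question },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}

// Example with a placeholder URL; the resulting object can be passed to
// client.chat.completions.create(...) in an OpenAI-compatible SDK.
const body = buildImageQuestion(
  "What trend does this chart show?",
  "https://example.com/chart.png"
);
console.log(JSON.stringify(body, null, 2));
```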

Long-Context Document Processing

It handles documents and conversations up to its 128,000-token context window, enabling coherent summarization, analysis, and question-answering across large files or multi-turn interactions.
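
Before sending a large document, it can help to check whether it is likely to fit. The sketch below uses the 128,000-token context window stated on this page and assumes roughly 4 characters per token, which is only a coarse heuristic for English text; the helper names and the 4,000-token output reserve are illustrative assumptions, and a real tokenizer would give exact counts.

```javascript
const CONTEXT_WINDOW_TOKENS = 128000; // context window stated on this page
const CHARS_PER_TOKEN = 4; // assumption: rough average for English text

// Approximate the token count of a string from its length.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Check whether the text plus a reserve for the reply fits the window.
function fitsInContext(text, reservedForOutput = 4000) {
  return estimateTokens(text) + reservedForOutput <= CONTEXT_WINDOW_TOKENS;
}

console.log(fitsInContext("a".repeat(400000))); // true  (100,000 + 4,000 tokens)
console.log(fitsInContext("a".repeat(600000))); // false (150,000 + 4,000 tokens)
```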

Code Generation and Technical Workflows

Strong performance in code generation, debugging, and analysis makes it suitable for software development tasks that also require interpreting related documentation or visual mockups.

Strengths & limitations

Strengths

  • Strong cross-modal integration
  • Versatile across creative and analytical tasks
  • Handles complex multi-step instructions well

Limitations

  • No native audio or video processing
  • Knowledge is limited to its training cutoff date
  • Can still produce hallucinations on edge cases

Cost calculator

Estimate what GPT-4o (2024-08-06) would cost for your usage.

$0.00750 per request
$75 estimated / month

Based on GPT-4o (2024-08-06)'s $2.50/1M input · $10.00/1M output. Estimate only — actual cost varies by provider and caching.
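
The arithmetic behind an estimate like this can be sketched as follows. The page does not show the calculator's inputs, so the workload here (1,000 input tokens, 500 output tokens, 10,000 requests per month) is an assumption chosen because it reproduces the figures above; the function names are hypothetical, and caching or provider markups are ignored.

```javascript
const INPUT_USD_PER_MILLION = 2.5; // $2.50 per 1M input tokens
const OUTPUT_USD_PER_MILLION = 10.0; // $10.00 per 1M output tokens

// Cost of one request in USD, given its input and output token counts.
function costPerRequest(inputTokens, outputTokens) {
  return (
    (inputTokens * INPUT_USD_PER_MILLION + outputTokens * OUTPUT_USD_PER_MILLION) / 1e6
  );
}

// Monthly cost in USD for a fixed request size and volume.
function monthlyCost(inputTokens, outputTokens, requestsPerMonth) {
  return costPerRequest(inputTokens, outputTokens) * requestsPerMonth;
}

// Assumed workload: 1,000 input tokens, 500 output tokens, 10,000 requests/month.
console.log(costPerRequest(1000, 500).toFixed(5)); // "0.00750"
console.log(monthlyCost(1000, 500, 10000).toFixed(2)); // "75.00"
```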

Quick start

OpenRouter's API is OpenAI-compatible — most SDKs work by just swapping the base URL. Only the model slug changes between models.

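JavaScript · openai
// Requires the `openai` npm package and an ES module context (top-level await).
import OpenAI from "openai";

// Point the OpenAI SDK at OpenRouter by overriding the base URL.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Send a single-turn chat request using this model's slug.
const completion = await client.chat.completions.create({
  model: "openai/gpt-4o-2024-08-06",
  messages: [{ role: "user", content: "Hello!" }],
});

// Print the assistant's reply text.
console.log(completion.choices[0].message.content);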

Model slug: openai/gpt-4o-2024-08-06
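
For environments without an SDK, the same request can be sketched with plain `fetch`. This assumes the usual OpenAI-compatible layout, where the chat endpoint is the base URL plus `/chat/completions` and the key is sent as a bearer token; `buildChatRequest` is a hypothetical helper that only prepares the request, so nothing is sent until `fetch` is called.

```javascript
// Prepare a raw HTTP request for the OpenAI-compatible chat endpoint.
// `buildChatRequest` is a hypothetical helper, not part of any SDK.
function buildChatRequest(apiKey, userText) {
  return {
    url: "https://openrouter.ai/api/v1/chat/completions",
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "openai/gpt-4o-2024-08-06",
        messages: [{ role: "user", content: userText }],
      }),
    },
  };
}

// Usage (Node 18+ or a browser, where fetch is global):
//   const { url, options } = buildChatRequest(process.env.OPENROUTER_API_KEY, "Hello!");
//   const res = await fetch(url, options);
//   const data = await res.json();
//   console.log(data.choices[0].message.content);
```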

Editor's verdict

Our take on GPT-4o (2024-08-06)

GPT-4o (2024-08-06) is OpenAI's proprietary multimodal model with a 128K-token context window.

On independent testing it scores 14.5 on the Artificial Analysis Intelligence Index, running at roughly 102 tokens per second with about 0.89s to first token.

At $10.00 per 1M output tokens, it is premium-priced for its class.

It is available through OpenAI's API and aggregators like OpenRouter.

It is best suited to tasks that need strong cross-modal integration, and it is versatile across creative and analytical work.

Frequently asked questions

The model supports a context window of 128,000 tokens.
