Inference & Endpoints
AI API Providers
Where to run AI models in production. Compare pricing, speed, free credits and supported models across inference providers.
Anthropic API
pay-per-token
Official API for the Claude model family, with prompt caching and strong long-context support.
AWS Bedrock
pay-per-token
Amazon's managed service for accessing multiple foundation models with enterprise security, guardrails and agents.
Azure OpenAI
pay-per-token
Enterprise access to OpenAI models with Azure's compliance, networking and regional deployment options.
Fireworks AI
pay-per-token
Fast, production-grade inference for open models with FireAttention optimisation and fine-tuning.
Google Vertex AI
pay-per-token
Google Cloud's enterprise AI platform for Gemini and open models with grounding and MLOps tooling.
Groq
pay-per-token
LPU-based inference provider delivering extremely fast token throughput for open models.
OpenAI API
pay-per-token
Direct access to OpenAI's GPT, reasoning, image and audio models via a mature, widely supported API.
OpenRouter
pay-per-token
A single API and marketplace that routes requests across hundreds of models and providers with automatic fallbacks.
Replicate
pay-per-second
A platform to run and deploy thousands of open models — especially image, video and audio — via a simple API.
Together AI
pay-per-token
Inference cloud for 200+ open models with fine-tuning and dedicated GPU endpoints.
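The listings above use two billing units, pay-per-token and pay-per-second, which cannot be compared directly. A minimal sketch of how the two could be put on a per-request basis follows. Every price and workload figure in it is a hypothetical placeholder, not any provider's actual rate, and the function names are illustrative only.

```python
# Rough per-request cost estimates for the two billing models listed above.
# All prices and workload sizes below are HYPOTHETICAL placeholders;
# check each provider's pricing page for real figures.

def token_cost(input_tokens: int, output_tokens: int,
               usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Cost in USD of one request billed per token (prices are per million tokens)."""
    return (input_tokens * usd_per_m_input
            + output_tokens * usd_per_m_output) / 1_000_000


def second_cost(runtime_seconds: float, usd_per_second: float) -> float:
    """Cost in USD of one request billed per second of compute time."""
    return runtime_seconds * usd_per_second


if __name__ == "__main__":
    # Hypothetical chat request: 2,000 input tokens, 500 output tokens,
    # at assumed rates of $3 and $15 per million tokens.
    per_token = token_cost(2_000, 500, usd_per_m_input=3.0, usd_per_m_output=15.0)

    # Hypothetical image generation: 8 seconds at an assumed $0.001 per second.
    per_second = second_cost(8.0, usd_per_second=0.001)

    print(f"pay-per-token request:  ${per_token:.4f}")
    print(f"pay-per-second request: ${per_second:.4f}")
```

Per-token cost scales with prompt and response length, while per-second cost scales with how long the model runs on the hardware, so a useful comparison needs a representative workload for each.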