
What is an Endpoint?

In AI/ML infrastructure, an endpoint is a deployed URL or network address that exposes a trained model so applications can send data and receive predictions via API calls.

An endpoint is created when a model is packaged and hosted on a server or cloud platform, turning the static model file into a live service that listens for requests.

When a request arrives (usually JSON over HTTP), the endpoint passes the input to the model, which is typically kept loaded in memory rather than reloaded per request, runs inference, and returns the output. The serving layer often adds features such as authentication, logging, and scaling.
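The request flow above can be sketched with Python's standard library alone. This is a minimal illustration, not how any particular platform implements endpoints: the toy "model" (an average of the input features), the {"features": [...]} request shape, and the names load_model, handle_request, and serve are all assumptions made for the example. The sketch loads the model once at startup, which is a common choice.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def load_model():
    """Stand-in for loading a trained model file (hypothetical)."""
    # A real service would deserialize weights here; this toy "model"
    # simply averages the input features.
    return lambda features: sum(features) / len(features)


# Loaded once when the service starts, not on every request.
MODEL = load_model()


def handle_request(body: bytes) -> tuple[int, dict]:
    """Parse a JSON request body, run inference, return (status, payload)."""
    try:
        features = json.loads(body)["features"]
        score = MODEL([float(x) for x in features])
    except (ValueError, KeyError, TypeError, ZeroDivisionError):
        return 400, {"error": 'expected JSON like {"features": [1.0, 2.0]}'}
    return 200, {"prediction": score}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_request(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Start the endpoint; blocks until interrupted."""
    HTTPServer((host, port), PredictHandler).serve_forever()
```

Calling serve() starts a local endpoint that accepts POST requests. Keeping handle_request separate from the HTTP handler means the inference path can be exercised without opening a socket.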

Most serving platforms let an endpoint host several model versions, split traffic between them for A/B testing, and monitor the service, so teams can update models without breaking downstream apps.
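One way an endpoint might split traffic between model versions is to hash a caller ID into a bucket, so the same caller always reaches the same version. The sketch below is an assumption-laden illustration: the MODELS registry, the version names, the v2_share parameter, and the functions pick_version and predict are hypothetical, and real platforms usually configure traffic splits declaratively instead.

```python
import hashlib

# Hypothetical registry mapping version names to callable models.
MODELS = {
    "v1": lambda x: x * 2,
    "v2": lambda x: x * 2 + 1,
}


def pick_version(user_id: str, v2_share: float = 0.1) -> str:
    """Deterministically assign a caller to a version by hashing the ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a bucket in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return "v2" if bucket < v2_share else "v1"


def predict(user_id: str, x: float, v2_share: float = 0.1) -> dict:
    """Route the request and report which version answered."""
    version = pick_version(user_id, v2_share)
    return {"model_version": version, "prediction": MODELS[version](x)}
```

Returning the version alongside the prediction lets downstream monitoring attribute each result to the model that produced it.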

Example

A mobile app sends a photo to https://api.company.com/v1/classify; the endpoint runs an image-classification model and replies with the predicted labels and confidence scores.
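From the client side, the call in this example might look like the sketch below. The URL is the illustrative one from the example; the request field image_b64, the response shape {"labels": [{"label": ..., "confidence": ...}]}, and the helper names are assumptions, since the actual contract depends on the service.

```python
import base64
import json
import urllib.request

# Illustrative URL taken from the example above, not a real service.
ENDPOINT_URL = "https://api.company.com/v1/classify"


def build_request(image_bytes: bytes) -> urllib.request.Request:
    """Wrap the photo in a JSON body (assumed schema) as a POST request."""
    body = json.dumps(
        {"image_b64": base64.b64encode(image_bytes).decode("ascii")}
    ).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def top_label(response_body: bytes) -> tuple[str, float]:
    """Pick the highest-confidence label from the assumed response shape."""
    labels = json.loads(response_body)["labels"]
    best = max(labels, key=lambda item: item["confidence"])
    return best["label"], best["confidence"]


def classify(image_bytes: bytes, timeout: float = 5.0) -> tuple[str, float]:
    """Send the photo and return the best label; performs a network call."""
    with urllib.request.urlopen(build_request(image_bytes), timeout=timeout) as resp:
        return top_label(resp.read())
```

Only classify touches the network; build_request and top_label can be checked offline against whatever schema the real endpoint documents.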

Why it matters

Endpoints are the bridge that turns trained models into usable products, enabling real-time inference at scale in production systems.

Frequently asked questions

Is an endpoint the same as a model?

No. The model is the trained file; the endpoint is the running service that makes the model accessible over the network.