What is Epoch?
An epoch is one complete pass of a machine learning model through the entire training dataset during training.
In practice, datasets are usually split into smaller batches. The model updates its parameters after each batch, and an epoch finishes only after every batch has been processed once.
Training typically requires multiple epochs so the model can gradually improve its weights. The exact number is a tunable hyperparameter that affects how well the model learns patterns versus memorizing noise.
Monitoring metrics like loss or accuracy on a validation set helps decide when to stop, preventing underfitting from too few epochs or overfitting from too many.
Example
A dataset of 60,000 images is divided into batches of 32. One epoch occurs after the model has processed all 1,875 batches, seeing every image exactly once.
Why it matters
The number of epochs controls how thoroughly a model learns from data and directly influences final accuracy and training cost. Proper epoch selection is essential for building effective, generalizable AI systems today.
Frequently asked questions
An iteration is one parameter update using a single batch, while an epoch is the full pass through all batches in the dataset.
Related terms
Batch size is the number of training examples processed together in a single forward and backward pass during model training.
Overfitting happens when a machine learning model learns the training data too closely, including its noise and quirks, so it fails to perform well on new, unseen data.
Gradient descent is an optimization algorithm that finds the minimum of a function by repeatedly moving in the direction of the steepest downward slope. In machine learning it is used to minimize a model's error by adjusting parameters step by step.
A hyperparameter is a value or setting chosen by the user before training a machine learning model that controls the learning process itself.
Data augmentation is a technique that artificially increases the size and diversity of a training dataset by creating modified versions of existing data samples.
A dataset is a structured collection of data points used to train, validate, or test machine learning models.