What is Underfitting?
Underfitting happens when a machine learning model is too simple to capture the patterns in the training data, leading to poor performance on both training and unseen data.
It occurs when the chosen model lacks sufficient complexity or capacity, such as using a linear model for nonlinear relationships, resulting in high bias and an inability to learn meaningful features from the data.
Key signs include high error rates during training and testing, as the model fails to approximate the true underlying function despite seeing the examples.
It is often addressed by increasing model complexity, adding more features, or reducing regularization to allow better fitting of the data patterns.
Example
Imagine using a straight line to predict house prices based on size when the real relationship is curved; the simple line misses key trends and makes inaccurate predictions for both known and new houses.
Why it matters
Underfitting produces unreliable AI systems that cannot generalize or deliver value, so recognizing and fixing it is essential for building effective models in applications like prediction and classification.
Frequently asked questions
Check if training error is high and similar to test error; this shows the model isn't learning the data patterns well.
Related terms
Overfitting happens when a machine learning model learns the training data too closely, including its noise and quirks, so it fails to perform well on new, unseen data.
Regularization is a set of techniques in machine learning that reduce overfitting by adding a penalty term to the model's loss function, discouraging overly complex or large parameter values.
Adam (Adaptive Moment Estimation) is a popular optimization algorithm used to train machine learning models by iteratively updating parameters based on gradients.
Classification is a supervised machine learning task that assigns input data to one of several predefined categories or classes based on patterns learned from labeled training examples.
Clustering is an unsupervised machine learning technique that automatically groups similar data points together into clusters based on their features, without using any labeled examples.
Gradient descent is an optimization algorithm that finds the minimum of a function by repeatedly moving in the direction of the steepest downward slope. In machine learning it is used to minimize a model's error by adjusting parameters step by step.