What are typical tasks solved with unsupervised learning?

Common tasks include clustering similar data points, reducing the number of features, detecting outliers, and finding association rules.
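As a rough illustration of the outlier-detection task, the sketch below flags values that sit far from the median, using the median absolute deviation (MAD). The function name `mad_outliers`, the sample values, and the 3.5 cutoff are assumptions chosen for the example, not part of any particular library.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    Hypothetical helper: no labels are used, only the spread of the data.
    """
    med = median(values)
    # Median absolute deviation: a robust measure of spread.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Spread is degenerate; this sketch simply reports no outliers.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # for normally distributed data (a common convention, assumed here).
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

# Illustrative daily order counts with one unusual day.
daily_orders = [10, 11, 9, 10, 12, 10, 95]
unusual = mad_outliers(daily_orders)
```

With this made-up data, only the value 95 is flagged; a different threshold or a different notion of spread would give different results.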

Can unsupervised learning make predictions?

Not directly. It is mainly used for exploration and pattern discovery, though its outputs, such as cluster assignments or reduced feature sets, can later serve as inputs to supervised models.

What is Unsupervised Learning?

Unsupervised learning is a machine learning method that trains models on unlabeled data to find hidden patterns, structures, or relationships without any guidance on correct outputs.

Unlike supervised learning, unsupervised algorithms receive no labeled examples or target answers. They explore the data on their own to group similar items, reduce complexity, or spot anomalies.

Common techniques include clustering (such as k-means), dimensionality reduction (such as PCA), and association rule learning. These methods rely on statistical properties like distance, density, or co-occurrence within the data.
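To make the co-occurrence idea concrete, here is a minimal sketch of association rule discovery restricted to item pairs. It counts how often two items appear in the same basket (support) and how often the second item appears given the first (confidence). The function `pair_rules`, the `min_support` parameter, and the sample baskets are all hypothetical; practical systems typically use a full algorithm such as Apriori over larger itemsets.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    Hypothetical helper for illustration; it only considers pairs.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of baskets containing both items
        if support < min_support:
            continue
        # Confidence of "a -> b" is P(b in basket | a in basket).
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules

# Made-up shopping baskets.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "milk"],
    ["butter", "jam"],
    ["bread", "butter", "milk"],
]
rules = pair_rules(baskets, min_support=0.5)
```

On these baskets, only the bread and butter pair clears the 0.5 support cutoff, so the result holds two rules (one in each direction). No labels are involved at any point; the rules come purely from counting.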

The goal is often exploratory: to summarize large datasets, discover natural groupings, or prepare data for further analysis rather than to make explicit predictions.

Example

An online retailer feeds customer purchase histories into an unsupervised algorithm that automatically groups shoppers into segments such as 'budget buyers' or 'premium shoppers' without any pre-labeled categories.
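A segmentation like this could be sketched with a small k-means implementation. Everything in the block is an assumption made for illustration: the two features (average order value and orders per month), the customer values, the function name `kmeans`, and the choice of two clusters. The algorithm returns only numeric cluster ids; names such as 'budget buyers' are attached by a person afterwards.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest_index(point, centroids):
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: returns (labels, centroids). Illustrative sketch only."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    labels = [nearest_index(p, centroids) for p in points]
    return labels, centroids

# Made-up customers: (average order value, orders per month).
customers = [
    (12, 8), (15, 9), (10, 7), (14, 10),      # small, frequent orders
    (120, 2), (135, 1), (110, 3), (128, 2),   # large, infrequent orders
]
labels, centroids = kmeans(customers, k=2)
```

With data this well separated, the first four customers end up in one cluster and the last four in the other. In practice the features would usually be rescaled first, since k-means relies on raw distances and a large-valued feature can dominate the result.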

Why it matters

Most real-world data is unlabeled, and labeling it by hand is slow and costly. Unsupervised learning makes it possible to explore such data at scale, discover useful features, and preprocess the massive unlabeled datasets common in modern AI applications.

Frequently asked questions

How does unsupervised learning differ from supervised learning?

Supervised learning uses labeled data with known correct answers, while unsupervised learning works with unlabeled data and must discover structure on its own.

Related terms

Supervised Learning

Supervised learning is a machine learning method where a model is trained on data that already has correct answers attached, so it can learn to predict those answers for new data.

Clustering

Clustering is an unsupervised machine learning technique that automatically groups similar data points together into clusters based on their features, without using any labeled examples.

Semi-Supervised Learning

Semi-supervised learning is a machine learning approach that combines a small amount of labeled data with a large amount of unlabeled data to train models more effectively than using either alone.

Reinforcement Learning

Reinforcement learning (RL) is a machine learning method where an agent learns to make sequential decisions by interacting with an environment, receiving rewards or penalties, and aiming to maximize its long-term reward.

Adam Optimizer

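Adam (Adaptive Moment Estimation) is a popular optimization algorithm used to train machine learning models. It iteratively updates parameters using running estimates of the first and second moments of the gradients, which gives each parameter its own adaptive step size.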

Classification

Classification is a supervised machine learning task that assigns input data to one of several predefined categories or classes based on patterns learned from labeled training examples.