What is Federated Learning?
Federated learning is a machine learning technique that trains models across many decentralized devices or servers, each holding its own local data, without ever moving the raw data to a central location.
In federated learning, a central server first sends an initial model to participating devices. Each device then trains the model locally using only its own private data and computes updates such as gradients or parameter changes.
These updates are sent back to the server, which aggregates them (often by averaging) to improve a shared global model. The updated global model is then redistributed to devices for the next round, repeating until the model converges.
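The round-based loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: it assumes a tiny linear model y = w*x + b, three simulated clients held in one process, and size-weighted averaging in the style of federated averaging. The names local_train, federated_average, and run_rounds are hypothetical helpers invented for this sketch.

```python
def local_train(weights, data, lr=0.1, epochs=5):
    """Train on one client's private data; only the resulting weights leave the client."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of mean squared error for the model y = w*x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients, rounds=50):
    """Repeat: send global model out, train locally, aggregate on the server."""
    global_weights = [0.0, 0.0]  # initial model sent to every client
    for _ in range(rounds):
        updates = [local_train(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Simulated clients: each holds a different slice of data drawn from y = 2x + 1.
clients = [
    [(x, 2 * x + 1) for x in (-1.0, -0.5, 0.0)],
    [(x, 2 * x + 1) for x in (0.0, 0.5, 1.0, 1.5)],
    [(x, 2 * x + 1) for x in (-2.0, 2.0)],
]
final_w, final_b = run_rounds(clients)
print(final_w, final_b)  # approaches w = 2, b = 1
```

The server function only ever receives weight lists and dataset sizes, never the (x, y) pairs themselves. Real systems add client sampling, secure communication, and handling for devices that drop out, all of which this sketch omits.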
The approach emphasizes data privacy, reduces the need to transfer large datasets, and works well when data is naturally distributed or sensitive, such as on mobile phones or in hospitals.
Example
A smartphone keyboard app improves its next-word prediction model by training on each user's typing habits locally; only the model updates are shared with the central server, never the actual messages typed by users.
Why it matters
It enables large-scale training on private, distributed data while reducing data-transfer costs, and because raw data never leaves its source, it can make compliance with privacy regulations easier. This makes it a strong fit for mobile, edge, and healthcare AI applications.
Frequently asked questions
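Does the raw data stay on the device?
Yes. Raw data stays on the device and only model updates are shared with the server. Because updates can still reveal some information about the data they were trained on, extra techniques like differential privacy can add further protection.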
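As a rough illustration of that extra protection, a client can bound the size of its update and add random noise before sending it. The sketch below shows only those two ingredients (L2 clipping and Gaussian noise). It assumes updates are plain lists of floats, the function names are hypothetical, and the default max_norm and noise_std values are arbitrary placeholders that are not calibrated to any formal privacy budget.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale the update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [v * scale for v in update]


def add_gaussian_noise(update, noise_std, rng):
    """Add independent zero-mean Gaussian noise to every coordinate."""
    return [v + rng.gauss(0.0, noise_std) for v in update]


def privatize_update(update, max_norm=1.0, noise_std=0.1, rng=None):
    """Clip, then add noise; placeholder parameters, not a calibrated guarantee."""
    rng = rng or random.Random()
    return add_gaussian_noise(clip_update(update, max_norm), noise_std, rng)


raw_update = [3.0, 4.0]                          # L2 norm is 5
clipped = clip_update(raw_update, max_norm=1.0)  # rescaled to L2 norm 1
noisy = privatize_update(raw_update, rng=random.Random(0))
```

Clipping limits how much any single client can influence the aggregate, which is what lets added noise mask individual contributions. Choosing the noise level to meet a specific privacy guarantee requires a separate accounting step that is outside the scope of this sketch.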
Related terms
Differential privacy is a mathematical framework that adds controlled random noise to data or query results so that the inclusion or exclusion of any single individual's information has only a small, provably bounded effect on the output.
Batch size is the number of training examples processed together in a single forward and backward pass during model training.
Chunking is the process of breaking large datasets, documents, or files into smaller, fixed-size or semantically meaningful segments. It is a common data preprocessing step in AI/ML pipelines to manage memory and enable efficient processing.
Cosine similarity measures how similar two vectors are by computing the cosine of the angle between them, ignoring their magnitudes.
Data augmentation is a technique that artificially increases the size and diversity of a training dataset by creating modified versions of existing data samples.
Data labeling is the process of adding tags or annotations to raw data so that machine learning models can learn from it during training.