Skip to content
Sign in

What is Word Embedding?

Also known as: Word2Vec

Word embedding is a technique that represents words as dense numerical vectors in a continuous space, allowing machines to capture semantic relationships between words.

Instead of treating words as isolated symbols, word embeddings map each word to a point in a high-dimensional vector space where similar words end up close together based on their usage in text.

Methods like Word2Vec train a shallow neural network to predict a word from its surrounding context (or vice versa), learning vector representations that encode meaning and analogies.

This dense representation is much more efficient and meaningful than sparse methods like one-hot encoding, enabling models to generalize across related terms.

Example

In a trained embedding space, the vector for 'king' minus the vector for 'man' plus the vector for 'woman' results in a vector very close to 'queen'.

Why it matters

Word embeddings form the foundation of modern NLP systems, powering semantic search, machine translation, chatbots, and large language models by giving them a numerical understanding of language meaning.

Frequently asked questions

One-hot encoding creates sparse vectors with a single 1 and many zeros, while word embeddings create dense vectors that capture semantic similarity and relationships.