What is Beam Search?

Beam search is a decoding algorithm used in NLP to generate sequences, such as sentences, by tracking several high-probability candidates in parallel instead of committing to just one.

It maintains a fixed number (the beam width) of the most promising partial sequences at each generation step, scoring each one by the cumulative probability the model assigns to its tokens (in practice, usually a sum of log-probabilities).
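As a rough illustration of the scoring step, the sketch below ranks two partial sequences by the sum of their token log-probabilities. The function name sequence_log_prob and the probability values are hypothetical, chosen only for this example; a real system would read these probabilities from the model.

```python
import math
from typing import Sequence


def sequence_log_prob(token_probs: Sequence[float]) -> float:
    """Score a partial sequence as the sum of its tokens' log-probabilities."""
    return sum(math.log(p) for p in token_probs)


# Hypothetical per-token probabilities for two partial sequences.
candidate_a = [0.6, 0.5]  # strong first token, weaker second token
candidate_b = [0.4, 0.9]  # weaker first token, strong second token

score_a = sequence_log_prob(candidate_a)
score_b = sequence_log_prob(candidate_b)

# The higher (less negative) score marks the more probable sequence.
best = "candidate_b" if score_b > score_a else "candidate_a"
```

Adding log-probabilities ranks sequences the same way as multiplying raw probabilities, but it avoids numerical underflow when sequences get long.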

At every time step, each sequence in the beam is expanded with possible next tokens; the algorithm then keeps only the top-scoring candidates and discards the rest.
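The expand-and-prune loop described above can be sketched in a few lines. This is a minimal illustration, not a production decoder: the lookup table TOY_MODEL, the helper toy_next_token_probs, and the end-of-sequence marker "</s>" are all assumptions made for the example, standing in for a real model's next-token distribution.

```python
import math
from typing import Callable, Dict, List, Tuple

Seq = Tuple[str, ...]

# Hypothetical next-token probabilities, keyed by the tokens generated so far.
TOY_MODEL: Dict[Seq, Dict[str, float]] = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"cat": 0.5, "dog": 0.5},
    ("a",): {"cat": 0.9, "dog": 0.1},
}


def toy_next_token_probs(seq: Seq) -> Dict[str, float]:
    # Any prefix not in the table ends the sequence with certainty.
    return TOY_MODEL.get(seq, {"</s>": 1.0})


def beam_search(
    next_token_probs: Callable[[Seq], Dict[str, float]],
    beam_width: int,
    max_steps: int,
    eos: str = "</s>",
) -> List[Tuple[Seq, float]]:
    """Return up to beam_width (sequence, log-probability) pairs, best first."""
    beam: List[Tuple[Seq, float]] = [((), 0.0)]
    for _ in range(max_steps):
        candidates: List[Tuple[Seq, float]] = []
        for seq, score in beam:
            if seq and seq[-1] == eos:
                candidates.append((seq, score))  # finished: carry over unchanged
                continue
            # Expand this sequence with every possible next token.
            for token, p in next_token_probs(seq).items():
                if p > 0.0:
                    candidates.append((seq + (token,), score + math.log(p)))
        # Keep only the top-scoring candidates and discard the rest.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:beam_width]
        if all(seq and seq[-1] == eos for seq, _ in beam):
            break
    return beam


wide = beam_search(toy_next_token_probs, beam_width=2, max_steps=3)
greedy = beam_search(toy_next_token_probs, beam_width=1, max_steps=3)
```

In this toy table, a beam width of 1 commits to "the" at the first step and ends with a sequence of probability 0.30, while a beam width of 2 also keeps "a" alive and finds "a cat" with probability 0.36.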

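This approach trades off quality against efficiency: it avoids the shortsightedness of always picking the single best token (greedy decoding) without paying the computational cost of scoring every possible sequence. It is a heuristic, so it is not guaranteed to find the most probable sequence overall.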
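To make the cost difference concrete, the sketch below counts how many candidates each strategy would have to score. The function names and the vocabulary size, length, and beam width are illustrative assumptions, and the beam figure is an upper bound that ignores sequences finishing early.

```python
def exhaustive_sequences(vocab_size: int, length: int) -> int:
    """Number of distinct sequences of the given length over the vocabulary."""
    return vocab_size ** length


def beam_candidates_upper_bound(vocab_size: int, length: int, beam_width: int) -> int:
    """Upper bound on candidates scored: each step expands at most beam_width
    sequences, each with vocab_size possible next tokens."""
    return beam_width * vocab_size * length


# Illustrative numbers only: a 10-token vocabulary and 5-token outputs.
exhaustive = exhaustive_sequences(vocab_size=10, length=5)
beam = beam_candidates_upper_bound(vocab_size=10, length=5, beam_width=3)
greedy = beam_candidates_upper_bound(vocab_size=10, length=5, beam_width=1)
```

Exhaustive search grows exponentially with sequence length, whereas the beam search bound grows linearly in both the length and the beam width.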

Example

When translating 'Hello world' to French, a beam width of 3 keeps the top three partial translations at each step and finally selects the complete sentence with the highest overall probability among them, rather than committing to whichever first word looked best.

Why it matters

Beam search is widely used in production NLP systems for tasks such as machine translation, summarization, and chatbots because it usually finds higher-probability, and often more fluent and accurate, output than greedy decoding while remaining practical to run.

Frequently asked questions

What is beam width?

Beam width (or beam size) is the number of candidate sequences kept at each step; larger widths explore more options but increase computation and memory roughly in proportion to the width.