What is Part-of-Speech Tagging?
Also known as: POS
Part-of-Speech Tagging (POS tagging) is the NLP task of labeling each word in a sentence with its grammatical category, such as noun, verb, adjective, or adverb.
It works by analyzing both the word itself and its surrounding context to decide the correct tag, since many words can belong to multiple categories depending on usage. For example, "book" is a noun in "read a book" but a verb in "book a flight".
Traditional approaches use hand-written rules or statistical models such as Hidden Markov Models, while modern taggers typically use neural networks. Both statistical and neural taggers are trained on annotated corpora.
POS tagging is usually an early step in NLP pipelines: the word-level grammatical categories it produces feed higher-level tasks such as syntactic parsing and named entity recognition.
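To make the statistical approach concrete, here is a minimal sketch of a Hidden Markov Model tagger decoded with the Viterbi algorithm. The three-tag set and every probability in START, TRANS, and EMIT are made-up values for illustration, not estimates from a real corpus; an actual tagger would learn them from annotated data and would normally work with log probabilities to avoid underflow.

```python
# Toy HMM tagger. All probabilities below are hypothetical, chosen only
# to show how context changes the tag assigned to an ambiguous word.
STATES = ["DT", "NN", "VB"]

# P(first tag in the sentence)
START = {"DT": 0.6, "NN": 0.3, "VB": 0.1}

# P(next tag | previous tag)
TRANS = {
    "DT": {"DT": 0.05, "NN": 0.9, "VB": 0.05},
    "NN": {"DT": 0.1, "NN": 0.3, "VB": 0.6},
    "VB": {"DT": 0.5, "NN": 0.4, "VB": 0.1},
}

# P(word | tag); "walks" can be emitted by both NN and VB
EMIT = {
    "DT": {"the": 1.0},
    "NN": {"dog": 0.5, "walk": 0.3, "walks": 0.2},
    "VB": {"walk": 0.4, "walks": 0.6},
}


def viterbi(words):
    """Return the most probable tag sequence for a list of lowercase words."""
    if not words:
        return []
    # Each column maps tag -> (probability of the best path ending in that
    # tag, previous tag on that path).
    columns = [
        {tag: (START[tag] * EMIT[tag].get(words[0], 0.0), None) for tag in STATES}
    ]
    for word in words[1:]:
        previous = columns[-1]
        column = {}
        for tag in STATES:
            # Best way to arrive at `tag` from any previous tag.
            prob, prev_tag = max(
                (previous[p][0] * TRANS[p][tag], p) for p in STATES
            )
            column[tag] = (prob * EMIT[tag].get(word, 0.0), prev_tag)
        columns.append(column)

    # Follow the back-pointers from the best final tag.
    tag = max(STATES, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))


print(viterbi(["the", "dog", "walks"]))  # "walks" follows a noun
print(viterbi(["the", "walks"]))         # "walks" follows a determiner
```

With these toy numbers, "walks" is tagged VB after "dog" and NN directly after "the", which is the context-dependent disambiguation described above.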
Example
In the sentence "The quick brown fox jumps", POS tagging would label "The" as a determiner, "quick" and "brown" as adjectives, "fox" as a noun, and "jumps" as a verb.
Why it matters
POS tagging is a foundational NLP technique that improves accuracy in applications like machine translation, sentiment analysis, and chatbots by helping systems understand sentence structure.
Frequently asked questions
Common POS tags include NN (singular noun), VB (base-form verb), JJ (adjective), and DT (determiner), all drawn from the Penn Treebank tagset, a widely used standard.
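As a small illustration, the example sentence from earlier can be stored as (word, tag) pairs and read back with a tag lookup table. The PENN_TAGS dictionary and the describe helper are hypothetical names for this sketch, and the table covers only a handful of tags. Note one detail that goes beyond the text above: in the Penn Treebank tagset, "jumps" would normally receive the more specific tag VBZ (verb, third person singular present) rather than plain VB.

```python
# A tiny subset of the Penn Treebank tagset (illustrative, not complete).
PENN_TAGS = {
    "DT": "determiner",
    "JJ": "adjective",
    "NN": "noun, singular or mass",
    "VB": "verb, base form",
    "VBZ": "verb, 3rd person singular present",
}

# Hand-written tags for the example sentence; a tagger would produce these.
tagged = [
    ("The", "DT"),
    ("quick", "JJ"),
    ("brown", "JJ"),
    ("fox", "NN"),
    ("jumps", "VBZ"),
]


def describe(pairs):
    """Turn (word, tag) pairs into human-readable lines."""
    return [f"{word}: {PENN_TAGS[tag]}" for word, tag in pairs]


for line in describe(tagged):
    print(line)
```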
Related terms
Tokenization is the process of breaking text into smaller units called tokens that language models can process numerically.
Named Entity Recognition (NER) is a natural language processing task that automatically finds and classifies specific names and terms in text into categories like people, organizations, locations, or dates.
A Transformer is a neural network architecture that processes sequential data like text using self-attention to weigh relationships between all parts of the input at once.
Beam search is a decoding algorithm used in NLP to generate sequences like sentences by exploring multiple high-probability paths instead of just one.
An embedding (or vector embedding) is a way to represent words, sentences, or other data as dense numerical vectors in a high-dimensional space so that similar items end up close together.
Greedy decoding is a text generation strategy in NLP where, at each step, the model selects the single token with the highest probability as the next output.
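Two of the related terms above, greedy decoding and beam search, are easiest to compare side by side. The sketch below runs both over a hypothetical table of next-token probabilities; the table NEXT_TOKEN_PROBS, its tokens, and its numbers are invented for illustration and stand in for a real model. It also assumes all sequences have the same length, so it skips the handling of finished hypotheses that a full beam search needs.

```python
# Hypothetical next-token probabilities, keyed by the tokens generated so far.
# A real model would compute these; the values are made up for illustration.
NEXT_TOKEN_PROBS = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def greedy_decode(max_len=2):
    """At each step, keep only the single most probable next token."""
    seq, prob = (), 1.0
    for _ in range(max_len):
        options = NEXT_TOKEN_PROBS.get(seq)
        if not options:
            break
        token = max(options, key=options.get)
        prob *= options[token]
        seq += (token,)
    return seq, prob


def beam_search(beam_width=2, max_len=2):
    """At each step, keep the `beam_width` most probable partial sequences."""
    beams = [((), 1.0)]
    for _ in range(max_len):
        candidates = [
            (seq + (token,), prob * p)
            for seq, prob in beams
            for token, p in NEXT_TOKEN_PROBS.get(seq, {}).items()
        ]
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]  # highest-scoring sequence and its probability


print(greedy_decode()[0])  # commits to "a" first, then the best follow-up
print(beam_search()[0])    # keeps "b" alive and finds a better sequence
```

In this toy table, greedy decoding commits to "a" because it is the most probable first token, while a beam of width 2 also keeps "b" and ends with a higher-probability sequence overall. With beam_width=1 the two functions behave the same.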