Can precision be 100%?

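Yes, but only if every positive prediction is correct, meaning the model produces no false positives. Reaching 100% usually requires predicting the positive class very conservatively, which tends to miss many real positives.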

Is high precision always better?

Not necessarily; it depends on the problem and whether false positives or false negatives are more harmful.

What is Precision?

Precision is an evaluation metric for classification models that measures the proportion of true positive predictions among all positive predictions made.

It is calculated as true positives divided by the sum of true positives and false positives, so it considers only the cases the model labeled as positive.
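Written as a formula, the definition above looks like this. The symbols TP and FP (counts of true positives and false positives) are shorthand chosen here for illustration:

```latex
\[
\text{Precision} = \frac{TP}{TP + FP}
\]
```

The ratio is undefined when the model makes no positive predictions at all (TP + FP = 0); how to report that case is a convention that varies between tools.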

A high precision score means the model makes few false positive errors when predicting the positive class. It does not account for false negatives, the actual positives the model missed.

Precision is commonly paired with recall because improving one often reduces the other; the F1 score, their harmonic mean, summarizes the balance between them.
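The trade-off can be seen by sweeping a decision threshold over model scores. The following is a minimal sketch, not a reference implementation: the labels, the scores, the thresholds, and the helper name precision_recall are all made up for illustration, and returning 0.0 for an empty denominator is an assumed convention.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumed convention: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground-truth labels and model scores (higher = more likely positive).
y_true = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]

results = {}
for threshold in (0.25, 0.50, 0.85):
    # Predict positive only when the score reaches the threshold.
    y_pred = [1 if s >= threshold else 0 for s in scores]
    results[threshold] = precision_recall(y_true, y_pred)
    p, r = results[threshold]
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data, raising the threshold moves precision from 0.57 to 0.60 to 1.00 while recall drops from 1.00 to 0.75 to 0.50. The strictest threshold reaches perfect precision only by missing half of the real positives.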

Example

In a spam filter, if the model flags 100 emails as spam and 90 of them are actually spam, precision is 0.90. This shows that 90% of its spam predictions were correct.
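The arithmetic in this example can be reproduced in a few lines. This is a small sketch with made-up data: the list flagged_true_labels is a hypothetical stand-in for the true labels of the 100 flagged emails (1 = actually spam, 0 = legitimate).

```python
# Hypothetical data mirroring the example: 100 emails flagged as spam,
# of which 90 are truly spam (1) and 10 are legitimate (0).
flagged_true_labels = [1] * 90 + [0] * 10

# Every email in this list was predicted positive, so spam emails are
# true positives and legitimate emails are false positives.
true_positives = sum(1 for label in flagged_true_labels if label == 1)
false_positives = sum(1 for label in flagged_true_labels if label == 0)

precision = true_positives / (true_positives + false_positives)
print(f"precision = {precision:.2f}")  # prints "precision = 0.90"
```

Emails the filter did not flag never enter the calculation, which is why precision alone says nothing about spam that slipped through.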

Why it matters

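Precision is critical in applications where false alarms are costly or disruptive, such as fraud detection, where a false alert can block a legitimate transaction, or medical screening, where it can trigger unnecessary follow-up tests. Tracking it tells practitioners how far a positive alert can be trusted.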

Frequently asked questions

What is the difference between precision and recall?

Precision measures correctness among predicted positives, while recall measures how many actual positives were found.

Related terms

Recall

Recall is an evaluation metric that measures the proportion of actual positive cases a model correctly identifies. It shows how well the model finds all relevant instances in the data.

F1 Score

The F1 Score is a single metric that balances precision and recall, computed as their harmonic mean, to evaluate how well a classification model performs. It is especially useful when classes are imbalanced.
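As a rough illustration, F1 is the harmonic mean of precision and recall, which pulls the score toward the weaker of the two. The sketch below uses a hypothetical helper f1_score written for this example (not a library call) and made-up input values; returning 0.0 when both inputs are zero is an assumed convention.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero (assumed convention)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up values: one balanced model, one with perfect precision but poor recall.
balanced = f1_score(0.80, 0.80)
lopsided = f1_score(1.00, 0.10)

print(f"balanced: {balanced:.2f}")  # prints "balanced: 0.80"
print(f"lopsided: {lopsided:.2f}")  # prints "lopsided: 0.18"
```

A plain average of 1.00 and 0.10 would give 0.55, whereas the harmonic mean gives about 0.18, so a model cannot score well on F1 by maximizing only one of the two metrics.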

Accuracy

Accuracy measures the proportion of correct predictions made by a machine learning model out of all predictions. It is calculated as the number of correct predictions divided by the total number of predictions.

Confusion Matrix

A confusion matrix is a table that shows how well a classification model performs by comparing its predictions to the actual labels.
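For a binary classifier the table has four cells, and precision, recall, and accuracy can all be read off it. The following is a minimal sketch on made-up labels. The layout (rows for actual labels, columns for predictions) is one common convention assumed here, and other tools may order the cells differently.

```python
from collections import Counter

# Hypothetical actual labels and model predictions (1 = positive, 0 = negative).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# Count each (actual, predicted) pair; missing pairs count as 0.
counts = Counter(zip(y_true, y_pred))
tp = counts[(1, 1)]  # actual positive, predicted positive
fn = counts[(1, 0)]  # actual positive, predicted negative
fp = counts[(0, 1)]  # actual negative, predicted positive
tn = counts[(0, 0)]  # actual negative, predicted negative

# Print the 2x2 table: rows are actual labels, columns are predictions.
print("            pred=1  pred=0")
print(f"actual=1    {tp:>6}  {fn:>6}")
print(f"actual=0    {fp:>6}  {tn:>6}")

precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = (tp + tn) / len(y_true)
print(f"precision={precision:.2f}  recall={recall:.2f}  accuracy={accuracy:.2f}")
```

On this toy data the three metrics differ (about 0.67, 0.50, and 0.70), which shows why a single number rarely tells the whole story.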

Benchmark

A benchmark is a standardized dataset and task used to measure and compare how well different AI models perform.

BLEU Score

BLEU Score is an automatic metric that evaluates machine-generated text quality, mainly for machine translation, by measuring overlap with human-written reference translations.