What is F1 Score?
The F1 Score is a single metric that balances precision and recall to evaluate how well a classification model performs, especially when the classes are imbalanced.
It is the harmonic mean of precision (how many predicted positives are actually correct) and recall (how many actual positives were found), giving equal weight to both.
The formula is F1 = 2 * (precision * recall) / (precision + recall). It ranges from 0 to 1, with 1 being perfect performance.
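The formula can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the function name `f1_score` and the sample precision and recall values are made up for the example, and returning 0.0 when both inputs are zero is a common convention assumed here rather than something the formula itself defines.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall.

    Assumption: when both inputs are 0 the formula divides by zero,
    so this sketch returns 0.0 by convention.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


# Illustrative values only (not taken from any real model).
balanced = f1_score(0.5, 0.5)   # precision and recall are equal
lopsided = f1_score(0.9, 0.1)   # same arithmetic mean, very different F1

print(f"balanced: {balanced:.2f}")
print(f"lopsided: {lopsided:.2f}")
```

Both pairs have an arithmetic mean of 0.5, but the harmonic mean pulls the lopsided pair down toward its weaker value, so a model cannot reach a high F1 by doing well on only one of the two metrics.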
Unlike accuracy, F1 does not count true negatives, so a model cannot score well simply by predicting the majority class every time. This makes it more informative than accuracy when the positive class is rare.
Example
In a medical test for a rare disease, a model might achieve high accuracy by always saying 'no disease,' but its F1 score would be 0 because it finds none of the actual cases, so its recall is 0.
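The rare-disease scenario can be worked through with made-up numbers. The sketch below assumes a hypothetical group of 1,000 patients, 10 of whom have the disease, and compares an "always no disease" model with a second hypothetical model that finds 8 of the 10 cases at the cost of 20 false alarms. The helper names are illustrative. F1 is computed as 2·TP / (2·TP + FP + FN), which is algebraically the same as the precision and recall formula, and is set to 0.0 by convention when that denominator is zero.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    denom = 2 * tp + fp + fn
    # Assumed convention: F1 is 0.0 when there are no positives to count.
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical data: 10 sick patients followed by 990 healthy ones.
y_true = [1] * 10 + [0] * 990

# Model A always predicts "no disease".
always_no = [0] * 1000

# Model B (hypothetical) finds 8 of the 10 cases and raises 20 false alarms.
screening = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970

acc_a, f1_a = accuracy_and_f1(y_true, always_no)
acc_b, f1_b = accuracy_and_f1(y_true, screening)

print(f"always no: accuracy={acc_a:.3f}, F1={f1_a:.3f}")
print(f"screening: accuracy={acc_b:.3f}, F1={f1_b:.3f}")
```

With these numbers the "always no" model has the higher accuracy (0.990 versus 0.978) yet an F1 of 0, while the screening model reaches an F1 of about 0.42, which matches the intuition that it is the more useful of the two.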
Why it matters
F1 Score is widely used to assess models on imbalanced real-world data, such as fraud detection, medical diagnosis, and content moderation, where accuracy alone can be misleading.
Frequently asked questions
What counts as a good F1 score depends on the task. As a rough rule of thumb, scores above 0.7 are often considered decent and scores above 0.9 excellent for many classification problems.
Related terms
Precision is an evaluation metric for classification models that measures the proportion of true positive predictions among all positive predictions made.
Recall is an evaluation metric that measures the proportion of actual positive cases a model correctly identifies. It shows how well the model finds all relevant instances in the data.
Accuracy measures the proportion of correct predictions made by a machine learning model out of all predictions. It is calculated as the number of correct predictions divided by the total number of predictions.
A confusion matrix is a table that shows how well a classification model performs by comparing its predictions to the actual labels.
A benchmark is a standardized dataset and task used to measure and compare how well different AI models perform.
BLEU Score is an automatic metric that evaluates machine-generated text quality, mainly for machine translation, by measuring overlap with human-written reference translations.