Average Precision

Average Precision in Object Detection Metrics

Understanding accuracy in object detection requires insight into the possible outcomes of the detection model. A model predicts either a positive or a negative class, and each prediction can be right or wrong. Take the task of determining whether an image contains a cat: the positive class might be "Cat," while the negative class could be "No Cat". Correct predictions are termed true predictions, while incorrect ones are called false predictions.

For instance, a true positive means the model correctly detected a cat that is present. A false positive means the model reported a cat where there is none. A false negative means the model missed a cat that is present, and a true negative means the model correctly reported that no cat is present. Factors like the quality and quantity of the training data, the input image, the hyperparameters, and the required accuracy threshold affect the performance of an object detection model.

The Intersection over Union (IoU) ratio is used to decide whether a predicted detection is correct. IoU measures the overlap between the predicted bounding box and the ground-truth bounding box: the area of their intersection divided by the area of their union. A prediction is typically counted as correct when its IoU meets a chosen threshold, such as 0.5.
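The IoU computation can be sketched in a few lines. This is a minimal illustration, assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max); the function name `iou`, the example boxes, and the 0.5 cutoff are illustrative choices, not part of any particular library.

```python
def iou(box_a, box_b):
    """Intersection over Union for two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle (if the boxes overlap at all).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes yield an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)       # hypothetical predicted box
ground_truth = (5, 0, 15, 10)    # hypothetical ground-truth box

score = iou(predicted, ground_truth)  # intersection 50, union 150
is_match = score >= 0.5               # assumed IoU threshold of 0.5
print(score, is_match)
```

Here the boxes share half of their width, so the IoU is one third and the prediction would not count as a match at a 0.5 threshold.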

Let's get into some vital metrics. Precision is the ratio of true positives to the total number of positive predictions made. So, if the model identified 100 cats and got 90 right, the precision is 90%.

Precision = (True Positive)/(True Positive + False Positive)

Recall, on the other hand, is the ratio of true positives to the total number of actual (relevant) positive cases. If the model correctly identifies 80 cats in an image where there are actually 100, the recall is 80%.

Recall = (True Positive)/(True Positive + False Negative)
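As a quick check of the two formulas, the sketch below plugs in the counts from the worked examples (90 correct out of 100 predictions, and 80 found out of 100 actually present). The helper names `precision` and `recall` are illustrative, and returning 0.0 for an empty denominator is an assumed convention.

```python
def precision(true_positives, false_positives):
    """Share of positive predictions that were correct."""
    total_predicted = true_positives + false_positives
    # Assumed convention: report 0.0 when the model made no positive predictions.
    return true_positives / total_predicted if total_predicted else 0.0


def recall(true_positives, false_negatives):
    """Share of actual positives that the model found."""
    total_actual = true_positives + false_negatives
    # Assumed convention: report 0.0 when there are no actual positives.
    return true_positives / total_actual if total_actual else 0.0


# 100 predictions, 90 of them right: 90 true positives, 10 false positives.
print(precision(90, 10))

# 100 objects present, 80 of them found: 80 true positives, 20 false negatives.
print(recall(80, 20))
```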

The F1 score combines precision and recall into a single value, their harmonic mean. It ranges from 0 to 1, with 1 representing perfect precision and recall.

F1 score = (Precision × Recall)/[(Precision + Recall)/2]
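The formula above is algebraically the same as the more familiar form 2 × Precision × Recall / (Precision + Recall). A minimal sketch, with the hypothetical helper name `f1_score` and an assumed convention of returning 0.0 when both inputs are zero:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Assumed convention: avoid division by zero when both are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Using the illustrative values 0.9 (precision) and 0.8 (recall).
print(round(f1_score(0.9, 0.8), 4))
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1 score by excelling at only one of precision or recall.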

A precision-recall curve plots precision (y-axis) against recall (x-axis) as the confidence threshold varies, and it is used to evaluate the performance of an object detection model. A model is considered effective when its precision stays high as recall increases.
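One way to obtain the points of such a curve is to rank detections by confidence and update precision and recall after each one. The sketch below assumes each detection has already been labeled as a true or false positive (for example by an IoU check); the function name, the input format, and the sample detections are all hypothetical.

```python
def precision_recall_points(detections, num_ground_truth):
    """Return (recall, precision) pairs, one per detection in ranked order.

    detections: list of (confidence, is_true_positive) tuples.
    num_ground_truth: number of ground-truth objects (assumed > 0).
    """
    # Highest-confidence detections come first.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    points = []
    true_positives = 0
    false_positives = 0
    for _, is_true_positive in ranked:
        if is_true_positive:
            true_positives += 1
        else:
            false_positives += 1
        recall = true_positives / num_ground_truth
        precision = true_positives / (true_positives + false_positives)
        points.append((recall, precision))
    return points


# Hypothetical detections: (confidence score, matched a ground-truth box?)
detections = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False)]
points = precision_recall_points(detections, num_ground_truth=4)
for recall, precision in points:
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

Each additional detection can only keep recall the same or raise it, while precision drops whenever a false positive is admitted, which is what gives the curve its typical sawtooth shape.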

Average Precision (AP) summarizes the precision-recall curve in a single number: the average precision over all recall levels from 0 to 1, computed at a given IoU threshold. Equivalently, AP is the area under the precision-recall curve, usually obtained by interpolating the curve across all of its points.
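The area computation can be sketched as follows. This is one common convention (all-point interpolation, where precision at each recall level is replaced by the highest precision seen at any higher recall), not the only definition in use; the function name and the sample recall and precision values are hypothetical.

```python
def average_precision(recalls, precisions):
    """Area under the interpolated precision-recall curve.

    recalls and precisions are parallel lists, ordered by increasing recall.
    """
    # Sentinel points so the curve spans recall 0 to 1.
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]

    # Interpolate: make precision non-increasing from left to right by
    # taking the running maximum from the right.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])

    # Sum the rectangles between consecutive recall values.
    area = 0.0
    for i in range(1, len(r)):
        area += (r[i] - r[i - 1]) * p[i]
    return area


# Hypothetical curve: five ranked detections against four ground-truth objects.
recalls = [0.25, 0.5, 0.5, 0.75, 0.75]
precisions = [1.0, 1.0, 2 / 3, 0.75, 0.6]
print(average_precision(recalls, precisions))
```

In this example the model never finds the fourth object, so the stretch of recall from 0.75 to 1 contributes nothing to the area.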

Mean Average Precision (mAP) is the mean of the AP values computed for each object class. Some benchmarks, such as COCO, additionally average the result over several IoU thresholds.
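The averaging step can be illustrated with a small table of AP values. The numbers, class names, and function names below are made up purely for illustration, and the exact averaging scheme differs between benchmarks.

```python
from statistics import mean

# Hypothetical AP values per class, keyed by the IoU threshold used for matching.
ap_table = {
    "cat": {0.50: 0.80, 0.75: 0.60},
    "tree": {0.50: 0.70, 0.75: 0.50},
}


def map_at_iou(ap_table, iou_threshold):
    """Mean of per-class AP values at a single IoU threshold."""
    return mean(per_iou[iou_threshold] for per_iou in ap_table.values())


def map_over_ious(ap_table, iou_thresholds):
    """Mean of the per-threshold mAP values across several IoU thresholds."""
    return mean(map_at_iou(ap_table, t) for t in iou_thresholds)


print(round(map_at_iou(ap_table, 0.50), 2))
print(round(map_over_ious(ap_table, [0.50, 0.75]), 2))
```

Reporting mAP at a single loose threshold rewards finding objects at all, while averaging over stricter thresholds also rewards tight localization.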


Precision on its own describes the correctness of positive predictions at one specific decision threshold. For instance, we might treat all model outputs below 0.5 as negative and all outputs above 0.5 as positive. In some circumstances (unbalanced classes, or the need to prioritize precision over recall or vice versa), this threshold may need adjustment. Average precision, as the area under the precision-recall curve, summarizes precision across all possible thresholds. It is a reliable indicator for comparing how well models rank their predictions, without being tied to any specific decision threshold. On a balanced binary task, an average precision of about 0.5 corresponds to chance-level predictions; more generally, the chance-level value is roughly the fraction of positive examples. At times, models that score well below chance ("expertly terrible" models) can prove useful if their judgments are reversed.
