What is Precision-Recall curve (PR AUC)?

by Stephen M. Walker II, Co-Founder / CEO

The Precision-Recall (PR) curve is a graphical representation of a classifier's performance, plotted with Precision (Positive Predictive Value) on the y-axis and Recall (True Positive Rate or Sensitivity) on the x-axis. Precision is defined as the ratio of true positives (TP) to the sum of true positives and false positives (FP), while Recall is the ratio of true positives to the sum of true positives and false negatives (FN).
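These two definitions can be sketched directly from confusion-matrix counts. The counts below (80 TP, 20 FP, 40 FN) are hypothetical, chosen only to illustrate the ratios:

```python
def precision(tp: int, fp: int) -> float:
    # Precision = TP / (TP + FP): of all positive predictions, how many were correct.
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    # Recall = TP / (TP + FN): of all actual positives, how many were found.
    return tp / (tp + fn)

# Hypothetical confusion-matrix counts for illustration:
print(precision(80, 20))  # 0.8
print(recall(80, 40))     # ≈ 0.667
```

Note the asymmetry: precision is penalized by false positives, recall by false negatives, which is why a single threshold trades one off against the other.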

The PR curve is particularly informative for binary classification problems, especially when dealing with imbalanced datasets where one class is significantly underrepresented. In such cases, other metrics like accuracy can be misleading, as they can be dominated by the majority class.

The Area Under the PR Curve (PR AUC) is a single metric summarizing the information of the PR curve. It provides a measure of a model's performance across all classification thresholds. A perfect model would have a PR AUC of 1, indicating perfect precision and recall at all thresholds. Conversely, a model with no skill would have a PR AUC equal to the proportion of positive samples in the dataset.
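The no-skill baseline is easy to see: a classifier that scores every sample identically achieves, at any threshold, a precision equal to the positive-class prevalence. A minimal sketch with a made-up label vector:

```python
# Hypothetical labels: 2 positives out of 10 samples.
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

# A no-skill classifier's precision at any threshold equals this prevalence,
# so its PR AUC is the proportion of positive samples.
prevalence = sum(labels) / len(labels)
print(prevalence)  # 0.2 — the PR AUC baseline for this dataset
```

This is why a PR AUC of, say, 0.4 can represent strong performance on a dataset where only 2% of samples are positive, while the same score would be weak on a balanced dataset.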

The PR AUC is particularly useful when the positive class is of greater interest and the data is imbalanced. It is more sensitive to performance improvements on the positive class than metrics like ROC AUC, which weigh the positive and negative classes equally.

In practice, the PR curve is created by varying the threshold for predicting a positive or negative outcome and plotting the resulting precision and recall values. The PR AUC is then calculated as the area under this curve.
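The threshold sweep described above can be sketched in a few lines of pure Python. In practice you would use a library routine (e.g. scikit-learn's `precision_recall_curve` and `average_precision_score`), but this minimal version makes the mechanics explicit; it treats each distinct score as a threshold and computes the step-wise (average-precision style) area:

```python
def pr_curve(y_true, scores):
    """Sweep each score as a threshold (highest first); return (recall, precision) pairs.
    Note: this simple sketch treats tied scores as separate thresholds."""
    ranked = sorted(zip(scores, y_true), reverse=True)
    total_pos = sum(y_true)
    tp = fp = 0
    points = []
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points

def pr_auc(points):
    """Step-wise area under the PR curve: sum of (R_n - R_{n-1}) * P_n."""
    area, prev_recall = 0.0, 0.0
    for r, p in points:
        area += (r - prev_recall) * p
        prev_recall = r
    return area

# Hypothetical labels and model scores for illustration:
pts = pr_curve([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.3])
print(pr_auc(pts))  # ≈ 0.833
```

The step-wise sum used here matches the average-precision convention; some tools instead use trapezoidal interpolation, which can give slightly different (and arguably optimistic) values on the PR curve.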

How is PR AUC different from ROC-AUC?

PR AUC and ROC AUC are both metrics for assessing classification models, but each is sensitive to different aspects of performance. ROC AUC evaluates a model's ability to discriminate between classes by plotting the True Positive Rate (TPR) against the False Positive Rate (FPR) across thresholds; an area of 1.0 signifies perfect classification. This metric is best suited to balanced datasets and situations where both classes are of equal importance.

Conversely, PR AUC focuses on a model's ability to identify positive cases, especially in imbalanced datasets. It plots Precision against Recall for various thresholds, where Precision is the ratio of true positives to all positive predictions, and Recall is the ratio of true positives to all actual positives. The area under this curve represents the PR AUC score, with 1.0 reflecting ideal precision and recall. PR AUC is preferred when the positive class is more critical, offering greater sensitivity to detecting positive class improvements.
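One useful fact behind this comparison is that ROC AUC equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one (the Mann-Whitney U statistic), which makes it independent of class prevalence. A minimal sketch of that rank-based computation, using made-up labels and scores:

```python
def roc_auc(y_true, scores):
    """ROC AUC via the Mann-Whitney U statistic: the probability that a
    random positive outscores a random negative (ties count as half)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical imbalanced example: 2 positives, 4 negatives.
print(roc_auc([1, 1, 0, 0, 0, 0], [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]))  # 0.875
```

Because ROC AUC ignores prevalence, adding many easy negatives barely moves it, whereas precision (and hence PR AUC) drops as false positives accumulate; this is the mechanical reason PR AUC is the more revealing metric on heavily imbalanced data.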
