What is Receiver Operating Characteristic Area Under Curve (ROC-AUC)?
ROC-AUC, or Receiver Operating Characteristic Area Under Curve, is a performance measurement for classification problems in machine learning. The ROC curve is a graphical representation that illustrates the performance of a binary classifier model at varying threshold values. It plots the true positive rate (TPR) against the false positive rate (FPR) at different classification thresholds.
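To make the construction of the curve concrete, here is a minimal sketch that sweeps a threshold over a handful of scores and records the resulting (FPR, TPR) pair at each step. The function name roc_points and the toy labels and scores are illustrative assumptions, not part of any library, and the sketch assumes labels are 0/1 with at least one example of each class.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, from the strictest threshold to the loosest.

    Hypothetical helper for illustration; an example is predicted positive
    when its score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # A threshold above every score predicts nothing as positive.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Toy data (made up for this example): 1 = positive class, 0 = negative class.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

for fpr, tpr in roc_points(labels, scores):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold can only add predicted positives, so both rates are non-decreasing along the sweep and the curve always runs from (0, 0) to (1, 1).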
The AUC, or Area Under the Curve, measures the two-dimensional area underneath the entire ROC curve, from (0,0) to (1,1). It provides an aggregate measure of performance across all possible classification thresholds. AUC ranges in value from 0 to 1. A model that ranks every negative example above every positive example has an AUC of 0.0; one that ranks every positive example above every negative example has an AUC of 1.0. A model that ranks examples at random has an AUC of about 0.5.
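One common way to compute this area from a finite set of curve points is the trapezoidal rule. The sketch below is a minimal illustration under that assumption; the function name auc_trapezoid and the three hand-written curves are hypothetical examples chosen to hit the boundary values.

```python
def auc_trapezoid(points):
    """Area under a piecewise-linear curve given (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Each segment contributes a trapezoid: width times average height.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hand-written (FPR, TPR) curves for three idealized models.
perfect = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # all positives ranked first
chance = [(0.0, 0.0), (1.0, 1.0)]                # the diagonal
inverted = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]  # all negatives ranked first

print(auc_trapezoid(perfect))   # 1.0
print(auc_trapezoid(chance))    # 0.5
print(auc_trapezoid(inverted))  # 0.0
```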
One way of interpreting AUC is as the probability that the model ranks a random positive example more highly than a random negative example. In other words, it represents the model's ability to distinguish between the classes. A higher AUC indicates that the model more reliably assigns higher scores to class 1 examples than to class 0 examples.
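The ranking interpretation can be checked directly by enumerating every (positive, negative) pair and counting how often the positive example gets the higher score. This is a brute-force sketch for small inputs; the name auc_pairwise, the toy data, and the convention of counting ties as half a win are assumptions made for the illustration.

```python
from itertools import product


def auc_pairwise(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5  # assumed convention: a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy data (made up for this example).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# 8 of the 9 pairs are ordered correctly; only (0.6, 0.7) is misranked.
print(auc_pairwise(labels, scores))
```

The loop is quadratic in the number of examples, so it is meant for building intuition rather than for large datasets.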
Two properties of AUC are worth noting: it is scale-invariant and classification-threshold-invariant. It measures how well predictions are ranked, rather than their absolute values, and it measures the quality of the model's predictions irrespective of which classification threshold is chosen. These characteristics can be desirable, but they may also limit the usefulness of AUC in certain use cases.
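Scale invariance can be seen by applying any strictly increasing transformation to the scores: the ranking is unchanged, so the AUC is too. The sketch below assumes the pairwise definition of AUC; pairwise_auc and the toy data are hypothetical names and values used only for this demonstration.

```python
import math


def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for this example).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# Two strictly increasing transformations of the same scores.
shrunk = [s / 10 for s in scores]                  # all values now below 0.1
logits = [math.log(s / (1 - s)) for s in scores]   # no longer in [0, 1]

print(pairwise_auc(labels, scores))
print(pairwise_auc(labels, shrunk))
print(pairwise_auc(labels, logits))
```

All three calls print the same value, even though the shrunk scores would be badly miscalibrated if read as probabilities and the logits are not probabilities at all.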
For example, scale invariance might not be desirable when we need well-calibrated probability outputs, because AUC says nothing about calibration. Similarly, classification-threshold invariance might not be desirable in cases where there are wide disparities in the cost of false negatives versus false positives, since the threshold then has to be chosen to minimize the costlier type of error.
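As a rough illustration of the cost point, the sketch below scores the same predictions under two opposite cost settings and picks the cheapest threshold for each. The AUC is identical in both cases, yet the preferred threshold moves. The helper names total_cost and best_threshold, the cost values, and the toy data are all assumptions made for this example.

```python
def total_cost(labels, scores, threshold, cost_fp, cost_fn):
    """Total misclassification cost when predicting positive for score >= threshold."""
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    return fp * cost_fp + fn * cost_fn


def best_threshold(labels, scores, cost_fp, cost_fn):
    """Cheapest threshold among the observed score values (a simplification)."""
    candidates = sorted(set(scores))
    return min(
        candidates,
        key=lambda t: total_cost(labels, scores, t, cost_fp, cost_fn),
    )


# Toy data (made up for this example).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# Missing a positive is 10x worse than a false alarm: prefer a low threshold.
print(best_threshold(labels, scores, cost_fp=1, cost_fn=10))  # 0.6

# A false alarm is 10x worse than a miss: prefer a high threshold.
print(best_threshold(labels, scores, cost_fp=10, cost_fn=1))  # 0.8
```

The search only considers thresholds equal to observed scores, which is enough to show the effect on a small sample.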