The ROC Curve and AUC

What ROC is

ROC stands for Receiver Operating Characteristic.

It’s a curve that shows the tradeoff between:

  • TPR (True Positive Rate) = recall = TP / (TP + FN)
  • FPR (False Positive Rate) = FP / (FP + TN)

As you change the probability threshold, TPR and FPR change.


  flowchart LR
  S[Model outputs scores/probabilities] --> T[Choose threshold]
  T --> R[Compute TPR and FPR]
  R --> C[Plot ROC curve]

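The sweep above can be seen directly with scikit-learn's roc_curve, which returns one (FPR, TPR) point per candidate threshold. A minimal sketch with hypothetical toy labels and scores:

```python
import numpy as np
from sklearn.metrics import roc_curve

# Toy labels and predicted scores (hypothetical example data)
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.55])

# One (FPR, TPR) point per threshold; lowering the threshold
# moves along the curve from (0, 0) toward (1, 1)
fpr, tpr, thresholds = roc_curve(y_true, y_score)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
```

Plotting fpr against tpr gives the ROC curve itself.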

How to interpret the curve

  • A curve closer to the top-left is better.
  • The diagonal line is a “random guess” baseline.

AUC

AUC is the Area Under the ROC Curve.

  • AUC = 1.0 → perfect ranking
  • AUC = 0.5 → random

AUC measures how well the model ranks positives higher than negatives.
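This ranking interpretation can be checked by hand: AUC equals the fraction of (positive, negative) pairs in which the positive example gets the higher score (ties count as half). A small sketch with hypothetical data:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy labels and scores (hypothetical example data)
y_true = np.array([0, 0, 1, 1, 0, 1])
y_score = np.array([0.2, 0.6, 0.4, 0.9, 0.1, 0.7])

pos = y_score[y_true == 1]
neg = y_score[y_true == 0]

# Fraction of (positive, negative) pairs where the positive
# is scored higher; ties contribute 0.5
pairs = (pos[:, None] > neg[None, :]).mean() \
      + 0.5 * (pos[:, None] == neg[None, :]).mean()

auc = roc_auc_score(y_true, y_score)
print("pairwise fraction:", pairs)
print("roc_auc_score:   ", auc)
```

Both numbers agree, which is why AUC is threshold-free: it only depends on how the scores order the examples.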

When ROC/AUC is useful

  • Good for comparing classifiers independently of any single threshold.
  • Useful when you care about ranking quality.

When ROC can mislead

With highly imbalanced data, FPR can stay deceptively small because true negatives dominate its denominator, so the ROC curve can look strong even when precision is poor. In those cases, PR (precision-recall) curves can be more informative.
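A sketch of this failure mode, using constructed (hypothetical) scores on a heavily imbalanced dataset: ROC-AUC looks strong while average precision (the PR-curve summary) reveals that most high-scoring examples are false positives.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Hypothetical imbalanced data: 10 positives, 1000 negatives
y_true = np.array([1] * 10 + [0] * 1000)

# Positives score 0.9, but 50 negatives score even higher (0.95);
# the remaining 950 negatives score low
y_score = np.array([0.9] * 10 + [0.95] * 50 + [0.1] * 950)

auc = roc_auc_score(y_true, y_score)           # looks strong
ap = average_precision_score(y_true, y_score)  # reveals poor precision
print("ROC-AUC:          ", auc)
print("Average precision:", ap)
```

Here FPR barely moves (50 false positives out of 1000 negatives), so ROC-AUC stays high, but a user acting on the top-scored examples would see mostly false alarms.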

Scikit-learn example

Compute ROC-AUC
from sklearn.metrics import roc_auc_score

# Toy labels and predicted probabilities for the positive class
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc_score(y_true, y_score)
print("ROC-AUC:", auc)

Mini-checkpoint

If a model has high accuracy but ROC-AUC ~ 0.5, what might be happening?

  • The model may be predicting the majority class and not learning useful ranking.
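This scenario is easy to reproduce. A sketch with hypothetical imbalanced labels, where a degenerate "model" scores every example identically and therefore always predicts the majority class:

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# 95 negatives, 5 positives (hypothetical imbalanced labels)
y_true = np.array([0] * 95 + [1] * 5)

# Constant scores: the model cannot rank anything,
# and thresholding at 0.5 always predicts the majority class
y_score = np.full(100, 0.1)
y_pred = (y_score >= 0.5).astype(int)

print("accuracy:", accuracy_score(y_true, y_pred))
print("ROC-AUC: ", roc_auc_score(y_true, y_score))
```

Accuracy is 0.95 simply because 95% of the labels are negative, while ROC-AUC is 0.5: the scores carry no ranking information at all.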
