ROC and AUC

A receiver operating characteristic (ROC) curve is a graph that looks at the performance of a diagnostic test by plotting the true positive rate against the false positive rate. In other words, itโ€™s a way to evaluate how well a particular test works, with high AUC values indicating the test works well. Read more below.

For example, letโ€™s say weโ€™re looking at how well an artificial intelligence (AI) model can predict live birth from a set of blastocyst images. We can plot the times it correctly identifies a live birth (true positive) against the times it incorrectly predicts a live birth (false positive).

What weโ€™re hoping for is a model that can maximize the identification of true positives, while minimizing false positives. If we have a bunch of different AI models, we can see how each of them perform with their own ROC curves.

To compare ROC curves, we can measure the area under the curve (AUC) for each ROC curve.

An AUC of 0.5 means the model is no better than random chance at discriminating true positives from false positives, and an AUC of 1 means the model can perfectly discriminate true positives from false positives.

If two AI models predict live birth from blastocyst images and one has a higher AUC than the other, the model with the higher AUC will predict more true positives, fewer false positives, and is generally better at predicting live births. If a model has an AUC of 0.77, that means is can correctly predict live birth 77% of the time.