On base rates and the “accuracy” of computerized Facebook gaydar
I never know what to make of reports stating the “accuracy” of some test or detection algorithm. Take this example, from a New York Times article by Steve Lohr titled How Privacy Vanishes Online:
In a class project at the Massachusetts Institute of Technology that received some attention last year, Carter Jernigan and Behram Mistree analyzed more than 4,000 Facebook profiles of students, including links to friends who said they were gay. The pair was able to predict, with 78 percent accuracy, whether a profile belonged to a gay male.
I have no idea what “78 percent accuracy” means in this context. The most obvious answer would seem to be that of all 4,000 profiles analyzed, 78% were correctly classified as gay versus not gay. But if that’s the case, I have an algorithm that beats the pants off of theirs. Are you ready for it?
Say that everybody is not gay.
Figure that around 5 to 10 percent of the population is gay. If these 4,000 students are representative of that, then saying not gay every time will yield an “accuracy” of 90-95%.
But wait — maybe by “accuracy” they mean what percentage of gay people are correctly identified as such. In that case, I have an algorithm that will be 100% accurate by that standard. Ready?
Say that everybody is gay.
You can see how silly this gets. To understand how good the test is, you need two numbers: sensitivity and specificity. My algorithms each turn out to be 100% on one and 0% on the other. Which means that they’re both crap. (A good test needs to be high on both.) I am hoping that the MIT class’s algorithm was a little better, and the useful numbers just didn’t get translated. But this news report tells us nothing that we need to know to evaluate it.