where A and B are events and P(B) ≠ 0.
Suppose a blood test used to detect the presence of a particular banned drug is 99% sensitive and 99% specific. That is, the test will produce 99% true positive results for drug users and 99% true negative results for non-drug users. Suppose that 0.5% of people are users of the drug. What is the probability that a randomly selected individual who tests positive is a user?
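The calculation this question calls for can be written out with Bayes' theorem, expanding the denominator by the law of total probability. The following is a sketch in which the event labels "User", "Non-user", and "+" (a positive test) are notation assumed here for illustration. It takes P(+ | User) = 0.99 from the sensitivity, P(+ | Non-user) = 1 − 0.99 = 0.01 from the specificity, and P(User) = 0.005 from the stated prevalence.

```latex
\begin{aligned}
P(\text{User} \mid +)
  &= \frac{P(+ \mid \text{User})\, P(\text{User})}
          {P(+ \mid \text{User})\, P(\text{User})
           + P(+ \mid \text{Non-user})\, P(\text{Non-user})} \\[4pt]
  &= \frac{0.99 \times 0.005}{0.99 \times 0.005 + 0.01 \times 0.995} \\[4pt]
  &= \frac{0.00495}{0.0149} \approx 0.332
\end{aligned}
```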
Even if an individual tests positive, it is more likely than not (1 − 33.2% = 66.8%) that they do not use the drug. Why? Even though the test appears to be highly accurate, the number of non-users is very large compared to the number of users, so the count of false positives outweighs the count of true positives.
To see this with actual numbers, if 1,000 individuals are tested, we expect 995 non-users and 5 users. Among the 995 non-users, 0.01 × 995 ≈ 10 false positives are expected. Among the 5 users, 0.99 × 5 ≈ 5 true positives are expected. Out of 15 positive results, only 5, or about 33%, are genuine.
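The same arithmetic can be checked with a few lines of Python. This is a minimal sketch, not a reference implementation: the function names `positive_predictive_value` and `expected_counts` are hypothetical helpers introduced here, and the inputs are the sensitivity, specificity, and prevalence stated in the example.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return P(user | positive test) using Bayes' theorem."""
    true_pos = sensitivity * prevalence                    # P(+ and user)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(+ and non-user)
    return true_pos / (true_pos + false_pos)


def expected_counts(sensitivity, specificity, prevalence, population):
    """Return (expected true positives, expected false positives)."""
    users = prevalence * population
    non_users = population - users
    return sensitivity * users, (1.0 - specificity) * non_users


# Values from the example: 99% sensitive, 99% specific, 0.5% prevalence.
ppv = positive_predictive_value(0.99, 0.99, 0.005)
tp, fp = expected_counts(0.99, 0.99, 0.005, 1000)

print(f"P(user | positive) = {ppv:.1%}")
print(f"expected true positives: {tp:.2f}, false positives: {fp:.2f}")
```

Before rounding, the expected counts are 4.95 true positives and 9.95 false positives, so the unrounded ratio is about 33.2%, in line with the "5 out of 15" figure above.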
The importance of specificity in this example can be seen by calculating that even if sensitivity is improved to 100%, but specificity remains at 99%, then the probability that a person who tests positive is a drug user only rises from 33.2% to 33.4%. Alternatively, if sensitivity remains 99%, but specificity is improved to 99.5%, then the probability that a person who tests positive is a drug user rises to about