Introduction to Bayes’ Theorem
Bayes’ Theorem is stated mathematically as:
p(A|B) = [ p(B|A) x p(A) ] / p(B)
Here A and B are events, and p(B) ≠ 0. An event is something that can be true or false: for example, that the next person you see is bald, or is male.
p(A|B) and p(B|A) are conditional probabilities: p(A|B) is the likelihood of event A occurring given that B is true, and p(B|A) is the likelihood of event B occurring given that A is true. p(A|B) is read briefly as “the probability of A given B”.
p(A) and p(B) are the marginal probabilities of observing A and B, each considered without regard to the other: for example, the proportion of bald people, or of males.
Among other uses, Bayes’ Theorem provides an improved method of assessing the likelihood of two non-independent events occurring simultaneously.
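The formula can be sketched as a small Python function. This is only an illustration: the function name bayes and the fair-die example are assumptions made here, not part of the original notes. Exact fractions are used so that the result is not blurred by floating-point rounding.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """Return p(A|B) = [ p(B|A) x p(A) ] / p(B)."""
    if p_b == 0:
        raise ValueError("p(B) must be non-zero")
    return p_b_given_a * p_a / p_b


# Hypothetical illustration (not from the notes): roll one fair die.
# A = "the roll is a 6", B = "the roll is even".
p_a = Fraction(1, 6)         # one face in six is a 6
p_b = Fraction(1, 2)         # three faces in six are even
p_b_given_a = Fraction(1)    # a 6 is always even

print(bayes(p_b_given_a, p_a, p_b))  # 1/3
```

Knowing that the roll is even raises the probability of a 6 from 1/6 to 1/3, because A and B are not independent.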
Example: Drug Testing
Suppose a urine test used to detect the presence of a particular banned drug is 99% sensitive and 99% specific. That is, the test gives a true positive result for 99% of drug users, and a true negative result for 99% of non-users. Suppose further that 0.5% of the population tested are drug users (incidence).
We ask: What is the probability that an individual who tests positive is a user?
Bayes’ Theorem phrases this as: what is p(User|+)? Let A be the event User and B the event +, so that p(A) = p(User) and p(B) = p(+); then
p(User|+) = [ p(+|User) x p(User) ] / p(+)
Here, p(+|User) is the sensitivity: 0.99 of Users tested will be detected. Specificity enters through p(+|Non-User) = 1 – specificity = (1 – 0.99) = 0.01: only 0.01 of Non-Users will be reported (incorrectly) as Users.
Then, p(+) is the overall probability of a positive test, including both true and false positives. The two components are
p(+) = [ p(+|User) x p(User) ] + [ p(+|Non-User) x p(Non-User) ]
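As a quick check of the two components, the sum can be evaluated in Python with the base values from the example. This is a minimal sketch; the function name p_positive and its parameter names are chosen here for illustration.

```python
def p_positive(sensitivity, specificity, incidence):
    """Overall probability of a positive test, p(+)."""
    # True positives: p(+|User) x p(User)
    true_pos = sensitivity * incidence
    # False positives: p(+|Non-User) x p(Non-User)
    false_pos = (1 - specificity) * (1 - incidence)
    return true_pos + false_pos


# Base values from the example: 99% sensitive, 99% specific, 0.5% incidence.
print(round(p_positive(0.99, 0.99, 0.005), 4))  # 0.0149
```

So about 1.49% of all tests are expected to come back positive, and most of that total comes from the false-positive term (0.00995 against 0.00495).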
Substituting the values defined above gives
p(User|+) = [ p(+|User) x p(User) ] / p(+)
= (0.99 x 0.005) / [(0.99)(0.005) + (1 – 0.99)(1 – 0.005)]
= 0.3322
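The same calculation can be reproduced in a few lines of Python. This is a sketch only; the function name p_user_given_positive is an assumption made here to mirror the notation p(User|+).

```python
def p_user_given_positive(sensitivity, specificity, incidence):
    """Return p(User|+) from the test characteristics and the incidence."""
    true_pos = sensitivity * incidence                # p(+|User) x p(User)
    false_pos = (1 - specificity) * (1 - incidence)   # p(+|Non-User) x p(Non-User)
    return true_pos / (true_pos + false_pos)          # numerator / p(+)


# Base values from the example.
print(f"{p_user_given_positive(0.99, 0.99, 0.005):.4f}")  # 0.3322
```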
That is, even if an individual tests positive, it is about twice as likely (1 – 33.22% = 66.78%) that s/he is not a User as that s/he is. Why? Even though the test appears to be highly “accurate” (99% sensitivity and specificity), the number of non-Users is very large compared to the number of Users. Under such a condition, the count of false positives outweighs the count of true positives. For example, if 1,000 individuals are tested, we expect 995 non-Users and 5 Users. Among the 995 non-Users, we expect 0.01 x 995 ≈ 10 false positives. Among the 5 Users, we expect 0.99 x 5 ≈ 5 true positives. So, out of 15 positive tests, only 5 (33%) are genuine. The test cannot be used to screen for Users.
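The head count for 1,000 tested individuals can be checked without rounding, as sketched below. The function name expected_counts is hypothetical; the figures are the base values of the example.

```python
def expected_counts(n_tested, sensitivity, specificity, incidence):
    """Expected true positives, false positives, and genuine share of positives."""
    users = n_tested * incidence
    non_users = n_tested - users
    true_pos = sensitivity * users             # Users correctly detected
    false_pos = (1 - specificity) * non_users  # non-Users wrongly flagged
    share = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, share


true_pos, false_pos, share = expected_counts(1000, 0.99, 0.99, 0.005)
print(f"{true_pos:.2f} true positives, {false_pos:.2f} false positives")
# 4.95 true positives, 9.95 false positives
print(f"{share:.1%} of positive tests are genuine")
# 33.2% of positive tests are genuine
```

The unrounded counts (4.95 and 9.95) give the same share as the formula, 33.2%; the rounded counts of 5 and 10 are a convenient approximation.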
What are the effects of improving the “accuracy” of the test? If sensitivity is increased to 100% and specificity remains at 99%, p(User|+) = 33.44%, a minuscule improvement. Alternatively, if sensitivity remains at 99% and specificity is increased to 99.5%, then p(User|+) = 49.87%, and about half the positive tests are reliable.
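These comparisons can be reproduced side by side with a short Python sketch. The function name and the scenario labels are assumptions made here; the incidence is held at the 0.5% used throughout the example.

```python
def p_user_given_positive(sensitivity, specificity, incidence):
    """Return p(User|+) for the given test characteristics."""
    true_pos = sensitivity * incidence
    false_pos = (1 - specificity) * (1 - incidence)
    return true_pos / (true_pos + false_pos)


INCIDENCE = 0.005  # 0.5% of the population tested are Users

# (sensitivity, specificity) for each scenario discussed in the text
scenarios = {
    "base values": (0.99, 0.99),
    "sensitivity raised to 100%": (1.00, 0.99),
    "specificity raised to 99.5%": (0.99, 0.995),
}

for label, (sens, spec) in scenarios.items():
    print(f"{label}: {p_user_given_positive(sens, spec, INCIDENCE):.2%}")
# base values: 33.22%
# sensitivity raised to 100%: 33.44%
# specificity raised to 99.5%: 49.87%
```

When Users are rare, halving the false-positive rate moves the result far more than eliminating the missed Users, because almost everyone tested is a non-User.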
HOMEWORK: Write an Excel spreadsheet to calculate p(User|+) for various values of Sensitivity, Specificity, and Incidence. Use the base values above as a starting point. Under what circumstances is the test most “useful”? Explain.