# Confidence Intervals without Your Collaborator's Tears

## Abstract

We provide an interpretation of the confidence interval for a binomial proportion, presented as the transcript of a hypothetical statistical consulting session.

This work is licensed under a Creative Commons
Attribution-ShareAlike 4.0 International License. The
markdown+Rknitr source code of this blog is available under a GNU General Public
License (GPL v3) from GitHub.

## The Statistical Consulting Session

**[Client]:** So I did this important experiment where I
investigated 420 turtles for a specific phenotypic trait. 52
out of my 420 turtles had the trait, i.e. the proportion in the sample
was 12.4%. Now I’d like to state a 95% confidence interval for the
proportion of the trait in the population of turtles (which is pretty
large).

**[Statistician]:** Interesting, but not so fast. Before
going into the statistical specifics I would like to talk about the
representativeness of your sample. Can you explain in more detail what
population of turtles you would like to make a statement about
and how the sample was selected?

**[Client]:** I’m surprised you care about turtles
(*eyes start to glow*). My research area is the Indo-Pacific
subpopulation of the green sea
turtle, in particular the one outside the Great Barrier Reef. The
population is believed to be a few thousand individuals and we sample by
catching turtles using the rodeo method at
specific feeding grounds. Each turtle was then tagged and a skin biopsy
was taken. From the biopsy it was determined whether the turtle is clinically
unhealthy, by measuring marker values and checking whether its specific
value was outside the corresponding reference interval.

**[Statistician]:** Hmm, that’s a lot of information.
And what population do you want to make a statement about? All green
turtles?

**[Client]:** That would be nice, but no. There are
genetic differences between the various subspecies, so I would be happy
just to make a statement about the Great Barrier Reef population.

**[Statistician]:** That sounds fair. You might get some
bias by only sampling from feeding grounds and I’m unsure about the
spatial design of your capture strategy.

**[Client]:** We gave this a lot of thought and varied
both the time and site of sampling, but there is no hypothesis that
there is great spatial variation within the reef area.

**[Statistician]:** Well, it’s my job to check the
representativeness of the sample, but it sounds like you already gave
this careful thought. From what you describe, sampling appears to be
without replacement. However, since your population is large (and also
not really known), it appears acceptable to skip the finite population
correction. Since your sample is also pretty large, let’s therefore go
with the textbook large-sample confidence interval for a binomial
proportion, that is \(\hat{p} \pm 1.96
\sqrt{\hat{p}(1-\hat{p})/n}\), where \(\hat{p}=x/n\). Don’t worry about the
equation, let’s use R and the `binom` package to compute the
interval:

```
x <- 52; n <- 420
(ci <- binom::binom.asymp(x=x, n=n, conf.level=0.95))
```

```
## method x n mean lower upper
## 1 asymptotic 52 420 0.1238095 0.09231031 0.1553087
```
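The arithmetic behind these numbers can also be checked by hand. As an illustration (a sketch in Python rather than the R used in this post, with `x = 52` and `n = 420` taken from the text above), evaluating \(\hat{p} \pm 1.96 \sqrt{\hat{p}(1-\hat{p})/n}\) directly gives the same endpoints:

```python
import math

# Hand evaluation of the large-sample (Wald) interval;
# x and n are the counts from the consulting session above.
x, n = 52, 420
p_hat = x / n                                # sample proportion, about 0.1238
se = math.sqrt(p_hat * (1 - p_hat) / n)      # estimated standard error of p_hat
lower, upper = p_hat - 1.96 * se, p_hat + 1.96 * se
print(round(lower, 4), round(upper, 4))      # approximately 0.0923 and 0.1553
```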

So the requested 95% CI is 9.2%-15.5%. Happy?

**[Client]:** Well, yes, that looks nice, but I wonder
how I can interpret the confidence interval. Is it correct to say that
the interval denotes a region, which contains the true value with 95%
probability?

**[Statistician]:** A confidence interval constructing
procedure yields a random interval, because it depends on quantities (in
particular \(x\)) which are random.
However, once you plug in the realization of the random variable
(i.e. your observation), you get a realization of the confidence
interval consisting of two numbers. If we knew the true value, we
could tell whether the value is covered by this particular interval or
not. The specific interval we calculated above thus contains the true
value with probability 0% or 100% - which of the two is the case we
unfortunately do not know.

The correct interpretation is thus that the confidence interval is constructed by a procedure which, were you to repeat the experiment many, many times, would in 95% of the experiments yield an interval covering the true value. I can illustrate this using a small simulation study in R. Let’s assume your true proportion is 10%.

```
library(dplyr)
library(magrittr)

##Assumed true value of the proportion of turtles with the trait
theta_true <- 0.1
##Prepare data.frame with 10000 realizations of the experiment
df <- data.frame(x=rbinom(n=1e4, size=n, prob=theta_true))
##Compute 95% CI for each experiment
df %<>% do({
  as.data.frame(cbind(., rate=.$x/n,
                      binom::binom.asymp(x=.$x, n=n)[c("lower","upper")]))
})
##Check if the interval covers the true value
df %<>% mutate(covers_true_value = lower < theta_true & upper > theta_true)
##Proportion of intervals covering the true value. This should be about 95%
df %>% summarise(coverage = mean(covers_true_value))
```

```
## coverage
## 1 0.9446
```

A graphic of the first 50 experiments shows their point estimates, the corresponding intervals, and whether or not each interval covers the true value:

The specific confidence interval `ci` we computed above is thus just one of many possible confidence intervals originating from the above procedure.

**[Client]:** OK, I get that this is what happens when you do
it many times. But I only have this one experiment with x = 52 out of n
= 420 subjects having the trait of interest. Your above output does
contain a very specific CI with some very specific numbers, i.e. 9.2%-15.5%.
Is the true value in this interval with 95% probability?

**[Statistician]:** No, the realized interval either
covers the true value or it does not. But since we do not know the true value,
it’s a bit like Schrödinger’s cat… To keep things very sketchy: the
width of the interval gives you an indication of your estimation
certainty, but the particular values are hard to interpret - except
maybe as the critical limits in a corresponding hypothesis test. The
latter can loosely be interpreted as the range of parameters consistent
with the data, where *consistency* is defined by a pre-specified
type-I error probability.
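This test-inversion view can be made concrete with a small sketch (in Python for illustration; the grid search is a didactic device, not part of the analysis above): a candidate value \(\theta_0\) lies inside the 95% large-sample interval exactly when the two-sided Wald z-test of \(H_0: \theta = \theta_0\) is not rejected at the 5% level.

```python
import math

# Duality between the large-sample (Wald) CI and the Wald test: collect all
# theta0 that a two-sided 5%-level Wald z-test would NOT reject, and compare
# the resulting range with the interval computed above.
x, n = 52, 420
p_hat = x / n
se = math.sqrt(p_hat * (1 - p_hat) / n)   # SE evaluated at p_hat, as in the Wald test

def not_rejected(theta0, z_crit=1.96):
    """True if H0: theta = theta0 survives the two-sided Wald test."""
    return abs(p_hat - theta0) <= z_crit * se

# Scan a fine grid of candidate parameter values in [0, 1]
consistent = [i / 10000 for i in range(10001) if not_rejected(i / 10000)]
print(min(consistent), max(consistent))   # close to the CI endpoints above
```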

**[Client]:** I’m sorry, but this is
**useless**! You suggest following a common statistical
protocol, you compute some numbers, but these numbers don’t really mean
anything?

**[Statistician]:** Sorry, that’s the way it is.
However, allow me to offer a different explanation: We both appear to
agree that there is a true value of the proportion, right? Let’s denote
this \(\theta\) with \(0 \leq \theta \leq 1\). We don’t know the
true value, but we might have some prior idea about the range of
plausible values for it. Would you be willing to characterize your
belief about this by a distribution for \(\theta\)? This could be something as simple
as assuming \(\theta \sim U(0,1)\),
i.e. you would assume that every value of \(\theta\) between zero and one is initially equally
probable. Then the interval we computed above denotes a 95%
equal tail probability **credibility region** in a Bayesian
framework, when we assume this flat prior and use an asymptotic argument
resulting in the posterior being Gaussian with the same mean and
variance as one would get from the asymptotic frequentist
procedure.
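Spelled out, the asymptotic argument is just a sketch: under the flat prior, the posterior is proportional to the binomial likelihood,

\[
p(\theta \mid x) \propto \theta^{x} (1-\theta)^{n-x}, \quad 0 \leq \theta \leq 1,
\]

which for large \(n\) is approximately a Gaussian density with mean \(\hat{p} = x/n\) and variance \(\hat{p}(1-\hat{p})/n\), i.e. exactly the mean and variance used in the frequentist interval above.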

**[Client]:** Umm, ok, but how’s that helpful?

**[Statistician]:** In the Bayesian context a 95%
credibility region is a summary of your posterior belief about the
parameter. Hence, it is OK to interpret this interval as meaning that your
belief after seeing the data is such that *the true value is in that
interval with 95% probability*.

**[Client]:** I love this Bayes thing! This is what I
wanted initially. But hang on, I don’t really believe that all values of
the parameter are initially equally probable. I’ve seen some previous
studies in the literature which, under pretty similar conditions, find that the
proportion is around 5-15%. How would I incorporate that?

**[Statistician]:** You could modify the large-sample
Gaussian posterior such that it includes this prior information. Alternatively,
instead of using Gaussians you could use a beta
distribution to perform so-called **conjugate prior-posterior
updating** of your belief about the true proportion. Let’s say
the 5-15% denotes your prior 95% credibility region for the parameter.
We would use this to find the parameters of a beta distribution. Then we
would update your prior belief with the observed data to get the
posterior distribution. A feature of such a conjugate approach is that
this updated distribution, i.e. the posterior, is again beta. Doing this
in R is easy:

```
##Function to determine beta parameters s.t. the 2.5% and 97.5% quantiles match the specified values
target <- function(theta, prior_interval, alpha=0.05) {
  sum( (qbeta(c(alpha/2, 1-alpha/2), theta[1], theta[2]) - prior_interval)^2 )
}
##Find the prior parameters
prior_params <- optim(c(10,10), target, prior_interval=c(0.05, 0.15))$par
prior_params
```

```
## [1]  12.04737 116.06022
```

Actually, you can interpret the above prior parameters as the number of turtles you have seen with the trait (12) and without the trait (116), respectively, before conducting your above investigation. The posterior parameters are then 12 + 52 = 64 turtles with the trait and 116 + 368 = 484 turtles without the trait. You get the credible interval based on this posterior in R by:

```
##Compute credible interval from a beta-binomial conjugate prior-posterior approach
(ci_bayes <- binom::binom.bayes(x=x, n=n, type="central",
                                prior.shape1=prior_params[1], prior.shape2=prior_params[2]))
```

```
## method x n shape1 shape2 mean lower upper sig
## 1 bayes 52 420 64.04737 484.0602 0.1168518 0.09134069 0.1450096 0.05
```
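As a sanity check of the conjugate update (a Python sketch for illustration; it simply samples from the Beta(12.05 + 52, 116.06 + 368) posterior stated above, rather than redoing the R computation):

```python
import random

# Monte Carlo check: draw from the beta posterior and read off the central
# 95% credible interval; it should be close to lower/upper in the R output.
random.seed(1)
a = 12.04737 + 52          # posterior shape1: prior "successes" + observed successes
b = 116.06022 + 420 - 52   # posterior shape2: prior "failures" + observed failures
draws = sorted(random.betavariate(a, b) for _ in range(200_000))
lower = draws[int(0.025 * len(draws))]
upper = draws[int(0.975 * len(draws))]
print(round(lower, 3), round(upper, 3))   # close to 0.091 and 0.145
```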

Graphically, your belief updating looks as follows:

```
library(ggplot2)
library(dplyr)

##Plot of the beta-posterior
p <- binom::binom.bayes.densityplot(ci_bayes)
##Add plot of the beta-prior
df <- data.frame(x=seq(0,1,length=1000)) %>% mutate(pdf=dbeta(x, prior_params[1], prior_params[2]))
p + geom_line(data=df, aes(x=x, y=pdf), col="darkgray", lty=2) +
  coord_cartesian(xlim=c(0,0.25)) + scale_x_continuous(labels=scales::percent)
```

The dashed line in the graphic shows the density of your beta prior distribution. The shaded region shows the density of your beta posterior distribution, in particular, the gray shaded area denotes your 95% credibility region on the x-axis. One advantage of the beta-binomial approach is that this ensures that the prior as well as the posterior distribution have the same support as your parameter, i.e. the interval between zero and one.

**[Client]:** So I can really write that the interval
9.1%-14.5% contains the true parameter with a probability of 95%?

**[Statistician]:** Depending on the technical level of
your readership you might want to state that it’s a 95% equi-tailed
credible interval resulting from a beta-binomial conjugate Bayesian
approach, obtained when using a beta prior with parameters such that the
corresponding 95% equi-tailed **prior credible interval** has
limits 0.05 and 0.15. Given these assumptions, the interval 9.1%-14.5%
contains 95% of your subjective posterior density for the parameter.

**[Client]:** ???

**[Statistician]:** You might want to skip all the
details and write that the 95% **Bayesian confidence
interval** is 9.1%-14.5%.

**[Client]:** And what if the reviewers do not agree
with my prior distribution?

**[Statistician]:** Then return to your flat prior,
i.e. the uniform. Nobody usually questions that one, even though you told me
that you don’t believe in it yourself:

```
(ci_bayes_unif <- binom::binom.bayes(x=x, n=n, type="central", prior.shape1=1, prior.shape2=1))
```

```
## method x n shape1 shape2 mean lower upper sig
## 1 bayes 52 420 53 369 0.1255924 0.09574062 0.158803 0.05
```

**[Client]:** Alright, that’s settled then - I just use
a Bayesian confidence interval with a flat prior and am done. That was
not so hard after all. Now to something else, the reviewers of the
paper, which by the way has to be revised by tomorrow, also asked us to
substantiate our findings with a **p-value**. How would I
get a p-value related to the above?

**[Statistician]:** Look how late it already is! I
really need to go. I have this Bayesian Methods class to teach…