In our class we have \(n=11\) students. Therefore, if we let \(y\) denote the number of students aged 22 or above, we know that \(y \in \{0,1,\ldots,n\}\). Under the additional independent and identically distributed (iid) assumption, we have \[ y|\theta \sim Binomial(n,\theta), \] where \(\theta=Pr(\mbox{age} \geq 22)\) is our parameter of interest. The likelihood is \[ L(\theta;\mbox{data}) \propto \theta^y(1-\theta)^{n-y}, \qquad\qquad \theta \in (0,1). \]
As you probably learned in your introductory statistics course, the MLE of \(\theta\) is \[ {\widehat \theta}_{mle}=\frac{y}{n}. \]
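As a quick numerical sanity check, we can maximize the binomial log-likelihood directly and compare with the closed-form answer (the outcome \(y=3\) below is an illustrative assumption, not class data):

```r
# Sketch: numerically maximize the binomial log-likelihood and compare
# with the closed-form MLE y/n. The outcome y = 3 is an assumption.
n <- 11
y <- 3
loglik <- function(theta) dbinom(y, n, theta, log = TRUE)
opt <- optimize(loglik, interval = c(0.001, 0.999), maximum = TRUE)
c(numerical = opt$maximum, closed.form = y/n)
```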
The approximate 95% confidence interval for \(\theta\) is given by \[ {\widehat \theta}_{mle} \pm 1.96 \sqrt{\frac{{\widehat \theta}_{mle}(1-{\widehat \theta}_{mle})}{n}}. \] As the figure below shows, these confidence intervals can collapse to a single point or even include negative probabilities, both artifacts of the normal approximation.
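Both failure modes can be seen directly before plotting (a minimal sketch, with \(n=11\) as above):

```r
# Sketch of the two failure modes of the Wald interval when n = 11:
# at y = 0 it collapses to the single point 0, and at y = 1 its
# lower limit is negative.
n <- 11
wald <- function(y) y/n + c(-1, 1) * 1.96 * sqrt((y/n) * (1 - y/n)/n)
wald(0)  # both endpoints equal 0
wald(1)  # lower endpoint is negative
```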
n = 11
y = 0:n
theta.mle = y/n
se = sqrt(theta.mle*(1-theta.mle)/n)
L.mle = theta.mle - 1.96*se
U.mle = theta.mle + 1.96*se
plot(y,L.mle,ylim=c(-0.1,1.1),pch=16,ylab="95% approximate confidence interval")
points(y,U.mle,pch=16)
abline(h=0,lty=3)
abline(h=1,lty=3)
for (i in 2:n)
segments(y[i],L.mle[i],y[i],U.mle[i])

Suppose we want to use the information that about 60% of undergraduates are between 19 and 24 years old, and decide to find a beta prior for \(\theta\) such that \[ \mbox{Mode}(\theta) = 0.10 \ \ \ \mbox{and} \ \ \ Pr(\theta>0.1) = 0.75, \] approximately. A possible choice is a \(Beta(1.12,2.12)\) prior, which approximately satisfies the prior information above.
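One way to see how well \(Beta(1.12,2.12)\) matches the two conditions is to compute its mode, \((\alpha-1)/(\alpha+\beta-2)\), and its upper-tail probability at 0.1 directly (a quick sketch):

```r
# Check the Beta(1.12, 2.12) prior against the elicited information:
# mode (a - 1)/(a + b - 2) and the tail probability Pr(theta > 0.1).
a <- 1.12
b <- 2.12
prior.mode <- (a - 1)/(a + b - 2)
prior.tail <- 1 - pbeta(0.1, a, b)
round(c(mode = prior.mode, tail = prior.tail), 3)
```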
alpha = 1.12
beta = 2.12
theta = seq(0,1,length=1000)
plot(theta,dbeta(theta,alpha,beta),type="l",ylab="Density",xlab=expression(theta))
title("Prior distribution")

The Bayesian approach combines the prior \(\theta \sim Beta(\alpha,\beta)\) with the binomial likelihood, which is proportional to \(\theta^y(1-\theta)^{n-y}\). The posterior is, therefore, \(Beta(\alpha_1,\beta_1)\), where \[\begin{eqnarray*} \alpha_1 &=& \alpha + y\\ \beta_1 &=& \beta + n - y. \end{eqnarray*}\] The posterior mode equals \[ \mbox{Mode}(\theta|\mbox{data}) = \frac{y+\alpha-1}{n+\alpha+\beta-2}. \]
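For concreteness, here is the conjugate update worked out numerically (taking \(y=2\) purely as an example):

```r
# Worked illustration of the conjugate update for an assumed outcome
# y = 2: the posterior is Beta(1.12 + 2, 2.12 + 11 - 2).
alpha <- 1.12; beta <- 2.12
n <- 11; y <- 2
alpha1 <- alpha + y
beta1  <- beta + n - y
post.mode <- (y + alpha - 1)/(n + alpha + beta - 2)
round(c(alpha1 = alpha1, beta1 = beta1, post.mode = post.mode), 3)
```

Note that the posterior mode is just the mode of a \(Beta(\alpha_1,\beta_1)\) density, \((\alpha_1-1)/(\alpha_1+\beta_1-2)\), since \(\alpha_1+\beta_1 = \alpha+\beta+n\).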
alpha1 = alpha+y
beta1 = beta+n-y
theta.bayes = (y+alpha-1)/(n+alpha+beta-2)
L.bayes = qbeta(0.025,alpha1,beta1)
U.bayes = qbeta(0.975,alpha1,beta1)
plot(theta,dbeta(theta,alpha,beta),type="l",ylab="Density",
xlab=expression(theta),ylim=c(0,10),lwd=2)
for (i in 1:(n+1))
lines(theta,dbeta(theta,alpha+y[i],beta+n-y[i]),col=2)
legend("topright",legend=c("Prior","Posteriors"),col=1:2,bty="n",lty=1,lwd=2)

plot(y,L.bayes,ylim=c(-0.1,1.2),pch=16,ylab="95% credibility interval")
points(y,U.bayes,pch=16)
abline(h=0,lty=3)
abline(h=1,lty=3)
for (i in 1:(n+1))
segments(y[i],L.bayes[i],y[i],U.bayes[i])
abline(h=qbeta(0.025,alpha,beta),col=2)
abline(h=qbeta(0.975,alpha,beta),col=2)
legend("topleft",legend=c("Prior","Posterior"),col=2:1,bty="n",lty=1,lwd=2)

plot(y,L.bayes,ylim=c(-0.1,1.2),pch=16,ylab="95% intervals")
points(y,U.bayes,pch=16)
abline(h=0,lty=3)
abline(h=1,lty=3)
for (i in 1:(n+1))
segments(y[i],L.bayes[i],y[i],U.bayes[i])
points(y+0.1,L.mle,pch=16,col=2)
points(y+0.1,U.mle,pch=16,col=2)
for (i in 2:n)
segments(y[i]+0.1,L.mle[i],y[i]+0.1,U.mle[i],col=2)
legend("topleft",legend=c("95% confidence interval","95% credibility interval"),col=1:2,bty="n",lty=1,lwd=2)
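Unlike the Wald interval, the credibility interval never degenerates, and because the posterior is available in closed form the `qbeta`-based limits can be verified by simple Monte Carlo (a sketch; \(y=3\) is an arbitrary illustrative choice):

```r
# Sketch: Monte Carlo check that the qbeta-based 95% credibility interval
# matches the empirical quantiles of posterior draws. y = 3 is assumed.
set.seed(1)
alpha <- 1.12; beta <- 2.12
n <- 11; y <- 3
draws <- rbeta(100000, alpha + y, beta + n - y)
mc    <- quantile(draws, c(0.025, 0.975))
exact <- qbeta(c(0.025, 0.975), alpha + y, beta + n - y)
round(rbind(mc = mc, exact = exact), 3)
```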