Model: \(x|\theta \sim N(\theta,\sigma^2)\)

sigma = 40

Prior 1: \(\theta \sim N(\theta_0,\tau_0^2)\)

theta0 = 800
tau0   = 80

Mgrid      = 200
thetas     = seq(400,1200,length=Mgrid)
den.prior1 = dnorm(thetas,theta0,tau0)

plot(thetas,den.prior1,xlab="x or theta",ylab="Density",type="l",ylim=c(0,0.015),lwd=3)
title(paste("Model: x|theta ~ N(theta,",round(sigma^2,3),")\n Prior: theta ~ N(",theta0,",",round(tau0^2,3),")",sep=""))

Predictive: \(x \sim N(\theta_0,\tau_0^2+\sigma^2)\)

We are lucky! The prior and the likelihood are both Gaussian and conjugate, i.e. \[ p(x) = \int_{-\infty}^{\infty} p_N(x|\theta,\sigma^2)\,p_N(\theta|\theta_0,\tau_0^2)\,d\theta = p_N(x|\theta_0,\tau_0^2+\sigma^2), \] so the predictive is also Gaussian, \(x \sim N(\theta_0,\tau_0^2+\sigma^2)\), which in this particular example looks like the prior with the variance inflated by the quantity \(\sigma^2\).
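As a sanity check on the conjugacy claim, the integral can be evaluated numerically and compared with the closed form; this is a Python/scipy cross-check, not part of the R code in this note, and the evaluation point is arbitrary:

```python
# Cross-check: integrate N(x|theta,sigma^2)*N(theta|theta0,tau0^2) over theta
# and compare with the closed-form predictive N(x|theta0, tau0^2+sigma^2).
import math
from scipy.stats import norm
from scipy.integrate import quad

sigma, theta0, tau0 = 40.0, 800.0, 80.0
x = 900.0  # arbitrary evaluation point

pred_quad, _ = quad(lambda th: norm.pdf(x, th, sigma) * norm.pdf(th, theta0, tau0),
                    400.0, 1400.0)  # integrand is negligible outside this range
pred_closed = norm.pdf(x, theta0, math.sqrt(tau0**2 + sigma**2))
```

The two values agree to numerical-integration accuracy, confirming the variance-inflation form of the predictive.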

mean  = theta0
stdev = sqrt(tau0^2+sigma^2)
xs     = seq(400,1200,length=Mgrid)
den.pred1 = dnorm(xs,mean,stdev)

plot(thetas,den.prior1,xlab="x or theta",ylab="Density",type="l",ylim=c(0,0.015),lwd=3)
title(paste("Model: x|theta ~ N(theta,",round(sigma^2,3),")\n Prior: theta ~ N(",theta0,",",round(tau0^2,3),")",sep=""))
lines(xs,den.pred1,col=2,lwd=3)

Observed measurement

Once the data are observed, say \(x_{obs}\), we can compute quantities such as \[ p(x_{obs}) = p_N(x_{obs}|\theta_0,\tau_0^2+\sigma^2) \] or \[ Pr(x>x_{obs}) = \int_{x_{obs}}^{\infty} p_N(x|\theta_0,\tau_0^2+\sigma^2)\,dx. \]

xobs = 975

p1x = dnorm(xobs,mean,stdev)
p1x
## [1] 0.0006578067
tail1x = 1-pnorm(xobs,mean,stdev)
tail1x
## [1] 0.02519964
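The same two numbers can be reproduced outside R; a small Python/scipy cross-check (not part of the original code):

```python
# Reproduce p(x_obs) and Pr(x > x_obs) under the Gaussian prior.
import math
from scipy.stats import norm

sigma, theta0, tau0, xobs = 40.0, 800.0, 80.0, 975.0
s = math.sqrt(tau0**2 + sigma**2)   # predictive standard deviation
p1x = norm.pdf(xobs, theta0, s)     # predictive density at the observed value
tail1x = norm.sf(xobs, theta0, s)   # survival function = 1 - cdf, numerically stable
```

Both match the R output above to printed precision.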

Posterior

w     = sigma^2/(sigma^2+tau0^2)  # shrinkage weight toward the prior (avoids masking R's built-in pi)
stdev = sqrt(w)*tau0
mean  = w*theta0+(1-w)*xobs
den.post1 = dnorm(xs,mean,stdev)
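The conjugate update in the chunk above can be verified in a few lines; a Python cross-check (illustrative, with names mirroring the R code):

```python
# Conjugate Gaussian posterior: shrinkage form of mean and standard deviation.
import math

sigma, theta0, tau0, xobs = 40.0, 800.0, 80.0, 975.0
w = sigma**2 / (sigma**2 + tau0**2)        # shrinkage weight toward the prior (0.2 here)
post_mean = w * theta0 + (1 - w) * xobs    # weighted average of prior mean and data
post_sd = math.sqrt(w) * tau0              # posterior standard deviation
```

The posterior precision equals the sum of the data and prior precisions, i.e. \(1/\mathit{post\_sd}^2 = 1/\sigma^2 + 1/\tau_0^2\), which is the familiar precision-additivity form of the same update.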

plot(thetas,den.prior1,xlab="x or theta",ylab="Density",type="l",ylim=c(0,0.015),lwd=3)
title(paste("Model: x|theta ~ N(theta,",round(sigma^2,3),")\n Prior: theta ~ N(",theta0,",",round(tau0^2,3),")",sep=""))
points(xobs,0,pch=16,col=4,cex=2)
lines(xs,den.pred1,col=2,lwd=3)
lines(thetas,den.post1,col=6,lwd=2)

legend("topright",legend=c("p(theta)","p(x)",paste("p(theta|x=",xobs,")",sep="")),
col=c(1,2,6),lwd=3,bty="n")

legend("topleft",legend=c(
"Prior and likelihood conjugate, so all",
"derivations are done in closed form.",
"In general, approximate derivations are needed"))

legend("left",legend=c(
paste("p(x=",xobs,")=",round(p1x,6),sep=""),
paste("Pr(x>",xobs,")=",round(tail1x,6),sep="")))

Prior 2: \(\theta \sim t_{\nu_0}(\theta_0,\tau_0^2)\)

This prior has the same mean as the previous one and, once \(\tau_0\) is rescaled in the code below, the same variance, since for \(\theta \sim t_{\nu_0}(\theta_0,\tau_0^2)\) \[\begin{eqnarray*} E(\theta) &=& \theta_0, \qquad \nu_0>1,\\ V(\theta) &=& \frac{\nu_0}{\nu_0-2}\,\tau_0^2, \qquad \nu_0>2. \end{eqnarray*}\]

nu0    = 4
tau0   = tau0*sqrt((nu0-2)/nu0)
den.prior2 = dt((thetas-theta0)/tau0,df=nu0)/tau0
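The rescaling of `tau0` above can be checked directly; a Python/scipy cross-check (not part of the original code) that the rescaled \(t\) prior indeed has variance \(80^2\):

```python
# Verify the variance calibration of the rescaled Student's t prior.
import math
from scipy.stats import t as student_t

nu0 = 4
tau0_orig = 80.0
scale = tau0_orig * math.sqrt((nu0 - 2) / nu0)  # same rescaling as the R line above
# Var of t_{nu0}(theta0, scale^2) = nu0/(nu0-2) * scale^2
prior2_var = student_t.var(nu0) * scale**2
```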

plot(thetas,den.prior2,xlab="x or theta",ylab="Density",type="l",
     ylim=c(0,0.015),lwd=3)
title(paste("Model: x|theta ~ N(theta,",round(sigma^2,3),")\n Prior: theta ~ t(",theta0,",",round(tau0^2,3),",",nu0,")",sep=""))

Predictive density (Monte Carlo integration)

The above prior seems more realistic since it has heavier tails. However, this flexibility comes at a considerable price: there is no closed-form solution to \[ p(x) = \int p_N(x|\theta,\sigma^2)p_t(\theta|\theta_0,\tau_0^2,\nu_0)d\theta \ \ \ \mbox{or} \ \ \ p(\theta|x) = \frac{p(x|\theta)p(\theta)}{p(x)}. \] First, we will approximate \(p(x)\) by \[ {\widehat p}(x) = \frac{1}{M} \sum_{i=1}^M p_N(x|\theta^{(i)},\sigma^2), \qquad x \in R, \] where \(\{\theta^{(1)},\ldots,\theta^{(M)}\}\) are i.i.d. draws from \(t_{\nu_0}(\theta_0,\tau_0^2)\).

set.seed(12345)
Mdraw      = 10000
theta.draw = theta0 + tau0*rt(Mdraw,df=nu0)
den.pred2 = rep(0,Mgrid)
for (i in 1:Mgrid)
  den.pred2[i] = mean(dnorm(xs[i],theta.draw,sigma))
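The MC estimate can be compared against brute-force numerical integration of the same integrand; a Python/scipy sketch of that comparison (not part of the original R code; the seed and sample size here are illustrative):

```python
# Compare the Monte Carlo predictive estimate at xobs with direct
# numerical integration of N(xobs|theta,sigma^2) * t-prior density.
import numpy as np
from scipy.stats import norm, t as student_t
from scipy.integrate import quad

rng = np.random.default_rng(12345)
sigma, theta0, nu0, xobs = 40.0, 800.0, 4, 975.0
tau0 = 80.0 * np.sqrt((nu0 - 2) / nu0)   # rescaled scale, as in the R code

# Monte Carlo: average N(xobs | theta_i, sigma^2) over t-prior draws
M = 100_000
theta_draw = theta0 + tau0 * rng.standard_t(nu0, size=M)
p_mc = norm.pdf(xobs, theta_draw, sigma).mean()

# Deterministic: integrate the same product numerically
p_quad, _ = quad(lambda th: norm.pdf(xobs, th, sigma)
                 * student_t.pdf((th - theta0) / tau0, nu0) / tau0,
                 400.0, 1600.0)
```

With \(M=100{,}000\) the two agree to within Monte Carlo error (a couple of percent).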


plot(thetas,den.prior2,xlab="x or theta",ylab="Density",type="l",
     ylim=c(0,0.015),lwd=3)
title(paste("Model: x|theta ~ N(theta,",round(sigma^2,3),")\n Prior: theta ~ t(",theta0,",",round(tau0^2,3),",",nu0,")",sep=""))
lines(xs,den.pred2,col=2,lwd=3)
points(xobs,0,pch=16,col=4,cex=2)

Computing \(p(x=x_{obs})\) and \(Pr(x>x_{obs})\) via MC integration

From the above MC approximation \[ {\widehat p}(x_{obs}) = \frac{1}{M} \sum_{i=1}^M p_N(x_{obs}|\theta^{(i)},\sigma^2). \]

p2x = mean(dnorm(xobs,theta.draw,sigma))
p2x
## [1] 0.0004648978

Similarly, \[ Pr(x>x_{obs}) = \int_{x_{obs}}^{\infty} p(x)\,dx \] is approximated (via MC integration) by \[ {\widehat{Pr}}(x>x_{obs}) = \frac{1}{M} \sum_{i=1}^M I(x^{(i)}>x_{obs}), \] where, for each \(\theta^{(i)}\), \(x^{(i)}\) is a draw from \(N(\theta^{(i)},\sigma^2)\).

x.draw = rnorm(Mdraw,theta.draw,sigma)
plot(x.draw,theta.draw,xlab="x",ylab=expression(theta))
abline(v=xobs,col=2,lwd=2)
abline(h=xobs,col=2,lwd=2)

tail2x = mean(x.draw>xobs)
tail2x
## [1] 0.0249
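A Python version of the same indicator-based estimate (an illustrative cross-check with its own seed and sample size, so the value will differ slightly from the R output above):

```python
# Tail probability Pr(x > xobs) by simulating from the joint (theta, x).
import numpy as np

rng = np.random.default_rng(12345)
nu0, sigma, theta0, xobs = 4, 40.0, 800.0, 975.0
tau0 = 80.0 * np.sqrt((nu0 - 2) / nu0)

M = 200_000
theta_draw = theta0 + tau0 * rng.standard_t(nu0, size=M)
x_draw = rng.normal(theta_draw, sigma)   # one predictive draw per theta draw
tail_mc = np.mean(x_draw > xobs)         # fraction of predictive draws above xobs
```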

Posterior

Since we have already approximated \(p(x_{obs})\) by \({\hat p}(x_{obs})\) via Monte Carlo integration, we can evaluate the posterior for all values of \(-\infty < \theta < \infty\): \[ p(\theta|x_{obs}) \approx \frac{p_N(x_{obs}|\theta,\sigma^2)\,p_t(\theta|\theta_0,\tau_0^2,\nu_0)}{{\hat p}(x_{obs})}. \]

plot(thetas,den.prior2,xlab="x or theta",ylab="Density",type="l",
     ylim=c(0,0.015),lwd=3)
title(paste("Model: x|theta ~ N(theta,",round(sigma^2,3),")\n Prior: theta ~ t(",theta0,",",round(tau0^2,3),",",nu0,")",sep=""))
lines(xs,den.pred2,col=2,lwd=3)
points(xobs,0,pch=16,col=4,cex=2)
lines(thetas,den.prior2*dnorm(xobs,thetas,sigma)/p2x,col=6,lwd=2)
legend("topright",legend=c("p(theta)","p(x)",paste("p(theta|x=",xobs,")",sep="")),
col=c(1,2,6),lwd=3,bty="n")

legend("left",legend=c(
paste("p(x=",xobs,")=",round(p2x,6),sep=""),
paste("Pr(x>",xobs,")=",round(tail2x,6),sep="")))
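A grid-based check of this posterior (a Python sketch on an assumed fine grid, not in the original code): the trapezoid-rule normalizing constant of the unnormalized posterior should match \(p(x_{obs})\), and the normalized density should integrate to one.

```python
# Normalize p_N(xobs|theta,sigma^2) * p_t(theta|...) on a dense grid.
import numpy as np
from scipy.stats import norm, t as student_t

sigma, theta0, nu0, xobs = 40.0, 800.0, 4, 975.0
tau0 = 80.0 * np.sqrt((nu0 - 2) / nu0)

th = np.linspace(400.0, 1400.0, 20001)
unnorm = norm.pdf(xobs, th, sigma) * student_t.pdf((th - theta0) / tau0, nu0) / tau0
dx = th[1] - th[0]
Z = np.sum((unnorm[1:] + unnorm[:-1]) / 2.0) * dx   # trapezoid rule: Z ~ p(xobs)
post = unnorm / Z                                    # normalized posterior on the grid
total = np.sum((post[1:] + post[:-1]) / 2.0) * dx    # should be ~ 1
```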

Comparison

One way to compare competing model/prior setups is via the Bayes factor: \[ BF = \frac{{\hat p}(x_{obs}|\sigma^2,\theta_0,\tau_0^2)}{{\hat p}(x_{obs}|\sigma^2,\theta_0,\tau_0^2,\nu_0)}. \] The Bayes factor can be seen as a generalized likelihood ratio. The higher the Bayes factor, the stronger the evidence provided by the data (\(x_{obs}\) here!) for the model in the numerator (here, the Gaussian model with the Gaussian prior).

BF = p1x/p2x
BF
## [1] 1.414949
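Since the denominator above is a Monte Carlo estimate, the Bayes factor itself is noisy; a deterministic Python cross-check (using numerical integration in place of MC, as an illustration) recovers essentially the same value:

```python
# Bayes factor: exact Gaussian-prior predictive over quad-integrated t-prior predictive.
import math
from scipy.stats import norm, t as student_t
from scipy.integrate import quad

sigma, theta0, nu0, xobs = 40.0, 800.0, 4, 975.0
tau0_g = 80.0                                # Gaussian prior scale
tau0_t = 80.0 * math.sqrt((nu0 - 2) / nu0)   # rescaled t prior scale

p1 = norm.pdf(xobs, theta0, math.sqrt(tau0_g**2 + sigma**2))  # closed form
p2, _ = quad(lambda th: norm.pdf(xobs, th, sigma)
             * student_t.pdf((th - theta0) / tau0_t, nu0) / tau0_t,
             400.0, 1600.0)                                   # numerical
BF = p1 / p2
```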
plot(thetas,den.prior1,xlab="x or theta",ylab="Density",type="l",ylim=c(0,0.015),lwd=2)
lines(thetas,den.prior2,lwd=2,lty=2)
lines(xs,den.pred1,col=2,lwd=2)
lines(xs,den.pred2,col=2,lwd=2,lty=2)
lines(thetas,den.post1,col=6,lwd=2)
lines(thetas,den.prior2*dnorm(xobs,thetas,sigma)/p2x,col=6,lwd=2,lty=2)
legend("topright",legend=c("Gaussian prior","p(theta)","p(x)",paste("p(theta|x=",xobs,")",sep="")),
       col=c(0,1,2,6),lwd=3,bty="n")
legend("topleft",legend=c("Student's t prior","p(theta)","p(x)",
                          paste("p(theta|x=",xobs,")",sep="")),
       col=c(0,1,2,6),lwd=3,lty=2,bty="n")
points(xobs,0,pch=16,col=4,cex=2)
legend("left",legend=c(
paste("p(x=",xobs,")=",round(p2x,6),sep=""),
paste("Pr(x>",xobs,")=",round(tail2x,6),sep="")))
legend("right",legend=c(
paste("p(x=",xobs,")=",round(p1x,6),sep=""),
paste("Pr(x>",xobs,")=",round(tail1x,6),sep="")))


title(paste("Bayes factor=p(x|Gaussian)/p(x|Student's t)=",round(BF,3),sep=""))