The linkage data -
Section 2 of Tanner and Wong (1987)
Martin A. Tanner and Wing Hung Wong (1987) The Calculation of
Posterior Distributions by Data Augmentation, Journal of the
American Statistical Association, Vol. 82, No. 398, pages 528-540.
URL: https://www.jstor.org/stable/2289457
From a genetic linkage model, it is believed that 197 animals are
distributed multinomially into four categories, ie. \[
y = (y_1,y_2,y_3,y_4)=(125,18,20,34),
\] with cell probabilities specified by \[
\left(\frac{1}{2}+\frac{\theta}{4},\frac{1-\theta}{4},\frac{1-\theta}{4},\frac{\theta}{4}\right).
\] The likelihood is of the form \[
p(y|\theta) \propto (2+\theta)^{y_1}(1-\theta)^{y_2+y_3}\theta^{y_4},
\] To illustrate the data augmentation algorithm, \(y\) is augmented by splitting the first
cell into two cells, once of which having cell probability \(1/2\), the other having cell probability
\(\theta/4\). Then the augmented data
set is given by \[
x=(x_1,x_2,x_3,x_4,x_5),
\] where \(x_1+x_2=125\), \(x_3=y_2\), \(x_4=y_3\), and \(x_5=y_4\). The augmented likelihood is of
the form \[
p(x|\theta) \propto \theta^{x_2+x_5}(1-\theta)^{x_3+x_4}.
\] When the prior of \(\theta\)
is \(U(0,1)\), it follows that \(\theta|x \sim Beta(x_2+x_5+1,x_3+x_4+1)\).
Also, the distribution of \(x_2|\theta \sim
Binomial(y_1,\theta/(\theta+2))\).
Data augmentation
algorithm
For given initial value, \(\theta^{(0)}\), sample iteratively through
the following distributions, for \(j=1,2,\ldots,M\), for large \(M\):
Sample \(x_2^{(j)}\) from \(Bin(y_1,\theta^{(j-1)}/(2+\theta^{(j-1)}))\)
Sample \(\theta^{(j)}\) from
\(Beta(x_2^{(j)}+y_4+1,y_2+y_3+1)\)
The set of draws \(\{(x_2,\theta)^{(j)}\}_{j=1}^M\)
approximates the augmented posterior \(p(x_2,\theta|y)\). In addition, \(\{\theta^{(j)}\}_{j=1}^M\) approximates
\(p(\theta|y)\)
Implementation
Let \(\theta^{(0)}=1/2\), or any
other values in \((0,1)\),
y = c(125,18,20,34)
theta = 0.5
M = 10000
draws = matrix(0,M,2)
for (iter in 1:M){
x2 = rbinom(1,y[1],theta/(2+theta))
theta = rbeta(1,x2+y[4]+1,y[2]+y[3]+1)
draws[iter,] = c(theta,x2)
}
theta.d = draws[,1]
x2.d = draws[,2]
par(mfrow=c(1,3))
ts.plot(theta.d,xlab="iterations",ylab="",main=expression(theta))
acf(theta.d,main="")
hist(theta.d,prob=TRUE,xlab=expression(theta),main="")

quantile(theta.d,c(0.025,0.5,0.975))
## 2.5% 50% 97.5%
## 0.5204466 0.6239384 0.7182968
Posterior
summaries
I1 = mean(theta.d)
I2 = mean((x2.d+y[4]+1)/(x2.d+y[2]+y[3]+y[4]+2))
c(I1,I2)
## [1] 0.6229764 0.6226399
N = 1000
thetas = seq(0,1,length=N)
post = rep(0,N)
for (i in 1:N)
post[i] = mean(dbeta(thetas[i],x2.d+y[4]+1,y[2]+y[3]+1))
hist(theta.d,prob=TRUE,xlab=expression(theta),main="")
lines(density(theta.d),col=1,lwd=2)
lines(thetas,post,col=2,lwd=2)
legend("topleft",legend=c("Kernel density approx.","Raoblackwell approx."),col=1:2,lwd=2,bty="n")

LS0tCnRpdGxlOiAiRGF0YSBhdWdtZW50YXRpb24iCnN1YnRpdGxlOiAiVGhlIGxpbmthZ2UgZGF0YSBleGFtcGxlIgphdXRob3I6ICJIZWRpYmVydCBGcmVpdGFzIExvcGVzIgpkYXRlOiAiMjkvMTAvMjAyNSIKb3V0cHV0OgogIGh0bWxfZG9jdW1lbnQ6CiAgICB0b2M6IHRydWUKICAgIHRvY19kZXB0aDogMwogICAgdG9jX2NvbGxhcHNlZDogdHJ1ZQogICAgY29kZV9kb3dubG9hZDogdHJ1ZQogICAgbnVtYmVyX3NlY3Rpb25zOiB0cnVlCiAgcGRmX2RvY3VtZW50OgogICAgdG9jOiB0cnVlCiAgICB0b2NfZGVwdGg6ICczJwotLS0KCmBgYHtyIHNldHVwLCBpbmNsdWRlPUZBTFNFfQprbml0cjo6b3B0c19jaHVuayRzZXQoZWNobyA9IFRSVUUpCmBgYAoKCiMgVGhlIGxpbmthZ2UgZGF0YSAtIFNlY3Rpb24gMiBvZiBUYW5uZXIgYW5kIFdvbmcgKDE5ODcpCgpNYXJ0aW4gQS4gVGFubmVyIGFuZCAgV2luZyBIdW5nIFdvbmcgKDE5ODcpIFRoZSBDYWxjdWxhdGlvbiBvZiBQb3N0ZXJpb3IgRGlzdHJpYnV0aW9ucyBieSBEYXRhIEF1Z21lbnRhdGlvbiwgKkpvdXJuYWwgb2YgdGhlIEFtZXJpY2FuIFN0YXRpc3RpY2FsIEFzc29jaWF0aW9uKiwgVm9sLiA4MiwgTm8uIDM5OCwgcGFnZXMgNTI4LTU0MC4KVVJMOiBodHRwczovL3d3dy5qc3Rvci5vcmcvc3RhYmxlLzIyODk0NTcgCgoKRnJvbSBhIGdlbmV0aWMgbGlua2FnZSBtb2RlbCwgaXQgaXMgYmVsaWV2ZWQgdGhhdCAxOTcgYW5pbWFscyBhcmUgZGlzdHJpYnV0ZWQgbXVsdGlub21pYWxseSBpbnRvIGZvdXIgY2F0ZWdvcmllcywgaWUuIAokJAp5ID0gKHlfMSx5XzIseV8zLHlfNCk9KDEyNSwxOCwyMCwzNCksCiQkCndpdGggY2VsbCBwcm9iYWJpbGl0aWVzIHNwZWNpZmllZCBieSAKJCQKXGxlZnQoXGZyYWN7MX17Mn0rXGZyYWN7XHRoZXRhfXs0fSxcZnJhY3sxLVx0aGV0YX17NH0sXGZyYWN7MS1cdGhldGF9ezR9LFxmcmFje1x0aGV0YX17NH1ccmlnaHQpLgokJApUaGUgbGlrZWxpaG9vZCBpcyBvZiB0aGUgZm9ybQokJApwKHl8XHRoZXRhKSBccHJvcHRvICgyK1x0aGV0YSlee3lfMX0oMS1cdGhldGEpXnt5XzIreV8zfVx0aGV0YV57eV80fSwKJCQKVG8gaWxsdXN0cmF0ZSB0aGUgZGF0YSBhdWdtZW50YXRpb24gYWxnb3JpdGhtLCAkeSQgaXMgYXVnbWVudGVkIGJ5IHNwbGl0dGluZyB0aGUgZmlyc3QgY2VsbCBpbnRvIHR3byBjZWxscywgb25jZSBvZiB3aGljaCBoYXZpbmcgY2VsbCBwcm9iYWJpbGl0eSAkMS8yJCwgdGhlIG90aGVyIGhhdmluZyBjZWxsIHByb2JhYmlsaXR5ICRcdGhldGEvNCQuICBUaGVuIHRoZSBhdWdtZW50ZWQgZGF0YSBzZXQgaXMgZ2l2ZW4gYnkgCiQkCng9KHhfMSx4XzIseF8zLHhfNCx4XzUpLAokJAp3aGVyZSAkeF8xK3hfMj0xMjUkLCAkeF8zPXlfMiQsICR4XzQ9eV8zJCwgYW5kICR4XzU9eV80JC4gVGhlIGF1Z21lbnRlZCBsaWtlbGlob29kIGlzIG9mIHRoZSBmb3JtCiQkCnAoeHxcdGhldGEpIFxwcm9wdG8gXHRoZXRhXnt4XzIreF81fSgxLVx0aGV0YSlee3hfMyt4XzR9LgokJApXaGVuIHRoZSBwcmlvciBvZiAkXHRoZXRhJCBpcyAkVSgwLDEpJCwgaXQgZm9sbG93cyB0aGF0ICRcdGhldGF8eCBcc2ltIEJldGEoeF8yK3hfNSsxLHhfMyt4XzQrMSkkLiAgQWxzbywgdGhlIGRpc3RyaWJ1dGlvbiBvZiAkeF8yfFx0aGV0YSBcc2ltIEJpbm9taWFsKHlfMSxcdGhldGEvKFx0aGV0YSsyKSkkLgoKIyMgRGF0YSBhdWdtZW50YXRpb24gYWxnb3JpdGhtCgpGb3IgZ2l2ZW4gaW5pdGlhbCB2YWx1ZSwgJFx0aGV0YV57KDApfSQsIHNhbXBsZSBpdGVyYXRpdmVseSB0aHJvdWdoIHRoZSBmb2xsb3dpbmcgZGlzdHJpYnV0aW9ucywgZm9yICRqPTEsMixcbGRvdHMsTSQsIGZvciBsYXJnZSAkTSQ6CgoxKSBTYW1wbGUgJHhfMl57KGopfSQgZnJvbSAkQmluKHlfMSxcdGhldGFeeyhqLTEpfS8oMitcdGhldGFeeyhqLTEpfSkpJAoKMikgU2FtcGxlICRcdGhldGFeeyhqKX0kIGZyb20gJEJldGEoeF8yXnsoail9K3lfNCsxLHlfMit5XzMrMSkkCgpUaGUgc2V0IG9mIGRyYXdzICRceyh4XzIsXHRoZXRhKV57KGopfVx9X3tqPTF9Xk0kIGFwcHJveGltYXRlcyB0aGUgYXVnbWVudGVkIHBvc3RlcmlvciAkcCh4XzIsXHRoZXRhfHkpJC4gIEluIGFkZGl0aW9uLCAkXHtcdGhldGFeeyhqKX1cfV97aj0xfV5NJCBhcHByb3hpbWF0ZXMgJHAoXHRoZXRhfHkpJAoKCiMjIEltcGxlbWVudGF0aW9uCkxldCAkXHRoZXRhXnsoMCl9PTEvMiQsIG9yIGFueSBvdGhlciB2YWx1ZXMgaW4gJCgwLDEpJCwKCmBgYHtyIGZpZy5hbGlnbj0nY2VudGVyJywgZmlnLndpZHRoPTEwLCBmaWcuaGVpZ2h0PTR9CnkgPSBjKDEyNSwxOCwyMCwzNCkKCnRoZXRhID0gMC41Ck0gPSAxMDAwMApkcmF3cyA9IG1hdHJpeCgwLE0sMikKZm9yIChpdGVyIGluIDE6TSl7CiAgeDIgPSByYmlub20oMSx5WzFdLHRoZXRhLygyK3RoZXRhKSkKICB0aGV0YSA9IHJiZXRhKDEseDIreVs0XSsxLHlbMl0reVszXSsxKQogIGRyYXdzW2l0ZXIsXSA9IGModGhldGEseDIpCn0KdGhldGEuZCA9IGRyYXdzWywxXQp4Mi5kID0gZHJhd3NbLDJdCgpwYXIobWZyb3c9YygxLDMpKQp0cy5wbG90KHRoZXRhLmQseGxhYj0iaXRlcmF0aW9ucyIseWxhYj0iIixtYWluPWV4cHJlc3Npb24odGhldGEpKQphY2YodGhldGEuZCxtYWluPSIiKQpoaXN0KHRoZXRhLmQscHJvYj1UUlVFLHhsYWI9ZXhwcmVzc2lvbih0aGV0YSksbWFpbj0iIikKCnF1YW50aWxlKHRoZXRhLmQsYygwLjAyNSwwLjUsMC45NzUpKQpgYGAKCiMgUG9zdGVyaW9yIHN1bW1hcmllcwoKYGBge3J9CkkxID0gbWVhbih0aGV0YS5kKQpJMiA9IG1lYW4oKHgyLmQreVs0XSsxKS8oeDIuZCt5WzJdK3lbM10reVs0XSsyKSkKYyhJMSxJMikKCgpOID0gMTAwMAp0aGV0YXMgPSBzZXEoMCwxLGxlbmd0aD1OKQpwb3N0ID0gcmVwKDAsTikKZm9yIChpIGluIDE6TikKICBwb3N0W2ldID0gbWVhbihkYmV0YSh0aGV0YXNbaV0seDIuZCt5WzRdKzEseVsyXSt5WzNdKzEpKQoKaGlzdCh0aGV0YS5kLHByb2I9VFJVRSx4bGFiPWV4cHJlc3Npb24odGhldGEpLG1haW49IiIpCmxpbmVzKGRlbnNpdHkodGhldGEuZCksY29sPTEsbHdkPTIpCmxpbmVzKHRoZXRhcyxwb3N0LGNvbD0yLGx3ZD0yKQpsZWdlbmQoInRvcGxlZnQiLGxlZ2VuZD1jKCJLZXJuZWwgZGVuc2l0eSBhcHByb3guIiwiUmFvYmxhY2t3ZWxsIGFwcHJveC4iKSxjb2w9MToyLGx3ZD0yLGJ0eT0ibiIpCmBgYA==