---
title: "Bayesian AR(1) model"
author: "Hedibert Freitas Lopes"
date: "1/23/2020"
output:
  pdf_document: default
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Simulating some AR(1) data

The data $y_1,\ldots,y_n$ are simulated as follows
$$
y_t = 0.8 y_{t-1} + \epsilon_t \qquad \epsilon_t \sim N(0,1.5^2),
$$
for $t=2,\ldots,n$ and $y_1=0$.

```{r fig.width=12, fig.height=7}
set.seed(1234)
n   <- 50
phi <- 0.8   # AR(1) coefficient
sig <- 1.5   # innovation standard deviation (rnorm's third argument is the sd)
y   <- rep(0, n)
for (t in 2:n) {
  y[t] <- rnorm(1, phi * y[t - 1], sig)
}
ts.plot(y)
```

# Maximum likelihood inference about $\phi$

Assuming the correct model specification, we want to learn $\phi$, where
$$
y_t = \phi y_{t-1} + \epsilon_t \qquad \epsilon_t \sim N(0,\sigma^2),
$$
where $\sigma=1.5$, $t=2,\ldots,n$ and $y_1=0$. It is easy to see that
$$
{\hat \phi}_{mle}=\frac{\sum_{t=2}^n y_t y_{t-1}}{\sum_{t=2}^n y_{t-1}^2}
$$
and, therefore,
$$
{\hat \phi}_{mle}|\phi,\sigma^2 \sim N(\phi,V),
$$
where
$$
V = \frac{\sigma^2}{\sum_{t=2}^n y_{t-1}^2}.
$$

# Bayesian inference about $\phi$

We assume that, a priori,
$$
\phi \sim N(\phi_0,V_\phi),
$$
where $\phi_0$ and $V_\phi$ are prior hyperparameters. It is also easy to verify that
$$
\phi|y_1,\ldots,y_n,\sigma^2 \sim N(\phi_1,V_1),
$$
where
$$
\phi_1 = \xi{\hat \phi}_{mle}+(1-\xi)\phi_0
$$
and
$$
V_1 = \frac{1}{\frac{1}{V}+\frac{1}{V_\phi}},
$$
where $\xi = V_\phi/(V+V_\phi)$.
```{r fig.width=12, fig.height=7}
# Sampling distribution of the MLE: phi.mle | phi, sig^2 ~ N(phi, V)
phi.mle <- sum(y[2:n] * y[1:(n - 1)]) / sum(y[1:(n - 1)]^2)
V       <- sig^2 / sum(y[1:(n - 1)]^2)

# Prior hyperparameters: phi ~ N(phi0, V.phi)
phi0  <- 0.5
V.phi <- 0.25^2

# Posterior mean is a precision-weighted average of the MLE and the prior
# mean; xi matches the weight called xi in the text above.
xi   <- V.phi / (V + V.phi)
phi1 <- xi * phi.mle + (1 - xi) * phi0
V1   <- 1 / (1 / V + 1 / V.phi)

c(phi.mle, phi1)
c(V, V1)

# Likelihood (black), prior (red) and posterior (blue) densities of phi
phis <- seq(-1, 1.5, length = 1000)
plot(phis, dnorm(phis, phi.mle, sqrt(V)), type = "l",
     xlab = expression(phi), ylab = "Density", lwd = 2)
lines(phis, dnorm(phis, phi0, sqrt(V.phi)), col = 2, lwd = 2)
lines(phis, dnorm(phis, phi1, sqrt(V1)), col = 4, lwd = 2)
legend("topleft", legend = c("Likelihood", "Prior", "Posterior"),
       col = c(1, 2, 4), lty = 1, lwd = 2)
```

# Computing predictive for various models

We will use Monte Carlo integration to compute the predictive density of competing models:

* Model 0: AR(1) with conditional known variance
* Model 1: AR(1) with conditional unknown variance
* Model 2: Student's t AR(1) with conditional unknown scale

MCI approximates
$$
p(y_1,\ldots,y_n|M_i) = \int p(y_1,\ldots,y_n|\theta_i)p(\theta_i|M_i)d\theta_i
$$
by
$$
{\hat p}(y_1,\ldots,y_n|M_i) = \frac{1}{M} \sum_{j=1}^M p(y_1,\ldots,y_n|\theta_i^{(j)}),
$$
where $\left\{\theta_i^{(1)},\ldots,\theta_i^{(M)}\right\}$ are draws from $p(\theta_i|M_i)$.
```{r fig.width=12, fig.height=7}
M <- 10000

# Each prior draw's likelihood is accumulated on the log scale and
# exponentiated once: exp(sum(log-densities)) is mathematically identical to
# prod(densities) but far more resistant to numerical underflow as n grows.

# Model M0: AR(1) with known conditional sd sig
phis <- rnorm(M, phi0, sqrt(V.phi))
parc <- rep(0, M)
for (j in seq_len(M)) {
  parc[j] <- exp(sum(dnorm(y[2:n], phis[j] * y[1:(n - 1)], sig, log = TRUE)))
}
pred.M0 <- mean(parc)

# Model M1: AR(1) with unknown variance, sigma2 ~ IG(2.5, 2.5)
betas   <- rnorm(M, phi0, sqrt(V.phi))
sigma2s <- 1 / rgamma(M, 2.5, 2.5)
parc    <- rep(0, M)
for (j in seq_len(M)) {
  parc[j] <- exp(sum(dnorm(y[2:n], betas[j] * y[1:(n - 1)],
                           sqrt(sigma2s[j]), log = TRUE)))
}
pred.M1 <- mean(parc)

# Model M2: Student's t AR(1) (df = 4) with unknown scale.
# Log density of the scaled t is dt(z, df, log = TRUE) - log(scale).
betas   <- rnorm(M, phi0, sqrt(V.phi))
sigma2s <- 1 / rgamma(M, 2.5, 2.5)
parc    <- rep(0, M)
for (j in seq_len(M)) {
  z       <- (y[2:n] - betas[j] * y[1:(n - 1)]) / sqrt(sigma2s[j])
  parc[j] <- exp(sum(dt(z, df = 4, log = TRUE) - log(sqrt(sigma2s[j]))))
}
pred.M2 <- mean(parc)

# Prior predictives
c(pred.M0, pred.M1, pred.M2)

# Bayes factors
B01 <- pred.M0 / pred.M1
B02 <- pred.M0 / pred.M2
B12 <- pred.M1 / pred.M2
c(B01, B02, B12)

# Posterior model probabilities (uniform prior over the three models)
post.model <- c(pred.M0, pred.M1, pred.M2)
post.model <- post.model / sum(post.model)
round(post.model, 3)
```