--- title: "Partial ACF and AIC for AR(3) models" author: "Hedibert Freitas Lopes" date: "5/4/2018" output: # word_document: default pdf_document: default #html_document: default --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) ``` ## Simulating one AR(3) time series We will simulated $n$ observations from a Gaussian AR(3) process: $$ y_t = \phi_0 + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \phi_3 y_{t-3} + \epsilon_t \qquad \epsilon_t \sim N(0,\sigma^2), $$ where $\phi_0=0$, $\phi_1=0.3463$, $\phi_2=0.1770$, $\phi_3=-0.1421$ and $\sigma=0.01$. ```{r fig.width=8, fig.height=3} set.seed(12345) phi = c(0.3463,0.1770,-0.1421) sig = 0.01 n = 500 r = rep(0,n) for (t in 4:n) r[t] = sum(r[(t-1):(t-3)]*phi) + rnorm(1,0,sig) par(mfrow=c(1,2)) plot(r,xlab="Time",ylab="Simulated data",type="b") pacf(r,xlab="Lag",ylab="Partial ACF",main="") ``` ## Simulating $N$ AR(3) time series Let us now simulated $N$ of such AR(3) time series of length $n$. ```{r} set.seed(12345) N = 100 phi = c(0.3463,0.1770,-0.1421) sig = 0.01 n = 500 ordermax = 10 r = rep(0,n) r = matrix(0,n,N) ar3coef = matrix(0,N,3) aic = matrix(0,N,ordermax+1) order = rep(0,N) ind = 0:ordermax for (i in 1:N){ for (t in 4:n) r[t,i] = sum(r[(t-1):(t-3),i]*phi) + rnorm(1,0,sig) fit = ar(r[,i],order.max=3,aic="FALSE") ar3coef[i,] = fit$ar fit = ar(r[,i],order.max=ordermax) aic[i,] = fit$aic order[i] = ind[aic[i,]==min(aic[i,])] } ``` We can now study the sampling behavior of the MLE estimates of $\phi_i$, for $i=1,2,3$. ```{r fig.width=8, fig.height=3} par(mfrow=c(1,3)) plot(density(ar3coef[,1]),xlab=expression(phi[1]),ylab="Sampling density",main="") abline(v=phi[1],col=2) plot(density(ar3coef[,2]),xlab=expression(phi[2]),ylab="Sampling density",main="") abline(v=phi[2],col=2) plot(density(ar3coef[,3]),xlab=expression(phi[3]),ylab="Sampling density",main="") abline(v=phi[3],col=2) ``` As the above graphs show, the sampling behavior of the MLE estimators resamble a Gaussian distribution centered around the true parameters (unbiasedness). We can also investigate the sampling behavior of the AIC. ```{r fig.width=8, fig.height=4} par(mfrow=c(1,2)) plot(0:10,aic[1,],ylab="AIC",type="b",ylim=c(0,50),xlab="Lag") for (i in 2:N) lines(0:10,aic[i,],col=i,type="b") plot(table(order)/N,xlab="Lag",ylab="Relative frequency") ``` As the above graphs reveal, the correct model, i.e. an AR(3) model, gives the lowest AIC, for about 70\% of the simulated datasets. # Your turn You can now try simulating and fitting other AR(3) models, for different AR coeffients, $\phi$s, different variances, $\sigma^2$, different sample sizes, $n$, and different number of replications, $N$, to better understand the behavior of both MLE estimates of $\phi$s and the AIC.