---
title: "Gaussian vs Student's t data"
subtitle: "Closed-form inference vs MC-based inference"
author: "Hedibert Freitas Lopes"
date: "`r Sys.Date()`"
output:
  html_document:
    theme: paper
    highlight: pygments
    toc: true
    toc_depth: 3
    toc_collapsed: true
    toc_float: true
    code_download: true
    number_sections: true
---


# Gaussian model
Here we will perform maximum likelihood and Bayesian inference for a one-parameter example.
We will assume that $x_1,\ldots,x_n$ form an independent and identically distributed (iid) sample from a Gaussian distribution with mean $\theta$ and variance $\sigma^2$, denoted by $x_i|\theta \sim N(\theta,\sigma^2)$, for $i=1,2,\ldots,n$ and known $\sigma^2$.  In other words, only $\theta$ is unknown in this model structure.  When dealing with linear models, such as Gaussian linear regressions or state-space models, it is useful to rewrite the model as
$$
x_i = \theta + \epsilon_i \qquad \epsilon_i \sim N(0,\sigma^2).
$$
Such a parametrization allows us to naturally extend the model to regressors, such as $\theta_i=z_i'\beta$ in linear regression models or $\theta_i = g(z_i,\beta)$ in nonlinear regression models, and to time-varying parameters, such as $\theta_t=\theta_{t-1}+u_t$ in state-space modeling.
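
The additive form above makes simulation straightforward. The sketch below, with an arbitrary illustrative value for $\theta$ (not from the data used later), generates one sample from the model:

```{r}
set.seed(1234)                 # for reproducibility
theta.true = 0.5               # illustrative value of theta
sigma      = 1                 # known standard deviation
n          = 10
eps   = rnorm(n,0,sigma)       # epsilon_i ~ N(0,sigma^2)
x.sim = theta.true + eps       # x_i = theta + epsilon_i
mean(x.sim)                    # sample mean of the simulated data
```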

## Maximum likelihood inference

The likelihood function is easily derived from the joint density of the data given $\theta$
$$
p(x_1,\ldots,x_n | \theta) = \prod_{i=1}^n p(x_i|\theta),
$$
such that
$$
L(\theta;\mbox{data}) = \prod_{i=1}^n (2\pi\sigma^2)^{-1/2}\exp\left\{-\frac{0.5}{\sigma^2}(x_i-\theta)^2\right\} 
\propto \exp\left\{-\frac{0.5}{\sigma^2}(n\theta^2 - 2n\bar{x}\theta) \right\},
$$
which, as a function of $\theta$, resembles a Gaussian density with mean $\bar{x}$ and variance $\sigma^2/n$.  In fact, the maximum likelihood estimator (MLE) of $\theta$ is 
$$
{\widehat \theta}_{MLE} = \bar{x},
$$
such that ${\widehat \theta}_{MLE}|\theta \sim N(\theta,\sigma^2/n)$.
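
The sampling distribution of ${\widehat \theta}_{MLE}$ can be checked by simulation. The sketch below, with an arbitrary true value $\theta=0.5$ used only for this check, draws many samples of size $n$ and compares the spread of the sample means with $\sigma/\sqrt{n}$:

```{r}
set.seed(2468)
theta.true = 0.5; sigma = 1; n = 10       # illustrative values
mles = replicate(10000,mean(rnorm(n,theta.true,sigma)))
c(mean(mles),sd(mles),sigma/sqrt(n))      # sd(mles) should be close to sigma/sqrt(n)
```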

```{r fig.align='center', fig.width=10, fig.height=6}
x = c(-2.33,-0.52,-0.51,-0.38,0.25,0.77,0.79,0.80,1.35,1.40)
n = length(x)
sigma     = 1

# ML estimation
theta.mle = mean(x)
L         = theta.mle + qnorm(0.025)*sigma/sqrt(n)
U         = theta.mle + qnorm(0.975)*sigma/sqrt(n)

N      = 200
thetas = seq(-2,2,length=N)
like   = rep(0,N)
for (i in 1:N)
  like[i] = prod(dnorm(x,thetas[i],sigma))

par(mfrow=c(1,1))
plot(thetas,like,type="l",ylab="Likelihood",xlab=expression(theta))
abline(v=theta.mle,col=2,lwd=2)
abline(v=L,lty=2,col=2)
abline(v=U,lty=2,col=2)
title(paste("MLE = ",round(theta.mle,3),
      "\n95% CI = (",round(L,3),",",round(U,3),")",sep=""))
```


## Bayesian inference

We will use here a conjugate prior for $\theta$
$$
\theta \sim N(\theta_0,\tau_0^2),
$$
for known hyperparameters $\theta_0$ and $\tau_0$.  The Gaussian prior is conjugate to the Gaussian likelihood function, which leads to a Gaussian posterior, i.e.
\begin{eqnarray*}
p(\theta|\mbox{data}) &\propto& p(\theta)L(\theta;\mbox{data})\\
&\propto& \exp\left\{-\frac{0.5}{\tau_0^2}(\theta^2-2\theta \theta_0)\right\}
\exp\left\{-\frac{0.5}{\sigma^2}(n\theta^2-2\theta n \bar{x})\right\}\\
&\propto& \exp\left\{-\frac{0.5}{\tau_1^2}(\theta^2-2\theta \theta_1)\right\},
\end{eqnarray*}
where 
$$
\tau_1^{-2} = \tau_0^{-2} + n \sigma^{-2} \ \ \ \mbox{and} \ \ \ 
\theta_1 = \tau_1^2(\theta_0/\tau_0^2 + n \bar{x}/\sigma^2),
$$
and $n \bar{x} = \sum_{i=1}^n x_i$.  Notice that the posterior precision, $\tau_1^{-2}$, is the sum of the prior precision, $\tau_0^{-2}$, and the likelihood precision, $n/\sigma^2$. As $n$ grows, the likelihood precision dominates, the posterior precision goes to infinity, and the posterior distribution of $\theta$ concentrates around ${\bar x}$.  A non-informative, improper prior assumes that $\theta_0=0$ and $\tau_0^{-1}=0$.  In this case, $\theta_1={\bar x}$ and $\tau_1^2=\sigma^2/n$, and ML and Bayesian inference coincide.
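
The non-informative limit can also be checked numerically: as $\tau_0^2 \rightarrow \infty$, the posterior mean approaches $\bar{x}$ and the posterior variance approaches $\sigma^2/n$. A quick sanity check, reusing the data from above:

```{r}
x      = c(-2.33,-0.52,-0.51,-0.38,0.25,0.77,0.79,0.80,1.35,1.40)  # same data as above
n      = length(x)
sigma  = 1
theta0 = 0
tau02  = 1e10                        # essentially flat prior (tau0^{-1} ~ 0)
tau12  = 1/(1/tau02+n/sigma^2)
theta1 = tau12*(theta0/tau02+sum(x)/sigma^2)
c(theta1,mean(x))                    # posterior mean ~ sample mean
c(tau12,sigma^2/n)                   # posterior variance ~ sigma^2/n
```

The chunk that follows resets $\tau_0^2=2$, so this check does not affect the subsequent analysis.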

Conjugacy therefore delivers all posterior summaries in closed form, such as the posterior mean and posterior variance, as well as posterior quantiles, which are useful to construct, say, a 95\% credibility interval for $\theta$.

```{r fig.align='center', fig.width=10, fig.height=6}
theta0 = 0
tau02  = 2
tau12  = 1/(1/tau02+n/sigma^2)
theta1 = round(tau12*(theta0/tau02+sum(x)/sigma^2),3)
post   = dnorm(thetas,theta1,sqrt(tau12))
L.b    = round(theta1 + qnorm(0.025)*sqrt(tau12),3)
U.b    = round(theta1 + qnorm(0.975)*sqrt(tau12),3)

plot(thetas,post,type="l",ylab="Posterior density",xlab=expression(theta))
abline(v=theta1,col=2,lwd=2)
abline(v=L.b,lty=2,col=2)
abline(v=U.b,lty=2,col=2)
title(paste("Posterior mean = ",theta1,"\n95% Credibility interval = (",L.b,",",U.b,")",
            sep=""))
```

## Comparison

```{r fig.align='center', fig.width=10, fig.height=6}
plot(thetas,post,type="l",ylab="Density",xlab=expression(theta))
abline(v=theta1)
abline(v=L.b,lty=2)
abline(v=U.b,lty=2)
lines(thetas,like/max(like)*max(post),col=2)
abline(v=theta.mle,col=2)
abline(v=L,lty=2,col=2)
abline(v=U,lty=2,col=2)
legend("topleft",legend=c("Likelihood","Posterior"),col=1:2,lty=1,bty="n")
```

# Student's t model

Here, we will learn that posterior inference is virtually never available in closed form.  Despite the coherent and clean combination of information provided by Bayesian reasoning, computation is, and will always be, an essential ingredient for accurate posterior approximation in more complex problems.

Here, we will simply assume that the data might exhibit fatter tails than the Gaussian.  More precisely, we will replace the Gaussian model with a Student's $t$ model with $\nu$ degrees of freedom, for known $\nu$.  That is, $x_1,\ldots,x_n$ are still iid, but now $t_\nu(\theta,\sigma_0^2)$, with known $\sigma_0^2=\frac{\nu-2}{\nu}\sigma^2$, for $\nu>2$:
$$
p(x_i | \theta, \sigma_0^2, \nu) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right) (\nu\pi\sigma_0^2)^{1/2}} \left[ 1 + \frac{1}{\nu} \frac{(x_i - \theta)^2}{\sigma_0^2} \right]^{-\frac{\nu+1}{2}}.
$$

It is easy to see that
$$
E(x_i|\theta)=\theta \ \ \ \mbox{and} \ \ \  V(x_i|\theta)=\frac{\nu}{\nu-2}\sigma_0^2=\sigma^2,
$$
such that both models, Gaussian and Student's $t$, assume the data has the same mean and the same variance, conditionally on $\theta$.
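
This mean/variance matching can be verified by simulation. The sketch below uses illustrative values ($\theta=0$, $\sigma=1$, $\nu=5$; choosing $\nu>4$ keeps the kurtosis finite so the sample variance is stable) and draws from the scaled $t$:

```{r}
set.seed(4321)
theta  = 0; sigma = 1; nu = 5        # illustrative values (nu > 4 for a stable check)
sigma0 = sqrt((nu-2)/nu)*sigma       # so that V(x_i|theta) = sigma^2
xt     = theta + sigma0*rt(1000000,df=nu)
c(mean(xt),var(xt))                  # should be close to (theta, sigma^2) = (0, 1)
```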

The likelihood function is now written as
$$
L(\theta;\mbox{data}) = \prod_{i=1}^n p(x_i | \theta, \sigma_0^2, \nu) \propto
\prod_{i=1}^n \left[ 1 + \frac{1}{\nu} \frac{(x_i - \theta)^2}{\sigma_0^2} \right]^{-\frac{\nu+1}{2}},
$$
which is not conjugate to the Gaussian prior $p(\theta) \equiv N(\theta_0,\tau_0^2)$.  Consequently, posterior inference is performed with the assistance of Monte Carlo integration and simulation.

## Sampling importance resampling (SIR)

Here conjugacy is lost and we will use a Monte Carlo simulation scheme to approximate the posterior.  More precisely, we will use a SIR algorithm to produce draws from $p(\theta|\mbox{data})$.  The algorithm works as follows:

1) Sample (large) $M$ draws from the prior $p(\theta)$
$$
\{\widetilde{\theta}^{(1)},\ldots,\widetilde{\theta}^{(M)}\};
$$

2) Evaluate the weights $\omega^{(i)} \propto L(\widetilde{\theta}^{(i)};\mbox{data})$;

3) Resample $N \leq M$ draws from the discrete set $\{\widetilde{\theta}^{(1)},\ldots,\widetilde{\theta}^{(M)}\}$ with probabilities proportional to $\{\omega^{(1)},\ldots,\omega^{(M)}\}$.

It can be shown that the final sampled set, say 
$$
\{\theta^{(1)},\ldots,\theta^{(N)}\},
$$
approximates a sample from $p(\theta|\mbox{data})$.  More details about Monte Carlo integration, MC simulation, SIR and other Monte Carlo schemes can be found in my course notes [here](https://hedibert.org/wp-content/uploads/2020/01/aprendizagembayesiana-bayesiancomputation.pdf).

```{r fig.align='center', fig.width=10, fig.height=6}
nu     = 4
sigma0 = sqrt((nu-2)/nu)*sigma
M      = 100000

# Step 1: Sampling from the prior
theta.draws = rnorm(M,theta0,sqrt(tau02))

# Step 2: Computing resampling weights on the log scale for numerical stability
w = rep(0,M)
for (i in 1:M)
  w[i] = sum(dt((x-theta.draws[i])/sigma0,df=nu,log=TRUE))
# The -n*log(sigma0) Jacobian of the standardization is constant in theta,
# so it cancels when the weights are normalized.
w = exp(w-max(w))

# Step 3: resampling
ind = sample(1:M,replace=TRUE,size=M/2,prob=w)
theta.post = theta.draws[ind]

# Posterior quantiles and density plot
L.t = quantile(theta.post,0.025)
U.t = quantile(theta.post,0.975)

plot(density(theta.post),ylab="Posterior density",
     xlab=expression(theta),main="",lwd=2,ylim=c(0,1.4))
lines(thetas,post,col=2,lwd=2)
segments(L.b,0.0,U.b,0.0,lwd=2,col=2,lty=2)
segments(L.t,0.04,U.t,0.04,lwd=2,lty=2)
points(L.b,0.0,pch=16,col=2)
points(U.b,0.0,pch=16,col=2)
points(L.t,0.04,pch=16)
points(U.t,0.04,pch=16)
legend("topleft",legend=c("Gaussian","Student's t(4)"),col=2:1,lty=1,bty="n",lwd=2)
legend("topright",legend=c("Data",x),bty="n")
title(paste("Sample mean=",round(theta.mle,2),"\n Sample stdev=",
            round(sqrt(var(x)),2),sep=""))
```

## Playing around with $\nu$
In the exercise below, we vary $\nu$ from $3$ to $30$ and compare the 95\% credibility intervals.  As the figure reveals, as $\nu$ increases the Student's $t$ converges to the Gaussian distribution and the credibility intervals become quite similar.  Only for small $\nu$ is there some departure from Gaussianity.

```{r fig.align='center', fig.width=10, fig.height=6}
M   = 100000
nus = 3:30
nnu = length(nus)
quants = matrix(0,nnu,3)
theta.draws = rnorm(M,theta0,sqrt(tau02))
for (j in 1:nnu){
  nu = nus[j]
  sigma0 = sqrt((nu-2)/nu)*sigma
  w = rep(0,M)
  for (i in 1:M)
    w[i] = sum(dt((x-theta.draws[i])/sigma0,df=nu,log=TRUE))
  w = exp(w-max(w))
  ind = sample(1:M,replace=TRUE,size=M/2,prob=w)
  theta.post = theta.draws[ind]
  quants[j,] = quantile(theta.post,c(0.025,0.5,0.975))
}

plot(nus,quants[,1],pch=16,type="l",ylab="95% interval",xlab=expression(nu),ylim=c(-0.5,1))
abline(h=L.b,col=2,lwd=2)
abline(h=U.b,col=2,lwd=2)
abline(h=theta1,col=2,lwd=2)
lines(nus,quants[,1],lwd=2)
lines(nus,quants[,2],lwd=2)
lines(nus,quants[,3],lwd=2)
legend("topright",legend=c("Gaussian","Student's t"),col=2:1,lty=1,bty="n",lwd=2)
```

