The variable to predict is the monthly growth rate of US industrial production. The dataset contains 130 candidate predictors: monthly macroeconomic indicators such as measures of output, income, consumption, orders, surveys, labor-market variables, house prices, consumer and producer prices, money, credit, and asset prices. The sample runs from February 1960 to December 2014, and all series have been transformed to stationarity, following Stock and Watson.
\(y\): the monthly growth rate of US industrial production.
\(X\): monthly macroeconomic indicators, such as measures of output, income, consumption, orders, surveys, labor market variables, house prices, consumer and producer prices, money, credit and asset prices.
Period: February 1960 to December 2014 (659 monthly observations)
Full description of X: see Appendix B of Stock and Watson (2002b), pages 157–161.
References:
Stock and Watson (2002a) Forecasting Using Principal Components from a Large Number of Predictors, JASA, 97, 147–162. https://scholar.harvard.edu/files/stock/files/forecasting_using_principal_components_from_a_large_number_of_predictors.pdf
Stock and Watson (2002b) Macroeconomic Forecasting Using Diffusion Indexes, JBES, 20, 147–162. https://scholar.harvard.edu/files/stock/files/macroeconomic_forecasting_using_diffusion_indexes.pdf
Giannone, Lenza and Primiceri (2020) Economic predictions with big data: the illusion of sparsity. https://faculty.wcas.northwestern.edu/~gep575/illusion4-2.pdf
Fava and Lopes (2020) The illusion of the illusion of sparsity: an exercise in prior sensitivity. https://arxiv.org/abs/2009.14296
library("bayeslm")
filename = "https://hedibert.org/wp-content/uploads/2021/03/stockwatson2002-data.txt"
macrodata = read.table(filename,header=FALSE)
k = ncol(macrodata)-1
y = macrodata[,1]
X = as.matrix(macrodata[,2:(k+1)])
n = nrow(X)
dim(X)
## [1] 659 130
fit.ols = lm(y~X-1)
summary(fit.ols)
##
## Call:
## lm(formula = y ~ X - 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9224 -0.3948 0.0186 0.4279 2.9102
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## XV2 0.0003296 0.0784303 0.004 0.996649
## XV3 -0.0748243 0.0698419 -1.071 0.284506
## XV4 0.0510148 0.0522478 0.976 0.329312
## XV5 0.1303911 0.0621847 2.097 0.036482 *
## XV6 -0.0752049 0.0556389 -1.352 0.177061
## XV7 0.4439833 0.5729826 0.775 0.438767
## XV8 0.0824967 0.4051686 0.204 0.838736
## XV9 -0.1714310 0.3041942 -0.564 0.573294
## XV10 0.7845590 0.4465061 1.757 0.079478 .
## XV11 -0.5097401 0.2846165 -1.791 0.073869 .
## XV12 -0.5692420 0.2576484 -2.209 0.027576 *
## XV13 0.0590565 0.1192957 0.495 0.620775
## XV14 -0.2261347 0.3378677 -0.669 0.503596
## XV15 0.0672032 0.1357625 0.495 0.620802
## XV16 0.0170097 0.0780816 0.218 0.827634
## XV17 0.8933346 0.4306541 2.074 0.038528 *
## XV18 -0.0016022 0.0555055 -0.029 0.976983
## XV19 0.0300147 0.0354725 0.846 0.397857
## XV20 0.0291581 0.1920536 0.152 0.879385
## XV21 -1.1541357 0.3158563 -3.654 0.000284 ***
## XV22 -0.0463663 0.0638730 -0.726 0.468213
## XV23 -0.0165637 0.0862872 -0.192 0.847847
## XV24 0.0166249 0.2345343 0.071 0.943516
## XV25 -0.0025336 0.2609214 -0.010 0.992256
## XV26 0.0555815 0.1420061 0.391 0.695658
## XV27 0.0787404 0.0512527 1.536 0.125058
## XV28 -0.0792625 0.0669311 -1.184 0.236850
## XV29 -0.0526238 0.0514123 -1.024 0.306508
## XV30 0.0931659 0.2165381 0.430 0.667188
## XV31 -0.1223388 0.1531938 -0.799 0.424888
## XV32 -0.0737371 0.1359880 -0.542 0.587887
## XV33 -0.0772511 0.0413481 -1.868 0.062271 .
## XV34 -0.3257963 0.4611338 -0.707 0.480181
## XV35 0.0197387 0.4427482 0.045 0.964457
## XV36 0.1245017 0.0734396 1.695 0.090608 .
## XV37 0.1122901 0.1753559 0.640 0.522219
## XV38 -4.5120836 2.9797545 -1.514 0.130560
## XV39 3.6768907 2.4856744 1.479 0.139673
## XV40 1.4575165 0.8106561 1.798 0.072756 .
## XV41 0.1604419 0.2963569 0.541 0.588473
## XV42 0.0745660 0.1014828 0.735 0.462808
## XV43 0.0597908 0.0600656 0.995 0.319985
## XV44 0.0697782 0.0713253 0.978 0.328368
## XV45 -0.0822017 0.0529358 -1.553 0.121056
## XV46 0.0091953 0.0496071 0.185 0.853016
## XV47 0.1914305 0.1919833 0.997 0.319162
## XV48 -0.0213217 0.0390319 -0.546 0.585115
## XV49 -0.4595203 0.2090294 -2.198 0.028356 *
## XV50 -0.1541241 0.1467067 -1.051 0.293941
## XV51 2.2246672 1.3427179 1.657 0.098145 .
## XV52 -0.4533709 0.2265101 -2.002 0.045844 *
## XV53 -0.6410294 0.3334704 -1.922 0.055105 .
## XV54 -0.7578373 0.6043409 -1.254 0.210400
## XV55 -0.7237723 0.4070520 -1.778 0.075965 .
## XV56 -3.0493687 1.3518344 -2.256 0.024495 *
## XV57 0.5961090 0.2435195 2.448 0.014693 *
## XV58 0.6488050 0.3056621 2.123 0.034249 *
## XV59 1.2935089 0.6247248 2.071 0.038888 *
## XV60 1.0719533 0.4575892 2.343 0.019519 *
## XV61 0.4360295 0.5511585 0.791 0.429232
## XV62 0.0804641 0.2107074 0.382 0.702707
## XV63 0.0514561 0.1520367 0.338 0.735162
## XV64 -0.1490610 0.0934711 -1.595 0.111370
## XV65 -0.0430592 0.0470526 -0.915 0.360540
## XV66 0.1086395 0.0555799 1.955 0.051150 .
## XV67 -0.0676248 0.0604207 -1.119 0.263549
## XV68 0.1235487 0.0740774 1.668 0.095941 .
## XV69 0.0098264 0.0431925 0.228 0.820121
## XV70 0.0279949 0.0602745 0.464 0.642512
## XV71 0.0495261 0.0567116 0.873 0.382897
## XV72 -0.0046692 0.0440370 -0.106 0.915599
## XV73 0.1189118 0.0473395 2.512 0.012305 *
## XV74 0.0372065 0.0326995 1.138 0.255706
## XV75 0.0200959 0.0355078 0.566 0.571662
## XV76 -0.0690243 0.0346182 -1.994 0.046679 *
## XV77 0.1229494 0.0477078 2.577 0.010232 *
## XV78 -0.1592904 0.0585432 -2.721 0.006725 **
## XV79 0.1691765 0.2781863 0.608 0.543355
## XV80 -0.0491224 0.2650672 -0.185 0.853049
## XV81 0.1262854 0.0996144 1.268 0.205448
## XV82 -0.0401429 0.0643365 -0.624 0.532928
## XV83 0.0076629 0.0766363 0.100 0.920389
## XV84 0.1658422 0.1104483 1.502 0.133813
## XV85 -0.4816826 0.1649020 -2.921 0.003638 **
## XV86 0.4785546 0.2515630 1.902 0.057671 .
## XV87 -0.0681240 0.2091598 -0.326 0.744778
## XV88 -0.1753167 0.1764665 -0.993 0.320928
## XV89 0.0604099 0.1559507 0.387 0.698641
## XV90 0.2970371 0.1039912 2.856 0.004454 **
## XV91 -0.2511460 0.0847305 -2.964 0.003173 **
## XV92 -0.1265381 0.0884063 -1.431 0.152928
## XV93 0.5029510 0.2072896 2.426 0.015586 *
## XV94 -0.2978482 0.3421181 -0.871 0.384367
## XV95 0.1573286 0.2422786 0.649 0.516381
## XV96 -0.4917361 0.4441839 -1.107 0.268773
## XV97 0.1713850 0.5031460 0.341 0.733520
## XV98 0.6657157 0.4637054 1.436 0.151695
## XV99 -0.3338755 0.3530717 -0.946 0.344769
## XV100 0.0583770 0.0486044 1.201 0.230264
## XV101 -0.0342445 0.0407658 -0.840 0.401271
## XV102 0.0196541 0.0454268 0.433 0.665443
## XV103 -0.0313258 0.0402908 -0.777 0.437215
## XV104 0.0215974 0.1185297 0.182 0.855487
## XV105 0.0309690 0.1193782 0.259 0.795413
## XV106 0.0269397 0.0551817 0.488 0.625611
## XV107 -0.0521872 0.0432823 -1.206 0.228456
## XV108 0.0107983 0.0361107 0.299 0.765033
## XV109 -0.0084769 0.0354586 -0.239 0.811148
## XV110 -0.0850127 0.0622626 -1.365 0.172711
## XV111 0.0942480 0.0804058 1.172 0.241663
## XV112 0.0208755 0.0377386 0.553 0.580387
## XV113 0.0298770 0.0831262 0.359 0.719426
## XV114 0.0388840 0.0352849 1.102 0.270962
## XV115 -0.0512062 0.1106006 -0.463 0.643567
## XV116 0.0236428 0.0383706 0.616 0.538048
## XV117 -0.0134025 0.0421644 -0.318 0.750715
## XV118 -0.0472513 0.0626232 -0.755 0.450864
## XV119 -0.0154219 0.0599635 -0.257 0.797134
## XV120 -0.0332338 0.0795316 -0.418 0.676212
## XV121 0.0506601 0.2591295 0.196 0.845076
## XV122 0.0282014 0.0676167 0.417 0.676791
## XV123 -0.0008363 0.2586277 -0.003 0.997421
## XV124 0.0409017 0.1365702 0.299 0.764682
## XV125 -0.0675549 0.0753642 -0.896 0.370458
## XV126 0.1880013 0.0518512 3.626 0.000316 ***
## XV127 0.1199962 0.0701268 1.711 0.087643 .
## XV128 -0.1192568 0.0575490 -2.072 0.038724 *
## XV129 -0.0078032 0.0372667 -0.209 0.834226
## XV130 -0.0169749 0.0436969 -0.388 0.697825
## XV131 0.0255971 0.0341872 0.749 0.454349
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7642 on 529 degrees of freedom
## Multiple R-squared: 0.5305, Adjusted R-squared: 0.4151
## F-statistic: 4.597 on 130 and 529 DF, p-value: < 2.2e-16
beta.ols = fit.ols$coef
sig.ols = summary(fit.ols)$sigma
se = sig.ols*sqrt(diag(solve(t(X)%*%X)))
L = beta.ols+qnorm(0.025)*se
U = beta.ols+qnorm(0.975)*se
plot(beta.ols,pch=16,xlab="Regressor",ylab="Coefficient",main="OLS estimation",ylim=range(L,U),cex=0.5)
vars.ols = NULL
for (i in 1:k){
if (L[i]<0 & U[i]>0){
segments(i,L[i],i,U[i],col=2)
}else{
segments(i,L[i],i,U[i],lwd=2)
text(i,5,i,cex=0.5)
vars.ols = c(vars.ols,i)
}
}
abline(h=0,lty=2)
vars.ols
## [1] 4 11 16 20 48 51 55 56 57 58 59 72 75 76 77 84 89 90 92
## [20] 125 127
fit.bayes = bayeslm(y,X,prior="laplace",icept=FALSE,N=5000,burnin=1000,verb=FALSE)
## laplace prior
## fixed running time 0.0160671
## sampling time 0.679016
qbeta = t(apply(fit.bayes$beta,2,quantile,c(0.025,0.5,0.975)))
plot(qbeta[,2],pch=16,xlab="Regressor",ylab="Coefficient",main="",ylim=range(qbeta),cex=0.5)
title("Bayesian estimation\n Laplace prior")
vars.bayes = NULL
for (i in 1:k){
if (qbeta[i,1]<0 & qbeta[i,3]>0){
segments(i,qbeta[i,1],i,qbeta[i,3],col=2)
}else{
segments(i,qbeta[i,1],i,qbeta[i,3],lwd=2)
text(i,5,i,cex=0.5)
vars.bayes = c(vars.bayes,i)
}
}
abline(h=0,lty=2)
vars.bayes
## [1] 39 61 109 125
yhat.ols = X%*%beta.ols
yhat.bayes = X%*%qbeta[,2]
MSE.ols = mean((y-yhat.ols)^2)
MSE.bayes = mean((y-yhat.bayes)^2)
MAE.ols = mean(abs(y-yhat.ols))
MAE.bayes = mean(abs(y-yhat.bayes))
tab = rbind(c(MSE.ols,MSE.bayes),c(MAE.ols,MAE.bayes))
rownames(tab) = c("MSE","MAE")
colnames(tab) = c("OLS","BAYES")
tab
## OLS BAYES
## MSE 0.4688303 0.5849677
## MAE 0.5139756 0.5488302
train = sort(sample(1:n,size=n/2))
Xtrain = X[train,]
Xtest = X[-train,]
ytrain = y[train]
ytest = y[-train]
fit.ols = lm(ytrain~Xtrain-1)
beta.ols = fit.ols$coef
fit.bayes = bayeslm(ytrain,Xtrain,prior="laplace",icept=FALSE,N=5000,burnin=1000,verb=FALSE)
beta.bayes = apply(fit.bayes$beta,2,median)
yhat.ols = Xtest%*%beta.ols
yhat.bayes = Xtest%*%beta.bayes
MSE.ols = mean((ytest-yhat.ols)^2)
MSE.bayes = mean((ytest-yhat.bayes)^2)
MAE.ols = mean(abs(ytest-yhat.ols))
MAE.bayes = mean(abs(ytest-yhat.bayes))
tab = rbind(c(MSE.ols,MSE.bayes),c(MAE.ols,MAE.bayes))
rownames(tab) = c("MSE","MAE")
colnames(tab) = c("OLS","BAYES")
tab
## OLS BAYES
## MSE 1.0577053 0.5496248
## MAE 0.7545287 0.5502646
Repeat the above out-of-sample exercise for 100 replications. In addition, consider the reduced models obtained by retaining only the significant variables from the OLS fit and from the Bayesian fit, i.e.
OLS variables: 4 11 16 20 48 51 55 56 57 58 59 72 75 76 77 84 89 90 92 125 127
Bayes variables: 32 39 61 109 125
Notice that you have 4 models to compare:
Full model, OLS fit
Full model, Bayesian fit
Reduced model, variables chosen via OLS
Reduced model, variables chosen via Bayes
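A possible sketch of this exercise is below. It assumes the objects `y`, `X`, `n`, `vars.ols`, and `vars.bayes` from the code above are in the workspace, reuses the same `bayeslm` settings, and compares the four models by out-of-sample MSE across random train/test splits (MAE can be collected analogously).

```r
library("bayeslm")

R = 100  # number of train/test replications
mse = matrix(NA, R, 4,
             dimnames = list(NULL, c("OLS-full","BAYES-full","OLS-red","BAYES-red")))
set.seed(12345)  # arbitrary seed, for reproducibility of the splits
for (r in 1:R){
  # random 50/50 train/test split, as in the single-split exercise above
  train = sort(sample(1:n, size = n/2))
  ytr = y[train]; yte = y[-train]
  # 1) Full model, OLS fit (no intercept, as before)
  b1 = lm(ytr ~ X[train,] - 1)$coef
  # 2) Full model, Bayesian fit: posterior medians under the Laplace prior
  fb = bayeslm(ytr, X[train,], prior = "laplace", icept = FALSE,
               N = 5000, burnin = 1000, verb = FALSE)
  b2 = apply(fb$beta, 2, median)
  # 3) Reduced model with the variables selected by OLS
  b3 = lm(ytr ~ X[train, vars.ols] - 1)$coef
  # 4) Reduced model with the variables selected by the Bayesian fit
  fb4 = bayeslm(ytr, X[train, vars.bayes], prior = "laplace", icept = FALSE,
                N = 5000, burnin = 1000, verb = FALSE)
  b4 = apply(fb4$beta, 2, median)
  # out-of-sample mean squared errors
  mse[r,1] = mean((yte - X[-train,] %*% b1)^2)
  mse[r,2] = mean((yte - X[-train,] %*% b2)^2)
  mse[r,3] = mean((yte - X[-train, vars.ols] %*% b3)^2)
  mse[r,4] = mean((yte - X[-train, vars.bayes] %*% b4)^2)
}
boxplot(mse, ylab = "Out-of-sample MSE", main = "100 train/test splits")
```

The boxplot makes the comparison across replications immediate; based on the single split above, one would expect the full-model OLS fit to show the largest and most variable out-of-sample errors.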