Tag Archives: Fama MacBeth

The previous post summarised some of the results obtained by applying the FB procedure with a 60-month rolling window of fixed size across returns data spanning Jan 1995 to Dec 2010. As promised, this post addresses essentially the same issue in the context of the full sample. Instead of using rolling time series regression estimates as inputs to the subsequent cross section regression for each month of the next year, the full sample approach involves a single time series regression over the total period and cross sectional regressions in every month of the available data. In other words, we trade off time variation in beta (rolling windows) in favour of a simpler approach that uses more data to estimate betas and assumes that estimated risk exposures are constant over time. The individual estimates of risk premia obtained from the second stage of the FB method are then collated as before to provide a time series of estimates across each month of the available data. I have suppressed the intercept in the cross section regression for the full sample just to see what happens. I probably should have kept it in, but I cannot be bothered to change it right now.


[ Full Sample Approach ]

#######################################################################
#Full-Sample Fama-MacBeth
#######################################################################
#store each regression in a separate list element
num.assets <- ncol(test.assets)
ts.reg.list <- list()

for(i in 1:num.assets){
 ts.reg.list[[i]] <- lm(test.assets[,i]~macrofactors.data)
}

#extract estimates
Ext.estim <- RegExtractor(x=ts.reg.list,type='est')
rownames(Ext.estim)<-colnames(test.assets)
colnames(Ext.estim)<-c('intercept',colnames(macrofactors.data))

#extract pvals
Ext.pval <- RegExtractor(x=ts.reg.list,type='pval')
rownames(Ext.pval)<-colnames(test.assets)
colnames(Ext.pval)<-c('intercept',colnames(macrofactors.data))

#extract rsq
Ext.rsq <- RegExtractor(x=ts.reg.list,type='rsq')*100
colnames(Ext.rsq)<-colnames(test.assets)

#Visualise Premia
windows()
rownames(Ext.estim)<-asset.names
melted.data <- melt(Ext.estim)
colnames(melted.data) <- c('Asset','Estimates','Value')
ggplot(melted.data, aes(Estimates, Value, fill=Asset)) +
 geom_bar(stat="identity") +
 facet_wrap(~Asset, nrow=5) +
 coord_flip()+
 theme_bw()

#Visualise p-values
windows()
rownames(Ext.pval)<-asset.names
melted.data <- melt(Ext.pval)
colnames(melted.data) <- c('Asset','PValue','Value')
ggplot(melted.data, aes(PValue, Value, fill=Asset)) +
 geom_bar(stat="identity") +
 facet_wrap(~Asset, nrow=5) +
 coord_flip()+
 theme_bw()

#Visualise Tables
windows()
est.tab <- round(Ext.estim,5)
 est.tab <- apply(est.tab, 2, rev)
 est.tab <- cbind(rev(asset.names),est.tab)
 par(mai=c(0.35,0.15,0.5,0.15))
 TableMaker(row.h=1,est.tab,c('Test Asset ',colnames(Ext.estim)),strip=F,strip.col=c('green','blue'),col.cut=0.05,alpha=0.7,border.col='lightgrey',text.col='black',header.bcol='blue',header.tcol='white',title=c('Time Series Estimates of parameters'))

windows()
pval.tab <- round(Ext.pval,5)
 pval.tab <- apply(pval.tab, 2, rev)
 pval.tab <- cbind(rev(asset.names),pval.tab)
 par(mai=c(0.35,0.15,0.5,0.15))
 TableMaker(row.h=1,pval.tab,c('Test Asset ',colnames(Ext.pval)),strip=T,strip.col=c('green','red'),col.cut=0.05,alpha=0.7,border.col='lightgrey',text.col='black',header.bcol='blue',header.tcol='white',title=c('Time Series p-val\n[Significance at the 5% level-GREEN]'))

#Cross section regression at each period t
cross.dep <- t(test.assets)
cross.beta <-Ext.estim[,-1]
rownames(cross.dep)<-asset.names
reg.coll <- list()

for(i in 1:ncol(cross.dep)){
 reg.coll[[i]]=lm(cross.dep[,i]~0+cross.beta)
}

#Extract parameters,t,pvalues,etc
full.num.t <- nrow(test.assets)
full.premia<- RegExtractor(reg.coll,type='est')
full.tval<- RegExtractor(reg.coll,type='tval')
full.pval <- RegExtractor(reg.coll,type='pval')
full.resid <- RegExtractor(reg.coll,type='res')

#Compute individual test
full.mean.premia <- colMeans(full.premia)
full.std.premia <- (apply(full.premia,2,var)/full.num.t)^0.5 #already the standard error of the mean, sd/sqrt(T)
full.t.stat <- full.mean.premia/full.std.premia
full.p.val <- pt(-abs(full.t.stat),df=full.num.t-1)

#Compute joint test
full.mean.resid <- rowMeans(full.resid)

alpha.diff1 <- full.resid-full.mean.resid
alpha.diff2 <- t(alpha.diff1)
alpha.sum <- alpha.diff1%*%alpha.diff2
cov.term <- alpha.sum/(full.num.t^2)

full.joint.test <- t(full.mean.resid)%*%ginv(cov.term)%*%full.mean.resid
#Visualise full sample premia
windows()
par(mai=c(0.2,0.50,0.2,0.25))
full.mean.premia<-matrix(full.mean.premia,ncol=ncol(macrofactors.data))
colnames(full.mean.premia)<-colnames(macrofactors.data)
layout(matrix(c(1,1,2),nrow=3,ncol=1))
col.scheme <- rainbow(20)[10:13]
plot(ylab='Estimates',xaxt='n',cex.main=0.85,cex.lab=0.75,cex.axis=0.75,main='Risk Premium estimates using full sample betas',ylim=c(min(full.premia),max(full.premia)),full.premia[,1],col='red',type='l')

for(i in 2:ncol(macrofactors)){
 lines(full.premia[,i],col=col.scheme[i-1])
}
legend(bty='o',y.intersp=1,title='Macroeconomic factors','topright',fill=c('red',col.scheme,'green','gold'),legend=c('MP - Chg. Industrial Production','UI - Unanticipated Infl','DEI - Chg.Expected Infl','UTS - Unanticipated ret on LT Bonds','UPR - BAA minus AAA yield','95% Significance','90% Significance'),ncol=1,bg='white',cex=0.75)
text(20,max(full.premia),paste("Joint Statistic :",round(full.joint.test,4)),cex=0.75)
text(20,max(full.premia)-0.02,paste("P-value :",round(1-pchisq(full.joint.test,df=24),4)),cex=0.75)

par(mai=c(0.55,0.50,0.2,0.25))
plot(ylab='P-Values',xlab='Dates',cex.main=0.85,cex.lab=0.75,cex.axis=0.75,main='Risk Premium P-values',ylim=c(min(full.pval),max(full.pval)),full.pval[,1],col='red',type='l')
for(i in 2:ncol(macrofactors)){
 lines(full.pval[,i],col=col.scheme[i-1])
}
abline(h=0.05,col='green',lwd=2)
abline(h=0.10,col='gold',lwd=2)
#####################################################################################

I have summarised the time series estimates across the 25 randomly selected stocks using ggplot2 for a pleasant change:

[Figure: ggplot2 bar charts of full-sample time series estimates per asset]

[Table: full-sample time series estimates per asset]

I have also extracted the p-values for the estimated risk exposures based on the full sample. It appears that only the MKT (excess market return) factor is consistently significant at the 5% level:

[Figure: ggplot2 bar charts of full-sample time series p-values per asset]

[Table: full-sample time series p-values per asset]

As you can see, I have also attached the tabulated values corresponding to the data plotted in both ggplots. The green fields in the second table correspond to significant estimates at the 5% level and corroborate the conclusions reached from the collection of bar charts above.

Finally I have also plotted the second stage risk premia estimates (collected across time) as follows :

[Figure: full-sample risk premia estimates and p-values across time]

The top plot shows how risk premia change across time for each previously estimated factor exposure. I have also computed the joint test along with its p-value (located in the top left corner of the plot). The null hypothesis, of pricing errors being jointly equal to 0, cannot be rejected… I think. The bottom plot simply shows the associated p-values and significance at the 5% and 10% levels. When p-values dip below the horizontal line(s), we can conclude statistical significance at the corresponding level.

As usual, take these posts with a grain of salt.

In part (b) of the Macroeconomic factor model post, I shall first deal with the Fama/MacBeth methodology in the context of a rolling window of fixed size. I have mentioned previously that this experiment involves a simplified version of the estimation procedure whereby assets are not sorted into equally sized portfolios based on pre-ranking betas, a step in the process originally intended to reduce the error-in-variables problem. The error-in-variables problem arises from the use of estimated betas (obtained in the time series regressions) as independent variables in the subsequent cross section regression (which constitutes the second step of the FB procedure). I have only recently discovered the function rollapply, which automatically rolls windows forward through a dataset, hence the code will be quite a bit different from that in the earlier Fama-MacBeth post, where I did account for the error-in-variables problem by grouping assets into 20 portfolios according to their pre-ranking betas. Before making a distinction between the rolling and the full sample analysis, it makes sense to first set up the environment, define custom functions and extract the requisite data subsets.
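
As a quick illustration of how rollapply works (a minimal sketch on simulated data; the toy series, window width and step size below are made up for demonstration and are not part of the analysis):

library(zoo)

#toy two-column series: a return and a single factor
set.seed(1)
toy.data <- zoo(cbind(ret=rnorm(192),fac=rnorm(192)))

#regress ret on fac over a 60-observation window, rolling the window
#forward 12 observations at a time and keeping only the slope coefficient
roll.betas <- rollapply(toy.data,width=60,by=12,align='right',by.column=F,
                        FUN=function(x) coef(lm(x[,1]~x[,2]))[2])

Setting by.column=F hands the whole window to FUN rather than one column at a time, which is what allows a regression to be run inside the rolling window.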


[ Set up the libraries, custom functions & datasets ]

The usual (and somewhat dated) TableMaker function makes a comeback, allowing us to 'plot' tabulated data in a specific way. I have also defined a new custom function called RegExtractor which is used to extract regression estimates, standard errors, t-values, p-values, R-squared values and residuals from a list object of regressions.

setwd("C:/University of Warwick/R experiments/Asset Pricing")

library(quantmod)
library(tseries)
library(timeSeries)
library(MASS)
library(ggplot2)
library(reshape)

#######################################################################################
#Table Maker
#######################################################################################

TableMaker <- function(row.h,core,header,strip,strip.col,col.cut,alpha,border.col,text.col,header.bcol,header.tcol,title)
{
	rows <- nrow(core)+1
	cols <- ncol(core)
  colours <- adjustcolor(col=strip.col,alpha.f=alpha)
  idx <- 1

	plot.new()
  plot.window(xlim=c(0,cols*11),ylim=c(0,rows*7))
  mtext(title,side=3,cex=0.7)

	for(i in 1:cols){
		for(j in 1:rows){
			if(j==1){
       rect((i-1)*11,rows*7-7,11*i,rows*7,density=NA,col=header.bcol,lty=1,border=border.col)
       text(((i-1)*11+11*i)/2,((rows*7-7)+rows*7)/2,header[i],cex=0.8,col=header.tcol,font=1)
			}
			else if((j>1) && (row.h==0))
      {
       rect((i-1)*11,(j-1)*7-7,11*i,(j-1)*7,density=NA,col=ifelse(strip,ifelse(core[j-1,i]<col.cut,colours[1],colours[2]),'white'),lwd=1.0,border=border.col)
       text(((i-1)*11+11*i)/2,(((j-1)*7-7)+(j-1)*7)/2,core[j-1,i],col='black',cex=0.75,font=1)
      }
			else if((j>1) && (row.h!=0) && (i==row.h[idx]))
      {
       rect((i-1)*11,(j-1)*7-7,11*i,(j-1)*7,density=NA,col=border.col,lwd=1.0,border=border.col)
       text(((i-1)*11+11*i)/2,(((j-1)*7-7)+(j-1)*7)/2,core[j-1,i],col='black',cex=0.75,font=1)
			}
			else if((j>1) && (row.h!=0) && (i!=row.h[idx]))
      {
       rect((i-1)*11,(j-1)*7-7,11*i,(j-1)*7,density=NA,col=ifelse(strip,ifelse(core[j-1,i]<col.cut,colours[1],colours[2]),'white'),lwd=1.0,border=border.col)
       text(((i-1)*11+11*i)/2,(((j-1)*7-7)+(j-1)*7)/2,core[j-1,i],col='black',cex=0.75,font=1)
      }
		}
     ifelse(idx<length(row.h),idx<-idx+1,idx<-length(row.h))

	}
}
#########################################################################################

#########################################################################################
#Extract Regression Values
#########################################################################################
RegExtractor <- function(x,type)
{
	if(type=='est'){
	 output <- t(sapply(x,function(v)coef(summary(v))[,1]))
	}
	else if (type=='stderr'){
	 output <- t(sapply(x,function(v)coef(summary(v))[,2]))
	}
  else if(type=='tval'){
   output <- t(sapply(x,function(v)coef(summary(v))[,3]))
  }
	else if(type=='pval'){
	 output <- t(sapply(x,function(v)coef(summary(v))[,4]))
	}
	else if(type=='rsq'){
	 output <- t(sapply(x,function(v)(summary(v))$r.squared))
	}
	else if(type=='res'){
	 output <- sapply(x,function(v)(resid(v)))
	}
	return(output)
}
#########################################################################################
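
A quick usage sketch for RegExtractor (simulated data, purely for illustration):

#fit a couple of toy regressions and store them in a list
set.seed(1)
x <- rnorm(100)
reg.list <- list(lm(rnorm(100)~x),lm(rnorm(100)~x))

RegExtractor(reg.list,type='est')  #2x2 matrix: intercept & slope per regression
RegExtractor(reg.list,type='pval') #matching p-values
RegExtractor(reg.list,type='res')  #100x2 matrix of residuals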


The data used in this experiment involves: [1] the macrofactors defined earlier; [2] the excess return on the SP500 index; [3] excess returns on a random sample of 25 stocks. These datasets are subset so that they cover a total period spanning Jan-1995 to Dec-2010.

#########################################################################################
#Data import & handling
#########################################################################################

#Import CSV file(s)
macrofactors<- read.csv('macrofactors.csv',header=T)
load(file='SP500 components.RData')
ff.df <- read.csv('Factor_data.csv',header=T)
ff.tot.data <- ts(data=ff.df,start=c(1926,7),end=c(2012,12),frequency=12)/100

monthly.ret <- prices.vect$monthly.ret
ret.data <- monthly.ret[,sample(ncol(monthly.ret),25)] #draw 25 stocks at random
asset.names <- colnames(ret.data)

#Subset data to Jan-1995 through Dec-2010
sub.c.1 <- c(1995,1)
sub.c.2 <- c(2010,12)
macrofactors.data <-ts(data=macrofactors,start=sub.c.1,end=sub.c.2,frequency=12)
test.assets <-ts(data=ret.data,start=sub.c.1,end=sub.c.2,frequency=12)
rf.ret <- window(ff.tot.data[,4],start=sub.c.1,end=sub.c.2)
exc.mkt <- window(ff.tot.data[,1],start=sub.c.1,end=sub.c.2)
mkt <- exc.mkt
macrofactors.data <- cbind(macrofactors.data,mkt)

test.assets <- test.assets-rf.ret
colnames(macrofactors.data) <- c(colnames(macrofactors),'MKT')
timestamp <- seq(as.Date('1995-01-01'),as.Date('2010-12-31'),by='months')
#########################################################################################

Having set up the environment, defined the functions and obtained the necessary data subsets, let's first apply the FB procedure in the context of a rolling window analysis.


[ Rolling Regression Analysis ]

In the context of a rolling analysis, the FB procedure can be summarised as follows:

[1] Run a time series regression of asset (i) on the previously defined factors on the basis of a window of size 60 (60 monthly observations are used for each regression) and store the results (estimates, p-values) in list objects before rolling the window of fixed length forward by 1 year.

[2] Run a cross section regression for each month across the 25 assets on the betas estimated in step [1]. These regressions are run on the year of observations separating the end of the previous window and the start of the next.

[3] Save the time series of premia estimates and residuals from step [2], take means and test for significance.


###################################################################################
# Rolling Regression
###################################################################################
#timeseries regressions
roll.reg <- list()
pval.reg <- list()
window.end <- NULL

for(i in 1:ncol(test.assets)){
	data.mat=cbind(test.assets[,i],macrofactors.data)
  roll.reg[[i]]=matrix(na.remove(rollapply(data.mat,width=60,FUN=function(x) coef(lm(x[,1]~x[,-1])),by.column=F,align='right',by=12)),ncol=7)
  colnames(roll.reg[[i]])=c('Intercept',colnames(macrofactors.data))
	pval.reg[[i]]=matrix(na.remove(rollapply(data.mat,width=60,FUN=function(z) coef(summary(lm(z[,1]~z[,-1])))[,4],by.column=F,align='right',by=12)),ncol=7)
  colnames(pval.reg[[i]])=c('Intercept',colnames(macrofactors.data))
}

#visualise regression windows
window.idx <- as.character(1:12)
window.st <- as.character(timestamp[seq(1,133,12)])
window.end <- as.character(timestamp[seq(60,192,12)])
window.diff <- abs(timestamp[seq(1,133,12)]-timestamp[seq(60,192,12)])/365
window.tot <- cbind(window.idx,window.st,window.end,window.diff)
colnames(window.tot) <- c('Regression window','Start','End','Years of observations')

windows()
windows.tab <- window.tot
		windows.tab <- apply(windows.tab, 2, rev)
		par(mai=c(0.5,0.35,0.5,0.35))
		TableMaker(row.h=1,windows.tab,colnames(window.tot),strip=F,strip.col=c('green','red'),col.cut=0.05,alpha=0.7,border.col='lightgrey',text.col='black',header.bcol='gold',header.tcol='black',title=c('5 Year Regression Windows For Timeseries Regressions'))

#crosssection regression
cross.betas <-list()
cross.ret <-list()
cross.reg <- list()

ret.idx <- matrix(c(seq(from=61,to=nrow(test.assets),by=12),seq(from=72,to=nrow(test.assets),by=12)),ncol=2,byrow=F)
for(i in 1:nrow(ret.idx)){
	cross.ret[[i]]=t(test.assets[ret.idx[i,1]:ret.idx[i,2],])
}

for(i in 1:nrow(roll.reg[[1]])){
	cross.betas[[i]]=roll.reg[[1]][i,-1] #start with asset 1's betas for window i
	for(j in 2:ncol(test.assets)){
		cross.betas[[i]]=rbind(cross.betas[[i]],roll.reg[[j]][i,-1])
  }
}

#Computing t-stats
intercept.ts<-NULL
riskpremia.ts<-NULL
resid.ts<-NULL

for(i in 1:nrow(ret.idx)){
	cross.reg[[i]]=lm(cross.ret[[i]]~cross.betas[[i]])
  intercept.ts=rbind(intercept.ts,as.matrix(coef(cross.reg[[i]])[1,]))
	riskpremia.ts=cbind(riskpremia.ts,coef(cross.reg[[i]])[-1,])
	resid.ts=cbind(resid.ts,resid(cross.reg[[i]]))
}

intercept.mean<-mean(intercept.ts)
riskpremia.mean<-rowMeans(riskpremia.ts)
resid.mean<-rowMeans(resid.ts)

num.t<-nrow(intercept.ts)
intercept.std <- apply(intercept.ts,2,sd)
riskpremia.std <- apply(riskpremia.ts,1,sd)
resid.std <- apply(resid.ts,1,sd)

tstat.intercept <- (intercept.mean/(intercept.std/(num.t^0.5)))
tstat.premia <- (riskpremia.mean/(riskpremia.std/(num.t^0.5)))
tstats.avg <- round(as.matrix(c(tstat.intercept,tstat.premia)),4)
rownames(tstats.avg) <- c('Intercept',colnames(macrofactors.data))
colnames(tstats.avg) <- 'T-Test'

#Visualise
prem <- t(matrix(riskpremia.ts,nrow=6))
prem <- round(cbind(intercept.ts,prem),4)
beg.seq <- seq(1,121,by=12)
end.seq <- seq(12,132,by=12)
lhs <- rev(c('Intercept',colnames(macrofactors.data)))
cols <- rainbow(20)[10:16]

png(file="premia%02d.png",width=550,height=500)
for(i in 1:length(beg.seq)){
prem.tab <- cbind(t(prem[(beg.seq[i]:end.seq[i]),]),tstats.avg)
		prem.tab <- apply(prem.tab,2,rev)
    prem.tab <- cbind((lhs),prem.tab)
   	layout(matrix(c(1,1,2),ncol=1))
    par(mai=c(0.30,0.55,0.5,0.3))
    plot(xlim=c(-3,12),ylab='Estimates',cex.main=0.85,cex.lab=0.75,cex.axis=0.75,main=paste('12 month forward crossectional regression estimates\n using data for ',timestamp[ret.idx[i,1]],'to',timestamp[ret.idx[i,2]]),prem[(beg.seq[i]:end.seq[i]),1],type='l',lwd=1.75,col='red')
	  for(j in 2:7){
	  	lines(prem[(beg.seq[i]:end.seq[i]),j],lwd=1.75,col=cols[j])
	  }
	  abline(v=0,col='black',lwd=1)
	  legend(bty='n',y.intersp=1,'topleft',fill=c('red',cols[2:7]),legend=c('INT - Intercept','MP - Chg. Industrial Production','UI - Unanticipated Infl','DEI - Chg.Expected Infl','UTS - Unanticipated ret on LT Bonds','UPR - BAA minus AAA yield','MKT - Market Returns'),ncol=1,bg='white',cex=0.75)
	  par(mai=c(0.15,0.25,0.10,0.05))
	  TableMaker(row.h=1,prem.tab,c(paste('Window:\n',i),month.name,'T-Test'),strip=F,strip.col=c('green','blue'),col.cut=0.05,alpha=0.7,border.col='lightgrey',text.col='black',header.bcol='blue',header.tcol='white',title=c(''))
}
dev.off()
shell("convert -delay 10 *.png premia.gif")

For convenience I have summarised the regression windows and the associated beginning and end periods using the TableMaker function as follows:

[Table: 5-year regression windows for the time series regressions]

I have also animated the risk premia per regression window for each factor (essentially the time series of risk premia obtained in step [2]) as well as the t-statistics for the averaged values of the entire series. I finally figured out why some of my previous gifs did not animate as expected; apparently one has to make sure that the size of the gif is about 500 and not resize the gif within WordPress (i.e. use the full size option when importing it). Although I do wish that the quality of the gif was a bit better… perhaps I am doing something wrong.

[Animation: rolling risk premia estimates per regression window]

The next post will apply the same methodology to the full sample rather than to rolling 60 month windows. From the last column of the table above, it appears that the intercept, UI and DEI premia are statistically significant at the 10% and 5% levels respectively.

The ability of investors to manage a portfolio of assets that are less than perfectly correlated with one another – thus conferring diversification benefits on total investment risk for a given level of portfolio return – is consistent with the focus of modern financial theory in seeking out pervasive (systematic) influences as the likely source of priced undiversifiable risk. The general conclusion of theory is that capital markets should only award investors an additional return component if they are exposed to a set of risk factors whose influence on total risk cannot be mitigated by holding a diversified set of assets. On balance, the stridency with which financial theories expound their conclusions is only matched by their relative timidity in identifying the exogenous influences, the economic state variables, that underlie comovements of asset prices. The objective of this post is to apply a simplified version of the Chen, Roll & Ross (1986) methodology (effectively a Fama-MacBeth procedure) to a random sample of 25 stocks on the SP500 index. The authors begin by identifying macroeconomic variables that should systematically affect stock returns and proceed to test whether surprises (innovations) in these factors represent risks that are rewarded by the market.

To ascertain whether the identified economic state variables are related to the underlying factors that explain asset pricing, a version of the Fama-MacBeth methodology shall be employed.

[ The model and its data ]

The macroeconomic factor model is specified as follows:

$$r_{i,t} = a_i + \beta_i' f_t + \varepsilon_{i,t}$$

with the independent variables on the right hand side being factor surprises or innovations, defined according to

$$f_t = F_t - E[F_t \mid \mathcal{I}_{t-1}]$$

and $F_t$ being the macroeconomic variables of interest. According to the original authors, the following economic variables should systematically affect stock returns: [1] spread between long/short rates, [2] change in expected inflation, [3] unexpected inflation, [4] industrial production, [5] spread between high and low grade bonds.
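
The dataset I use below comes with the surprises pre-built, but to make the innovation equation concrete, here is one possible construction (a sketch only; Chen, Roll & Ross use more careful, variable-specific definitions): treat the residuals of a fitted AR(1) as the unanticipated component of a series.

#one possible construction of a factor surprise (illustration only)
set.seed(1)
raw.factor <- arima.sim(model=list(ar=0.5),n=200) #toy persistent macro series

ar.fit <- arima(raw.factor,order=c(1,0,0))
factor.surprise <- residuals(ar.fit) #F_t minus its expectation given t-1 information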

While I have neither access to databases that contain these datasets nor the patience to scrape individual variables separately from the web (if that is even possible), I do have access to Google and the requisite patience to find entire datasets (usually) provided by university professors on their course websites. This time it is from Hong Kong, of all places. I have summarised the variable names used throughout the code for convenience:

[Table: variable names and definitions]

In terms of test assets used on the left hand side, I simply took a random sample of 25 stocks from the SP500 index and computed their returns. I have previously saved price/return data for stocks on the SP500 in an RData file, hence I shall simply take my sample from there instead of downloading new data.

[ Estimation procedure – Fama/MacBeth ]

[1] Time series regression of stock returns on unanticipated changes in macroeconomic variables to yield asset exposures to the risk factors:

$$r_{i,t} = a_i + \beta_i' f_t + \varepsilon_{i,t}, \qquad t = 1,\dots,T \text{ for each asset } i$$

[2] In each month, a cross section regression of stock returns on the previously estimated asset exposures, yielding a time series of risk premia and pricing errors:

$$r_{i,t} = \lambda_{0,t} + \hat\beta_i' \lambda_t + \alpha_{i,t}, \qquad i = 1,\dots,N \text{ for each month } t$$

[3] Take the mean of these time series and test for significance:

$$\hat\lambda = \frac{1}{T}\sum_{t=1}^{T}\hat\lambda_t, \qquad t(\hat\lambda) = \frac{\hat\lambda}{\sigma(\hat\lambda_t)/\sqrt{T}}$$

[4] Joint test:

$$\hat\alpha' \, \widehat{\mathrm{cov}}(\hat\alpha)^{-1} \, \hat\alpha \sim \chi^2$$

with

$$\widehat{\mathrm{cov}}(\hat\alpha) = \frac{1}{T^2}\sum_{t=1}^{T}(\hat\alpha_t - \hat\alpha)(\hat\alpha_t - \hat\alpha)', \qquad \hat\alpha = \frac{1}{T}\sum_{t=1}^{T}\hat\alpha_t$$
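
To make steps [3] and [4] concrete, here is a minimal sketch on simulated inputs (the dimensions T.per, N and K are arbitrary and the premia/pricing errors are random noise; the real computation appears in the full sample code earlier on this page):

library(MASS) #for ginv

set.seed(1)
T.per <- 120; N <- 25; K <- 6
lambda.t <- matrix(rnorm(T.per*K),ncol=K) #per-month premia estimates
alpha.t <- matrix(rnorm(T.per*N),ncol=N)  #per-month pricing errors

#step [3]: mean premia and t-statistics
lambda.bar <- colMeans(lambda.t)
lambda.se <- apply(lambda.t,2,sd)/sqrt(T.per)
t.stats <- lambda.bar/lambda.se

#step [4]: joint test that mean pricing errors are zero
alpha.bar <- colMeans(alpha.t)
cov.alpha <- crossprod(sweep(alpha.t,2,alpha.bar))/T.per^2
joint.stat <- t(alpha.bar)%*%ginv(cov.alpha)%*%alpha.bar #compare to a chi-squared critical value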

The above procedure can also be replicated for rolling windows, in which case 5 years' worth of data are used in the initial time series regression and the cross section regressions are applied to each of the 12 months in the subsequent year before the window of fixed size is rolled forward and the process repeated. As with all posts on this blog, these are mere experiments, very likely erroneous in some way and not meant for replication.

Fama and MacBeth (1973) suggest an alternative procedure for running cross-sectional regressions. The first step of the procedure is identical to the time series regression performed earlier, where excess asset returns were regressed against excess market returns. The estimates of beta are then fed as inputs into a cross section regression, which constitutes the second step of the FMB process. The difference between the two-pass regression examined in the previous blog posts and the FMB procedure delineated here is that the latter estimates a cross-sectional regression at each time period while the former estimates a single cross-sectional regression with sample averages.
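
A minimal sketch of that distinction on simulated data (one factor, made-up dimensions, pure noise returns):

set.seed(1)
T.per <- 120; N <- 25
betas <- rnorm(N,mean=1,sd=0.3)               #first-pass betas
ret <- matrix(rnorm(T.per*N,mean=0.5),ncol=N) #T x N excess returns

#two-pass: a single cross section on time-averaged returns
twopass.lambda <- coef(lm(colMeans(ret)~betas))[2]

#Fama-MacBeth: one cross section per period, then average the slopes
fmb.lambdas <- apply(ret,1,function(r.t) coef(lm(r.t~betas))[2])
fmb.lambda <- mean(fmb.lambdas)
fmb.tstat <- fmb.lambda/(sd(fmb.lambdas)/sqrt(T.per))

With betas held fixed across periods the two point estimates coincide; the payoff of the per-period approach is the time series of slope estimates, from which the standard error above falls out directly.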

This post focuses on the application of the FMB methodology to the previous dataset.

[1] Estimate a beta for each of the 351 stocks on the basis of the first 48 months of observations in the data set.

[2] Rank order all of the stock betas and form 20 portfolios with assets appearing in descending order of systematic risk (see the breakpoint sketch after this list).

[3] Estimate a portfolio beta for each of the 20 portfolios by regressing portfolio monthly returns against excess market returns on the basis of the next 60 months' worth of observations.

[4] Estimate the ex-post SML for each month of the next 48 months' worth of data.

[5] Estimate the SML accounting for nonlinearity and residual variances for the same time period as step 4.

[6] Repeat steps 1-5 as often as possible across adjacent regression windows. Unfortunately in this case, due to data restrictions, I only have 2 regression windows before running out of data.

[7] Compute the mean value for each of the coefficients estimated in steps 4 and 5.
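
Since the breakpoints in step [2] look opaque in the code below, here is the arithmetic behind them (these seq calls match the FB$port.beg/FB$port.end vectors used there): 351 stocks are split into 20 portfolios, the first 17 holding 18 assets and the last 3 holding 15, since 17*18 + 3*15 = 351.

#breakpoints over the 351 beta-sorted stocks
port.beg <- c(seq(1,289,by=18),seq(307,337,by=15))
port.end <- c(seq(18,306,by=18),seq(321,351,by=15))
cbind(portfolio=1:20,first=port.beg,last=port.end,size=port.end-port.beg+1)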

The code for the FMB procedure follows:

#Fama MacBeth
FB <- list()

alpha.list <- NULL
beta.list <- NULL

nl.alpha.list <- NULL
nl.beta.list <- NULL
nl.betasq.list <- NULL

rv.alpha.list <- NULL
rv.beta.list <- NULL
rv.betasq.list <- NULL
rv.res.list <- NULL

#observation windows: beta estimation, portfolio formation, SML estimation
FB$windows1.loc.beg <- c(1,49)
FB$windows1.loc.end <- c(48,96)

FB$windows2.loc.beg <- c(49,97)
FB$windows2.loc.end <- c(108,156)

FB$windows3.loc.beg <- c(109,157)
FB$windows3.loc.end <- c(156,204)

#portfolio breakpoints: 17 portfolios of 18 assets, 3 of 15
FB$port.beg <- c(seq(1,289,by=18),seq(307,337,by=15))
FB$port.end <- c(seq(18,306,by=18),seq(321,351,by=15))

FB$num.windows <- 2

FB$windows <- list()

for(i in 1:FB$num.windows){
  FB$windows[[i]] <- list()
  FB$windows[[i]]$alpha.list <- NULL
  FB$windows[[i]]$beta.list <- NULL
  FB$windows[[i]]$portfolios <- NULL
  FB$windows[[i]]$portfolios.mean <- NULL
  FB$windows[[i]]$portfolio.mat <- NULL
  FB$windows[[i]]$port.fit <- NULL
  FB$windows[[i]]$portfolios.sml <- NULL
  FB$windows[[i]]$portfolios.mean.sml <- NULL
  FB$windows[[i]]$portfolio.mat.sml <- NULL

  #step [1]: first-pass time series regression per stock
  for(j in 1:ncol(stock.ret)){
    FB$windows[[i]]$fit[[j]] <- lm(ex.ret[FB$windows1.loc.beg[i]:FB$windows1.loc.end[i],j]~exm.ret[FB$windows1.loc.beg[i]:FB$windows1.loc.end[i]])
    FB$windows[[i]]$fit[[j]]$alpha <- coef(FB$windows[[i]]$fit[[j]])[1]
    FB$windows[[i]]$fit[[j]]$beta <- coef(FB$windows[[i]]$fit[[j]])[2]

    FB$windows[[i]]$alpha.list <- rbind(FB$windows[[i]]$alpha.list,FB$windows[[i]]$fit[[j]]$alpha)
    FB$windows[[i]]$beta.list <- rbind(FB$windows[[i]]$beta.list,FB$windows[[i]]$fit[[j]]$beta)

    FB$windows[[i]]$fit[[j]]$alpha.p <- summary(FB$windows[[i]]$fit[[j]])[[4]][7]
    FB$windows[[i]]$fit[[j]]$beta.p <- summary(FB$windows[[i]]$fit[[j]])[[4]][8]

    FB$windows[[i]]$fit[[j]]$rsquared <- summary(FB$windows[[i]]$fit[[j]])[[9]]
  }

  rownames(FB$windows[[i]]$alpha.list) <- colnames(monthly.ret)
  rownames(FB$windows[[i]]$beta.list) <- colnames(monthly.ret)
  colnames(FB$windows[[i]]$alpha.list) <- 'alphas'
  colnames(FB$windows[[i]]$beta.list) <- 'betas'

  #step [2]: sort stocks by estimated beta (descending) and form 20 portfolios
  FB$windows[[i]]$matrix <- cbind(FB$windows[[i]]$alpha.list,FB$windows[[i]]$beta.list,t(stock.ret[(FB$windows2.loc.beg[i]):(FB$windows2.loc.end[i]),1:351]))
  FB$windows[[i]]$matrix.sorted <- FB$windows[[i]]$matrix[order(FB$windows[[i]]$matrix[,2],decreasing=TRUE),]

  for(k in 1:20){
    FB$windows[[i]]$portfolios[[k]] <- FB$windows[[i]]$matrix.sorted[FB$port.beg[k]:FB$port.end[k],]
    FB$windows[[i]]$portfolios.mean[[k]] <- matrix(colMeans(FB$windows[[i]]$portfolios[[k]][,3:62]),ncol=1)
    FB$windows[[i]]$portfolio.mat <- cbind(FB$windows[[i]]$portfolio.mat,FB$windows[[i]]$portfolios.mean[[k]])
  }

  #step [3]: portfolio betas over the next 60 months
  colnames(FB$windows[[i]]$portfolio.mat) <- paste('Portfolio ',1:20)
  FB$windows[[i]]$port.fit <- lm(FB$windows[[i]]$portfolio.mat~exm.ret[FB$windows2.loc.beg[i]:FB$windows2.loc.end[i],1])
  FB$windows[[i]]$port.betas <- matrix(coef(FB$windows[[i]]$port.fit)[2,],ncol=1)

  FB$windows[[i]]$matrix.sml <- cbind(FB$windows[[i]]$alpha.list,FB$windows[[i]]$beta.list,t(stock.ret[(FB$windows3.loc.beg[i]):(FB$windows3.loc.end[i]),1:351]))
  FB$windows[[i]]$matrix.sorted.sml <- FB$windows[[i]]$matrix.sml[order(FB$windows[[i]]$matrix.sml[,2],decreasing=TRUE),]

  for(k in 1:20){
    FB$windows[[i]]$portfolios.sml[[k]] <- FB$windows[[i]]$matrix.sorted.sml[FB$port.beg[k]:FB$port.end[k],]
    FB$windows[[i]]$portfolios.mean.sml[[k]] <- matrix(colMeans(FB$windows[[i]]$portfolios.sml[[k]][,3:50]),ncol=1)
    FB$windows[[i]]$portfolio.mat.sml <- cbind(FB$windows[[i]]$portfolio.mat.sml,FB$windows[[i]]$portfolios.mean.sml[[k]])
  }
  FB$windows[[i]]$portfolio.mat.sml <- t(FB$windows[[i]]$portfolio.mat.sml)

  #steps [4]-[5]: ex-post SML, then with beta-squared and residual variance terms
  FB$windows[[i]]$portfolio.sml.fit <- lm(FB$windows[[i]]$portfolio.mat.sml~FB$windows[[i]]$port.betas)
  sq <- ((FB$windows[[i]]$port.betas)^2)
  resd <- apply(resid(FB$windows[[i]]$port.fit),2,sd)^2

  FB$windows[[i]]$portfolio.nonl.fit <- lm(FB$windows[[i]]$portfolio.mat.sml~FB$windows[[i]]$port.betas+sq)
  FB$windows[[i]]$portfolio.rv.fit <- lm(FB$windows[[i]]$portfolio.mat.sml~FB$windows[[i]]$port.betas+sq+resd)

  #step [7]: collect the coefficient time series and their means
  alpha.list <- rbind(alpha.list,matrix(ncol=1,coef(FB$windows[[i]]$portfolio.sml.fit)[1,]))
  alpha.mean <- mean(alpha.list)

  beta.list <- rbind(beta.list,matrix(ncol=1,coef(FB$windows[[i]]$portfolio.sml.fit)[2,]))
  beta.mean <- mean(beta.list)

  nl.alpha.list <- rbind(nl.alpha.list,matrix(ncol=1,coef(FB$windows[[i]]$portfolio.nonl.fit)[1,]))
  nl.alpha.mean <- mean(nl.alpha.list)

  nl.beta.list <- rbind(nl.beta.list,matrix(ncol=1,coef(FB$windows[[i]]$portfolio.nonl.fit)[2,]))
  nl.beta.mean <- mean(nl.beta.list)

  nl.betasq.list <- rbind(nl.betasq.list,matrix(ncol=1,coef(FB$windows[[i]]$portfolio.nonl.fit)[3,]))
  nl.betasq.mean <- mean(nl.betasq.list)

  rv.alpha.list <- rbind(rv.alpha.list,matrix(ncol=1,coef(FB$windows[[i]]$portfolio.rv.fit)[1,]))
  rv.alpha.mean <- mean(rv.alpha.list)

  rv.beta.list <- rbind(rv.beta.list,matrix(ncol=1,coef(FB$windows[[i]]$portfolio.rv.fit)[2,]))
  rv.beta.mean <- mean(rv.beta.list)

  rv.betasq.list <- rbind(rv.betasq.list,matrix(ncol=1,coef(FB$windows[[i]]$portfolio.rv.fit)[3,]))
  rv.betasq.mean <- mean(rv.betasq.list)

  rv.res.list <- rbind(rv.res.list,matrix(ncol=1,coef(FB$windows[[i]]$portfolio.rv.fit)[4,]))
  rv.res.mean <- mean(rv.res.list)
}

Clearly the BJS and FMB procedures have been hardcoded for the current example rather than generalised for any arbitrary dataset.

To visualise some of the results:

#Visualise FB
#NB: the assignments in this block were partially lost; what follows is a
#plausible reconstruction of the tables fed to TableMaker
port.names <- NULL
for(k in 1:17){
  port.names <- c(port.names,paste('Portfolio',k,'[18 assets]'))
}

for(k in 18:20){
  port.names <- c(port.names,paste('Portfolio',k,'[15 assets]'))
}

#portfolio composition in window 1: one row of member tickers per portfolio
tab.port <- NULL
for(k in 1:20){
  rn <- rownames(FB$windows[[1]]$portfolios[[k]])
  tab.port <- rbind(tab.port,c(rn,rep('',18-length(rn))))
}
tab.port <- cbind(port.names,tab.port)
annot.port <- c('Portfolio',paste('Asset',1:18))
windows()
layout(matrix(c(1,1,1,2,2),nrow=5,ncol=1))

par(mai=c(0,0,0.2,0))
TableMaker(row.h=c(1),apply(tab.port,2,rev),annot.port,strip=F,strip.col=c('red','green'),col.cut=0,alpha=0.6,border.col='lightgrey',text.col='black',header.bcol='gold',header.tcol='black',title='Portfolios in window 1')

par(mai=c(0.55,0.55,0.2,0.2))
colours <- rainbow(20) #one colour per portfolio (reconstructed assignment)
plot(xlab='months',ylab='portfolio mean returns',main='Mean portfolio returns for next 5 years',cex.main=0.85,cex.lab=0.8,cex.axis=0.8,as.ts(FB$windows[[1]]$portfolios.mean[[1]]),col=colours[1])
for(i in 2:20){
	lines(FB$windows[[1]]$portfolios.mean[[i]],col=colours[i])
}
legend('top',fill=colours,legend=paste('Portfolio',1:20),ncol=10,bg='white',bty='n',cex=0.6)

#NB: these assignments were also lost; reconstructed coefficient tables,
#one per regression window (48 months each)
annot <- c('Month','Alpha','Beta','NL-Alpha','NL-Beta','NL-BetaSq','RV-Alpha','RV-Beta','RV-BetaSq','RV-Resid')
r1 <- 1:48
tab.com1 <- cbind(r1,round(cbind(alpha.list[r1],beta.list[r1],nl.alpha.list[r1],nl.beta.list[r1],nl.betasq.list[r1],rv.alpha.list[r1],rv.beta.list[r1],rv.betasq.list[r1],rv.res.list[r1]),4))
tab.com2 <- cbind(r1,round(cbind(alpha.list[r1+48],beta.list[r1+48],nl.alpha.list[r1+48],nl.beta.list[r1+48],nl.betasq.list[r1+48],rv.alpha.list[r1+48],rv.beta.list[r1+48],rv.betasq.list[r1+48],rv.res.list[r1+48]),4))
windows()
par(mai=c(0,0,0,0))
TableMaker(row.h=c(1),apply(tab.com1,2,rev),annot,strip=F,strip.col=c('red','green'),col.cut=0,alpha=0.6,border.col='white',text.col='black',header.bcol='light blue',header.tcol='black',title='Portfolios in window 1')

windows()
par(mai=c(0,0,0,0))
TableMaker(row.h=c(1),apply(tab.com2,2,rev),annot,strip=F,strip.col=c('red','green'),col.cut=0,alpha=0.6,border.col='white',text.col='black',header.bcol='light blue',header.tcol='black',title='Portfolios in window 2')

To get a sense of what the 20 beta-sorted portfolios look like, I have plotted their composition in the following table. Rows signify portfolios, with lower numbered portfolios corresponding to collections of assets with higher betas and higher numbered portfolios corresponding to collections of assets with lower betas. Since we have 351 assets in our dataset, we cannot evenly divide them into 20 equally sized portfolios. As a consequence, the first 17 portfolios have 18 assets while the last three portfolios have 15 assets in order to, quite arbitrarily, cover the total of 351 assets. The data plotted here is for the first regression window, covering the first 48 months of observations. The next and final regression window would cover the next 48 months of data (not shown here). The line chart at the bottom of the plot window tracks the average return of each of the 20 portfolios over the next 60 months in the dataset. This corresponds to step 3 of the FMB process outlined above.

[Figure: portfolio composition table and mean portfolio returns]

To get a sense of the coefficient estimates from steps 4-6, the following table (split into two plots due to plotting space constraints) summarises the estimated parameters for 3 models across two regression windows that total 96 months of data (48 months of observations per window).

[Table: coefficient estimates for the three models across both windows]

The last rows of the table reflect column averages and associated p-values, which taken together should inform CAPM consistency. The only time we can reject the null hypothesis is with respect to residual variances in the last model, such that statistical insignificance is to be inferred for all other figures. Alphas being statistically indistinguishable from 0 at traditional levels of significance is a point in favour of the CAPM. Betas being insignificant, and hence unhelpful in explaining portfolio returns, constitutes a departure from theory. Nonlinearity does not seem to be an issue, but residual variance appears to matter greatly. On balance, the CAPM does not seem to hold in terms of its implications for beta and residual variances but appears somewhat sensible with respect to alphas and nonlinearity.