Tag Archives: time series

According to classic market efficiency, stock prices follow a random walk and hence financial returns, which are essentially percentage price changes, are unpredictable. If this view of the world is true, then one would expect to see b = 0 and R-squared = 0 for any variable x(t) in the simple regression R(t+1) = a + b*x(t) + e(t+1). Intuitively, competition in stock markets should drive out any predictable movement in stock prices. If b were greater than 0, individual investors should buy when x(t) is high and sell when x(t) is low. However, if every market participant adhered to this logic, prices would be driven up (down) on the basis of a high (low) x(t) until today's price equalled the price expected in the next period. The more competitive an asset market, the greater the degree of informational efficiency, the greater the speed at which pertinent material information is impounded in today's stock price, and the lower the likelihood of predicting stock price movements.
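As a quick illustration, here is a minimal sketch using simulated data (not the post's dataset): if log prices follow a random walk, regressing next-period returns on a lagged signal x(t), here simply the lagged return, should produce a slope and R-squared close to zero.

#####################################################################################
#Sketch: predictive regression under a simulated random walk
#####################################################################################
set.seed(42)
n <- 1000
log.p <- cumsum(rnorm(n,sd=0.05)) #log prices follow a random walk
r <- diff(log.p)                  #returns are log price changes
x <- r[-length(r)]                #signal x(t): the lagged return
y <- r[-1]                        #next-period return R(t+1)
rw.reg <- lm(y~x)
coef(summary(rw.reg))             #slope statistically indistinguishable from 0
summary(rw.reg)$r.squared         #essentially zero
#####################################################################################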

Implicit in these statements is a distinction between individual and equilibrium perspectives. Assuming b > 0, an individual would take a positive signal x as an indication to buy the stock and reap subsequent returns. If everybody followed this same logic, stock prices would be bid up until current and expected prices are equal and the incentive for further action is eliminated. An asset for which b > 0 therefore cannot describe a market in equilibrium. It is tempting but erroneous to conclude that predictability implies inefficient markets. Nothing in the definition of market efficiency requires that expected returns remain constant over time. In fact, predictability is equivalent to variation in expected returns. Consider a regression model of excess stock returns on the dividend/price ratio over time:

R(t+1) = a + b*[D(t)/P(t)] + e(t+1)

Taking expectations, this means that the expected return at time t is:

E(t)[R(t+1)] = a + b*[D(t)/P(t)]

This implies that expected returns vary over time rather than remaining constant, as long as returns are predictable (b not equal to 0). As far as I understand this, if the regression produces a zero slope coefficient, then expected returns reduce to a constant (a). Hence, as long as the estimated slope coefficient is non-zero, one can link variation in expected returns to predictability of returns. As usual, these posts may be riddled with mistakes and misunderstandings on my part; far better to look at the sources for yourself: Discount Rates, Chicago Lectures.
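To make the equivalence concrete, here is a self-contained sketch with simulated data (the variable names are illustrative, not the post's dataset): the fitted values of the predictive regression are exactly the conditional expected returns a + b*x(t), so their standard deviation equals |b|*sd(x) and collapses to zero precisely when b = 0.

#####################################################################################
#Sketch: predictability = time-varying expected returns (simulated data)
#####################################################################################
set.seed(1)
dp.sim <- 0.04+0.01*arima.sim(list(ar=0.9),n=500) #persistent D/P-like signal
r.next <- 0.02+3*dp.sim+rnorm(500,sd=0.15)        #returns with true slope b=3
pred.reg <- lm(r.next~dp.sim)
sd(fitted(pred.reg))              #variation in expected returns
abs(coef(pred.reg)[2])*sd(dp.sim) #identical: sd(fitted) = |b|*sd(x)
#####################################################################################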

The first generation of tests suggested that basically nothing can forecast stock returns. The following section gives a sense of this from regressions of returns on lagged returns for: [a] CRSP stock data; [b] the T-bill rate; [c] excess returns. Observations are annual for the period from 1952 to 2008. The data come from the Chicago lecture problem sets linked above. Additional variables such as the D/P ratio, dividend growth and cay are also imported for later use.


#####################################################################################
#Data import & handling
#####################################################################################

#Import CSV file(s)
ret.csv <- read.csv('ps1_data.csv',header=T)
cay.csv <- read.csv('cay.csv',header=T)
ret.ts <- ts(data=ret.csv,start=c(1926),end=c(2011),frequency=1)
cay.ts <- ts(data=cay.csv,start=c(1952,1),end=c(2009,1),frequency=4)

#Subset data
row.idx <- 1:nrow(cay.ts)
ret.sub <- window(ret.ts,start=c(1952),end=c(2008),frequency=1)
cay.sub <- ts(cay.ts[row.idx%%4==0,],start=c(1952),end=c(2008),frequency=1)

#Save individual series
idx.ret <- ret.sub[,1]
dp <- ret.sub[,2]
dg <- ret.sub[,3]
tbill <- ret.sub[,4]
cay <- cay.sub[,4]
exc <- idx.ret-tbill

#Save log series
log.ret <- log(1+idx.ret)
log.dp <- log(dp)
log.dg <- log(1+dg)
rho <- 0.96 #log-linearisation constant
###################################################################################


Running regressions of stock, T-bill, and excess returns on their one-period lagged values:

###################################################################################
# Initial Regression & Test
###################################################################################

#Initial regressions on lagged values
idx.temp <- cbind(idx.ret,lag(idx.ret,-1))
idx.reg <- lm(idx.temp[,1]~idx.temp[,2])

tbill.temp <- cbind(tbill,lag(tbill,-1))
tbill.reg <- lm(tbill.temp[,1]~tbill.temp[,2])

exc.temp <- cbind(exc,lag(exc,-1))
exc.reg <- lm(exc.temp[,1]~exc.temp[,2])

#Extract regression values
reg.list <- list(idx.reg,tbill.reg,exc.reg)
estim <- RegExtractor(reg.list,'est')
pvals <- matrix(RegExtractor(reg.list,'pval'),ncol=2)
tvals <- matrix(RegExtractor(reg.list,'tval'),ncol=2)
rsq <- matrix(RegExtractor(reg.list,'rsq'),ncol=1)

#Variation of expected returns with the level of expected returns
std.vals <- sapply(reg.list,function(v) sd(fitted(v)))*100
mean.vals <- sapply(reg.list,function(v) mean(fitted(v)))*100

#Plots&Table
asset.names<-c('Returns','Tbill','Excess')
col.names<-c('Intercept','Beta','t-stats','p-vals','R2','E(R)','Std(Fitted)')
reg.table=cbind(estim,(tvals[,2]),(pvals[,2]),rsq,mean.vals,std.vals)

windows()
layout(matrix(c(1,1,2),nrow=3))
plot(idx.ret,ylab='Returns',xlab='',cex.main=0.85,cex.lab=0.75,cex.axis=0.75,main='Timeseries plots',col='darkgreen',type='l',lwd=2)
lines(tbill,col='blue')
lines(exc,col='red')
legend(bty='n',y.intersp=1,'bottomleft',fill=c('darkgreen','blue','red'),legend=asset.names,ncol=1,bg='white',cex=0.75)

est.tab <- round(reg.table,5)
est.tab <- apply(est.tab, 2, rev)
est.tab <- cbind(rev(asset.names),est.tab)
par(mai=c(0.35,0.15,0.5,0.15))
TableMaker(row.h=1,est.tab,c('Single Period',col.names),strip=F,strip.col=c('green','blue'),col.cut=0.05,alpha=0.7,border.col='lightgrey',text.col='black',header.bcol='gold',header.tcol='black',title=c('Regression of returns on lagged returns\nAnnual data for 1952-2008'))
###################################################################################


The table beneath the time series plot summarises the typical regression estimates. Stock returns are basically unpredictable: if returns go up 100% this year, one would expect a rise of just about 4% next year, a trivial amount of momentum. The slope coefficient is statistically insignificant and, more importantly, the point estimate is economically trivial. The R-squared, which measures the proportion of return variance that can be forecast one year ahead, is tiny. Results for the T-bill regression provide an interesting contrast, with a large slope estimate that is statistically significant and an R-squared of 72%. This is visually corroborated by the relatively smooth time series plot for the interest rate. The last row in the table presents results for the excess return regression; the results closely mimic those of the pure stock return. Excess returns separate the willingness to consume less and save from the willingness to bear risk. The last two columns of the table compare how much expected returns vary over time with the level of expected returns, which can be thought of as another economic measure of return predictability. While these findings may have been revolutionary in the 1970s, they were upended by two new facts a decade later: [a] long-horizon regressions and [b] new forecasting variables.

In part (b) of the Macroeconomic factor model post, I shall first deal with the Fama-MacBeth methodology in the context of a rolling window of fixed size. I have mentioned previously that this experiment involves a simplified version of the estimation procedure, whereby assets are not sorted into equally sized portfolios based on pre-ranking betas, a step originally intended to reduce the errors-in-variables problem. This problem arises from the use of estimated betas (obtained in the time series regressions) as independent variables in the subsequent cross-section regression (the second step of the FB procedure). I have only recently discovered the function rollapply, which automatically rolls a window of fixed size forward through a dataset, hence the code will be quite a bit different from that in the earlier Fama-MacBeth post, where I did account for the errors-in-variables problem by grouping assets into 20 portfolios according to their pre-ranking betas. Before making a distinction between the rolling and the full-sample analysis, it makes sense to first set up the environment, define custom functions and extract the requisite data subsets.
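For anyone unfamiliar with rollapply (it lives in the zoo package, which quantmod pulls in as a dependency), here is a toy sketch of the arguments used later in this post:

#####################################################################################
#Sketch: how rollapply rolls a window through a series
#####################################################################################
library(zoo)
z <- zoo(1:10)
rollapply(z,width=3,FUN=mean,by=2,align='right')
#width=3       : each window holds 3 observations
#by=2          : the window moves forward 2 observations at a time
#align='right' : each result is stamped with the window's last time index
#####################################################################################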


[ Set up the libraries, custom functions & datasets ]

The usual (and somewhat dated) TableMaker function makes a comeback, allowing us to ‘plot’ tabulated data in a specific way. I have also defined a new custom function called RegExtractor, which is used to extract regression estimates, standard errors, t-values, p-values, R-squared values and residuals from a list object of regressions.

setwd("C:/University of Warwick/R experiments/Asset Pricing")

library(quantmod)
library(tseries)
library(timeSeries)
library(MASS)
library(ggplot2)
library(reshape)

#######################################################################################
#Table Maker
#######################################################################################

TableMaker <- function(row.h,core,header,strip,strip.col,col.cut,alpha,border.col,text.col,header.bcol,header.tcol,title)
{
  rows <- nrow(core)+1
  cols <- ncol(core)
  colours <- adjustcolor(col=strip.col,alpha.f=alpha)
  idx <- 1

  plot.new()
  plot.window(xlim=c(0,cols*11),ylim=c(0,rows*7))
  mtext(title,side=3,cex=0.7)

  for(i in 1:cols){
    for(j in 1:rows){
      if(j==1){
        rect((i-1)*11,rows*7-7,11*i,rows*7,density=NA,col=header.bcol,lty=1,border=border.col)
        text(((i-1)*11+11*i)/2,((rows*7-7)+rows*7)/2,header[i],cex=0.8,col=header.tcol,font=1)
      }
      else if((j>1) && (row.h==0)){
        rect((i-1)*11,(j-1)*7-7,11*i,(j-1)*7,density=NA,col=ifelse(strip,ifelse(core[j-1,i]<col.cut,colours[1],colours[2]),'white'),lwd=1.0,border=border.col)
        text(((i-1)*11+11*i)/2,(((j-1)*7-7)+(j-1)*7)/2,core[j-1,i],col='black',cex=0.75,font=1)
      }
      else if((j>1) && (row.h!=0) && (i==row.h[idx])){
        rect((i-1)*11,(j-1)*7-7,11*i,(j-1)*7,density=NA,col=border.col,lwd=1.0,border=border.col)
        text(((i-1)*11+11*i)/2,(((j-1)*7-7)+(j-1)*7)/2,core[j-1,i],col='black',cex=0.75,font=1)
      }
      else if((j>1) && (row.h!=0) && (i!=row.h[idx])){
        rect((i-1)*11,(j-1)*7-7,11*i,(j-1)*7,density=NA,col=ifelse(strip,ifelse(core[j-1,i]<col.cut,colours[1],colours[2]),'white'),lwd=1.0,border=border.col)
        text(((i-1)*11+11*i)/2,(((j-1)*7-7)+(j-1)*7)/2,core[j-1,i],col='black',cex=0.75,font=1)
      }
    }
    ifelse(idx<length(row.h),idx<-idx+1,idx<-length(row.h))
  }
}
#########################################################################################

#########################################################################################
#Extract Regression Values
#########################################################################################
RegExtractor <- function(x,type)
{
  if(type=='est'){
    output <- t(sapply(x,function(v) coef(summary(v))[,1]))
  }
  else if(type=='stderr'){
    output <- t(sapply(x,function(v) coef(summary(v))[,2]))
  }
  else if(type=='tval'){
    output <- t(sapply(x,function(v) coef(summary(v))[,3]))
  }
  else if(type=='pval'){
    output <- t(sapply(x,function(v) coef(summary(v))[,4]))
  }
  else if(type=='rsq'){
    output <- t(sapply(x,function(v) summary(v)$r.squared))
  }
  else if(type=='res'){
    output <- sapply(x,function(v) resid(v))
  }
  return(output)
}
#########################################################################################


The data used in this experiment involve: [1] the macro factors defined earlier; [2] the excess return on the SP500 index; [3] excess returns on a random sample of 25 stocks. These datasets are subsetted so that they cover the period spanning Jan 1995 to Dec 2010.

#########################################################################################
#Data import & handling
#########################################################################################

#Import CSV file(s)
macrofactors <- read.csv('macrofactors.csv',header=T)
load(file='SP500 components.RData')
ff.df <- read.csv('Factor_data.csv',header=T)
ff.tot.data <- ts(data=ff.df,start=c(1926,7),end=c(2012,12),frequency=12)/100

monthly.ret <- prices.vect$monthly.ret
ret.data <- monthly.ret[,sample(ncol(monthly.ret),25)] #draw a random sample of 25 stocks
asset.names <- colnames(ret.data)

#Subset data to Jan 1995 - Dec 2010
sub.c.1 <- c(1995,1)
sub.c.2 <- c(2010,12)
macrofactors.data <- ts(data=macrofactors,start=sub.c.1,end=sub.c.2,frequency=12)
test.assets <- ts(data=ret.data,start=sub.c.1,end=sub.c.2,frequency=12)
rf.ret <- window(ff.tot.data[,4],start=sub.c.1,end=sub.c.2)
exc.mkt <- window(ff.tot.data[,1],start=sub.c.1,end=sub.c.2)
mkt <- exc.mkt
macrofactors.data <- cbind(macrofactors.data,mkt)

test.assets <- test.assets-rf.ret
colnames(macrofactors.data) <- c(colnames(macrofactors),'MKT')
timestamp <- seq(as.Date('1995-01-01'),as.Date('2010-12-31'),by='months')
#########################################################################################

Having set up the environment, defined the functions and obtained the necessary data subsets, let's first apply the FB procedure in the context of a rolling window analysis.


[ Rolling Regression Analysis ]

In the context of a rolling analysis, the FB procedure can be summarised as follows:

[1] Run a time series regression of each asset (i) on the previously defined factors over a window of 60 monthly observations, store the results (estimates, p-values) in list objects, and then roll the fixed-length window forward by one year.

[2] For each month, run a cross-section regression across the 25 assets on the betas estimated in step [1]. These regressions are run on the one year's worth of observations separating the end of the previous window and the start of the next window.

[3] Save the time series of premia estimates and residuals from step [2], take means, and test for significance (a minimal sketch of this test follows below).
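The significance test in step [3] is the usual Fama-MacBeth t-statistic: treat the time series of premia estimates for each factor as a sample of T observations and test whether its mean differs from zero, i.e. t = mean(premia)/(sd(premia)/sqrt(T)). Here is a toy sketch with made-up numbers; the full implementation follows in the code below.

###################################################################################
# Sketch: Fama-MacBeth t-statistic on a toy series of premia estimates
###################################################################################
set.seed(7)
lambda <- rnorm(132,mean=0.004,sd=0.02) #e.g. 132 monthly premia for one factor
t.fb <- mean(lambda)/(sd(lambda)/sqrt(length(lambda)))
t.fb
###################################################################################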


###################################################################################
# Rolling Regression
###################################################################################
#timeseries regressions
roll.reg <- list()
pval.reg <- list()
window.end <- NULL

for(i in 1:ncol(test.assets)){
	data.mat=cbind(test.assets[,i],macrofactors.data)
  roll.reg[[i]]=as.matrix(na.remove(rollapply(data.mat,width=60,FUN=function(x) coef(lm(x[,1]~x[,-1])),by.column=F,align='right',by=12)),ncol=7)
  colnames(roll.reg[[i]])=c('Intercept',colnames(macrofactors.data))
	pval.reg[[i]]=matrix(na.remove(rollapply(data.mat,width=60,FUN=function(z) coef(summary(lm(z[,1]~z[,-1])))[,4],by.column=F,align='right',by=12)),ncol=7)
  colnames(pval.reg[[i]])=c('Intercept',colnames(macrofactors.data))
}

#visualise regression windows
window.idx <- as.character(1:12)
window.st <- as.character(timestamp[seq(1,133,12)])
window.end <- as.character(timestamp[seq(60,192,12)])
window.diff <- abs(timestamp[seq(1,133,12)]-timestamp[seq(60,192,12)])/365
window.tot <- cbind(window.idx,window.st,window.end,window.diff)
colnames(window.tot) <- c('Regression window','Start','End','Years of observations')

windows()
windows.tab <- window.tot
		windows.tab <- apply(windows.tab, 2, rev)
		par(mai=c(0.5,0.35,0.5,0.35))
		TableMaker(row.h=1,windows.tab,colnames(window.tot),strip=F,strip.col=c('green','red'),col.cut=0.05,alpha=0.7,border.col='lightgrey',text.col='black',header.bcol='gold',header.tcol='black',title=c('5 Year Regression Windows For Timeseries Regressions'))

#cross-section regression
cross.betas <-list()
cross.ret <-list()
cross.reg <- list()

ret.idx <- matrix(c(seq(from=61,to=nrow(test.assets),by=12),seq(from=72,to=nrow(test.assets),by=12)),ncol=2,byrow=F)
for(i in 1:nrow(ret.idx)){
	cross.ret[[i]]=t(test.assets[ret.idx[i,1]:ret.idx[i,2],])
}

for(i in 1:nrow(roll.reg[[1]])){
	cross.betas[[i]] <- roll.reg[[1]][i,-1] #start with asset 1's betas for window i
	for(j in 2:ncol(test.assets)){
		cross.betas[[i]] <- rbind(cross.betas[[i]],roll.reg[[j]][i,-1]) #stack remaining assets
	}
}

#Computing t-stats
intercept.ts<-NULL
riskpremia.ts<-NULL
resid.ts<-NULL

for(i in 1:nrow(ret.idx)){
	cross.reg[[i]]=lm(cross.ret[[i]]~cross.betas[[i]])
  intercept.ts=rbind(intercept.ts,as.matrix(coef(cross.reg[[i]])[1,]))
	riskpremia.ts=cbind(riskpremia.ts,coef(cross.reg[[i]])[-1,])
	resid.ts=cbind(resid.ts,resid(cross.reg[[i]]))
}

intercept.mean<-mean(intercept.ts)
riskpremia.mean<-rowMeans(riskpremia.ts)
resid.mean<-rowMeans(resid.ts)

num.t<-nrow(intercept.ts)
intercept.std <- apply(intercept.ts,2,sd)
riskpremia.std <- apply(riskpremia.ts,1,sd)
resid.std <- apply(resid.ts,1,sd)

tstat.intercept <- (intercept.mean/(intercept.std/(num.t^0.5)))
tstat.premia <- (riskpremia.mean/(riskpremia.std/(num.t^0.5)))
tstats.avg <- round(as.matrix(c(tstat.intercept,tstat.premia)),4)
rownames(tstats.avg) <- c('Intercept',colnames(macrofactors.data))
colnames(tstats.avg) <- 'T-Test'

#Visualise
prem <- t(matrix(riskpremia.ts,nrow=6))
prem <- round(cbind(intercept.ts,prem),4)
beg.seq <- seq(1,121,by=12)
end.seq <- seq(12,132,by=12)
lhs <- rev(c('Intercept',colnames(macrofactors.data)))
cols <- rainbow(20)[10:16]

png(file="premia%02d.png",width=550,height=500)
for(i in 1:length(beg.seq)){
prem.tab <- cbind(t(prem[(beg.seq[i]:end.seq[i]),]),tstats.avg)
		prem.tab <- apply(prem.tab,2,rev)
    prem.tab <- cbind((lhs),prem.tab)
   	layout(matrix(c(1,1,2),ncol=1))
    par(mai=c(0.30,0.55,0.5,0.3))
    plot(xlim=c(-3,12),ylab='Estimates',cex.main=0.85,cex.lab=0.75,cex.axis=0.75,main=paste('12 month forward crossectional regression estimates\n using data for ',timestamp[ret.idx[i,1]],'to',timestamp[ret.idx[i,2]]),prem[(beg.seq[i]:end.seq[i]),1],type='l',lwd=1.75,col='red')
	  for(j in 2:7){
	  	lines(prem[(beg.seq[i]:end.seq[i]),j],lwd=1.75,col=cols[j])
	  }
	  abline(v=0,col='black',lwd=1)
	  legend(bty='n',y.intersp=1,'topleft',fill=c('red',cols),legend=c('INT - Intercept','MP - Chg. Industrial Production','UI - Unanticipated Infl','DEI - Chg.Expected Infl','UTS - Unanticipated ret on LT Bonds','UPR - BAA minus AAA yield','MKT - Market Returns'),ncol=1,bg='white',cex=0.75)
	  par(mai=c(0.15,0.25,0.10,0.05))
	  TableMaker(row.h=1,prem.tab,c(paste('Window:\n',i),month.name,'T-Test'),strip=F,strip.col=c('green','blue'),col.cut=0.05,alpha=0.7,border.col='lightgrey',text.col='black',header.bcol='blue',header.tcol='white',title=c(''))
}
dev.off()
shell("convert -delay 10 *.png premia.gif")

For convenience, I have summarised the regression windows and their associated start and end periods using the TableMaker function as follows:

[Table: 5-year regression windows for the time series regressions]

I have also animated the risk premia per regression window for each factor (essentially the time series of risk premia obtained in step [2]), along with the t-statistics for the averaged values across the entire series. I finally figured out why some of my previous gifs did not animate as expected: apparently one has to make sure that the gif is about 500 pixels wide and avoid resizing it within WordPress (i.e. use the full-size option when importing it). I do wish the quality of the gif were a bit better though... perhaps I am doing something wrong.

[Animation: 12-month forward cross-sectional regression estimates per window, with t-statistics for the averaged series]

The next post will apply the same methodology to the full sample rather than to rolling 60-month windows. From the last column of the table above, it appears that the intercept, UI and DEI premia are statistically significant at the 10% and 5% levels respectively.