Completely forgotten that I had an unfinished shinyR project lying around and which, however unlikely I am to continue with R anytime soon, deserves at the very least to be mentioned in this category of tidbits & reminders.While forcing myself through the international edition (read : cheap from india) of Microeconomic Theory : Basic Principles and Extensions,some time ago now, I also attempted to replicate some basic principles using the reactive framework of Shiny.Pretty sure i never finished the book.The result is a hodgepodge of a shiny app containing the following :

  • Consumer Theory
    • Preferences
      • Cobb-Douglas
      • Slutsky Decomposition and Lump Sum Principle
    • Marshallian Demand Curves
      • Income held constant
    • Hicksian Demand Curves
      • Utility held constant
    • Comparing Demand Curves
    • Welfare Changes
      • Compensating Variation
      • Equivalent Variation
    • Income-Consumption Curve
      • Income effect
    • Choice Under Uncertainty
      • Quadratic Utility Function
      • Power Utility Function
      • General Exponential Utility Function
      • Log Utility
      • Hyperbolic Absolute Risk Aversion
  • Production & Supply
    • Production Function
      • Production Hill
      • Isoquant
      • Product curves
    • Cost Function
      • Choice of production function
        • Cobb-Douglas,Linear,Fixed Proportion,CES
      • Isoquant,Isocost,Output Expansion Path
      • Total cost curves
      • Long-run envelope & Short-run cost curves
      • Short run total/average cost decomposition
    • Short-run Profits & Surplus
      • Choice of production function
        • Cobb-Douglas,Linear,Fixed Proportion,CES
      • Substitution and Output effect

The list above is just taken from the titles/description/dropdownboxes found in the app itself and is not indicative of what the app can reliably do.Apart from being exceedingly slow most of the time,pretty sure the more exotic choices(eg : fixed proportion) do not work. Flashbacks of the computer being frozen for an hour or so are stopping me from testing which of these choices to avoid and recommend. Best to keep the production function choices to the cobb-douglas specification it defaults to in each case and keep slider changes small and slow.

For the brave and foolish,you can download the files here :  ui,server,helpers

For the more discerning of taste,a gallery of screenshots from the app follows:

 

Seeing these screen caps has reminded me of a host of other problems with this app :

  1. Changing the numbers too drastically may sometimes lead to strange plots.
  2. The Abs/Rel risk aversion graphs are strange and definitely wrong for certain cases.

Well.It is what it is.

 

In the previous post I mentioned how the process of removing problematic variables was itself incomplete as I removed them from all models as opposed to just the subset of models for which they give strange fitting results. I have since made the requisite changes to the previous series of posts. Removing problematic funds becomes :

 

#Removing problemtic funds for specific models only
problemRemove <- function(list.element){
	problems <- which(is.na(list.element==0) | list.element<0)
	return(list.element[-problems,])
}

model.list <- lapply(model.list,problemRemove)
subfund.names <- as.matrix(unlist(lapply(model.list,rownames)))
temp <- apply(do.call(rbind,model.list),2,unlist)

l

What counts as problematic in my case is when there are fitting problems or when the selection criteria are negative. It should be noted that there is nothing wrong with a negative AIC or SBC,I just designated negative values as problematic in this example.

 

Applying the plotting functions yields :

bnn

 

 

Let’s compare this to the case where we delete the problematic variables from all models :

bnh

 

In the latter case we deleted 4 funds across ALL models because they had model fitting issues across SOME models. In the former case we retained all funds even though some had problems in the fitting process (gaps in facetted barcharts). This evidently changes the percentage of times a particular model is the best performer.

[2] Nested lists and storing regressions results

A way to store regression results for multiple models and/or variables that I have come across involves the use of nested list objects. This allows me to fit a variety of models and store their results across multiple variables using a for loop. The easiest way to visualise nested list objects is to imagine them as folders in windows explorer, with each subfolder (or item contained therein) being separated by the $-sign. A sequence of steps that I have found quite useful is as follows :

  1. Load and clean data.
  2. Fit n models to data and store regression results in a nested list object such that
    • each element of the list represents one variable.
    • each element contains the fitted results of all n models.
  3. Create new list object,this time looping across models such that
    • each element of the list is one of n models.
    • each element contains desired regression results for each model.
    • information is stored in ggplot-friendly format.
  4. Remove problematic variables from each element of that list.
    • This does not work all the time for some strange reason (i.e.sometimes need to manually remove problematic variables)
    • In this step I remove the problematic variable for all models rather than for specific models,something that needs to be improved upon.
  5. Use do.call with rbind across elements of this newly created list to create the base dataframe.
  6. The base dataframe can then be manipulated to create other sub dataframes.
  7. Apply ggplotting functions on sub dataframes.

This procedure has been applied in the previous set of posts. I have doctored up some bogus dataset to provide an additional example so that I don’t forget :

l

#[1] Simulate data

Feed.list <- c('caviar','cayenne.pepper','celery','cereal','chard','cheddar','cheese','cheesecake','chef','crisp','cellophane','coffee')
sample.df <- data.frame()
for(i in 1:length(Feed.list)){
	sample.df[1:100,i] <- rnorm(100)
}
colnames(sample.df) <- Feed.list
sample.xts <- xts(order.by=seq(as.Date('1997-01-31'),length=100,by=1),sample.df)

#[2] Nested Lists and regression

n.vars <- ncol(sample.xts)
Feed.list <- names(sample.xts)
n.obs <- nrow(sample.xts)
n.models <- 9

main.list <- list()
	for(i in 1:n.vars){
		main.list[[i]] <- list()
		names(main.list)[i] <- Feed.list[i]
		main.list[[i]]$Sample.xts <- sample.xts[,i]
		temp.xts <- sample.xts[,i]

			#Independent Log-normal model
			main.list[[i]]$ILN <- lm(temp.xts~1)
		  main.list[[i]]$ILN$Stat <- StatComp(main.list[[i]]$ILN,n.obs)

		  main.list[[i]]$ILN$LogLik <- main.list[[i]]$ILN$Stat['model.loglik']
		  main.list[[i]]$ILN$df <- main.list[[i]]$ILN$Stat['model.df']
		  main.list[[i]]$ILN$AIC <- main.list[[i]]$ILN$Stat['model.aic']
		  main.list[[i]]$ILN$SBC <- main.list[[i]]$ILN$Stat['model.sbc']

		  #Autoregressive model
		  main.list[[i]]$AR1 <- arima(x=temp.xts,order=c(1,0,0))
		  main.list[[i]]$AR1$Stat <- StatComp(main.list[[i]]$AR1,n.obs)

		  main.list[[i]]$AR1$LogLik <- main.list[[i]]$AR1$Stat['model.loglik']
		  main.list[[i]]$AR1$df <- main.list[[i]]$AR1$Stat['model.df']
		  main.list[[i]]$AR1$AIC <- main.list[[i]]$AR1$Stat['model.aic']
		  main.list[[i]]$AR1$SBC <- main.list[[i]]$AR1$Stat['model.sbc']

		  #ARCH model
			arch.spec <- ugarchspec(variance.model=list(garchOrder=c(1,0)),mean.model = list(armaOrder=c(0,0)))
		  arch.fit<- ugarchfit(spec=arch.spec, data=temp.xts,solver.control=list(trace = 1))

		  main.list[[i]]$ARCH$LogLik <- likelihood(arch.fit)
		  main.list[[i]]$ARCH$df <- 3
		  main.list[[i]]$ARCH$AIC <- main.list[[i]]$ARCH$LogLik-main.list[[i]]$ARCH$df
		  main.list[[i]]$ARCH$SBC <- main.list[[i]]$ARCH$LogLik-0.5*main.list[[i]]$ARCH$df*log(n.obs)

	    #AR-ARCH model
			arch.spec <- ugarchspec(variance.model=list(garchOrder=c(1,0)),mean.model = list(armaOrder=c(1,0)))
		  arch.fit<- ugarchfit(spec=arch.spec, data=temp.xts,solver.control=list(trace = 1))

		  main.list[[i]]$AR.ARCH$LogLik <- likelihood(arch.fit)
		  main.list[[i]]$AR.ARCH$df <- 4
		  main.list[[i]]$AR.ARCH$AIC <- main.list[[i]]$AR.ARCH$LogLik-main.list[[i]]$AR.ARCH$df
		  main.list[[i]]$AR.ARCH$SBC <- main.list[[i]]$AR.ARCH$LogLik-0.5*main.list[[i]]$AR.ARCH$df*log(n.obs)

	    #GARCH model
			garch.spec <- ugarchspec(variance.model=list(garchOrder=c(1,1)),mean.model = list(armaOrder=c(0,0)))
		  garch.fit<- ugarchfit(spec=garch.spec, data=temp.xts,solver.control=list(trace = 1))

		  main.list[[i]]$GARCH$LogLik <- likelihood(garch.fit)
		  main.list[[i]]$GARCH$df <- 4
		  main.list[[i]]$GARCH$AIC <- main.list[[i]]$GARCH$LogLik-main.list[[i]]$GARCH$df
		  main.list[[i]]$GARCH$SBC <- main.list[[i]]$GARCH$LogLik-0.5*main.list[[i]]$GARCH$df*log(n.obs)

		  #AR-GARCH model
			garch.spec <- ugarchspec(variance.model=list(garchOrder=c(1,1)),mean.model = list(armaOrder=c(1,0)))
		  garch.fit<- ugarchfit(spec=garch.spec, data=temp.xts,solver.control=list(trace = 1))

		  main.list[[i]]$AR.GARCH$LogLik <- likelihood(garch.fit)
		  main.list[[i]]$AR.GARCH$df <- 5
		  main.list[[i]]$AR.GARCH$AIC <- main.list[[i]]$AR.GARCH$LogLik-main.list[[i]]$AR.GARCH$df
		  main.list[[i]]$AR.GARCH$SBC <- main.list[[i]]$AR.GARCH$LogLik-0.5*main.list[[i]]$AR.GARCH$df*log(n.obs)

		  #ILN 2 regimes
		  model.spec <- depmix(eval(parse(text=Feed.list[i]))~1,nstates=2,data=temp.xts)
	    model.fit <- fit(model.spec)
		  main.list[[i]]$ILN.Reg2$Stat <- StatComp(model.fit,n.obs)

      main.list[[i]]$ILN.Reg2$LogLik <- main.list[[i]]$ILN.Reg2$Stat['model.loglik']
		  main.list[[i]]$ILN.Reg2$df <- main.list[[i]]$ILN.Reg2$Stat['model.df']
		  main.list[[i]]$ILN.Reg2$AIC <- main.list[[i]]$ILN.Reg2$Stat['model.aic']
		  main.list[[i]]$ILN.Reg2$SBC <- main.list[[i]]$ILN.Reg2$Stat['model.sbc']

	    #ILN 3 regimes
		  model.spec <- depmix(eval(parse(text=Feed.list[i]))~1,nstates=3,data=temp.xts)
	    model.fit <- fit(model.spec)
	  	main.list[[i]]$ILN.Reg3$Stat <- StatComp(model.fit,n.obs)

      main.list[[i]]$ILN.Reg3$LogLik <- main.list[[i]]$ILN.Reg3$Stat['model.loglik']
		  main.list[[i]]$ILN.Reg3$df <- main.list[[i]]$ILN.Reg3$Stat['model.df']
		  main.list[[i]]$ILN.Reg3$AIC <- main.list[[i]]$ILN.Reg3$Stat['model.aic']
		  main.list[[i]]$ILN.Reg3$SBC <- main.list[[i]]$ILN.Reg3$Stat['model.sbc']

      #AR-Regime switch model
		  temp.df <- data.frame(temp.xts[2:n.obs],temp.xts[2:n.obs-1])
		  names(temp.df) <- c('left','right')

		  model.spec <- depmix(left~right,nstates=2,data=temp.df)
	    model.fit <- fit(model.spec)

			main.list[[i]]$AR.Reg2$Stat <- StatComp(model.fit,n.obs)

      main.list[[i]]$AR.Reg2$LogLik <- main.list[[i]]$AR.Reg2$Stat['model.loglik']
		  main.list[[i]]$AR.Reg2$df <- main.list[[i]]$AR.Reg2$Stat['model.df']
		  main.list[[i]]$AR.Reg2$AIC <- main.list[[i]]$AR.Reg2$Stat['model.aic']
		  main.list[[i]]$AR.Reg2$SBC <- main.list[[i]]$AR.Reg2$Stat['model.sbc']

	}

#[3] Saving across models & removing problematic variables
model.names <- names(main.list[[1]])[-1]
model.list <- list()
problems <- NULL

for(i in 1:n.models){
	model.list[[i]] <- list()
	names(model.list)[i] <- model.names[i]
	model.list[[i]] <- do.call(rbind,lapply(main.list,function(x) eval(parse(text=paste('x$',model.names[i],sep='')))))[,c('AIC','SBC','df','LogLik')]
  problems <- c(problems,which(is.na(model.list[[i]]==0)))
  model.list[[i]] <- cbind(model.list[[i]],Model=rep(model.names[i],n.vars))
}

problem.vars <- unique(problems)
ifelse(length(problem.vars)==0,print('No issues in variables'),model.list<-lapply(model.list,function(x) x[-problem.vars,]))
subFeed.list <- rownames(model.list[[1]])

original.df <- apply(do.call(rbind,model.list),2,unlist)

l
The original.df is the base dataframe which is then manipulated to create sub-dataframes, inputs to ggplotting functions. Ultimately this leads to a set of plots like :

 

newest

 

I have grown a bit tired of R and the blind ritual of haphazardly replicating things here and there. In this post I will collect some tidbits and reminders so as to make it easier for myself to pick this language up again if and when I can force myself to do so.

[1] GGplot Table Drawer

In the previous series of posts I used the custom (sometimes problematic) table plotter function to visualise model results in tabulated form. I have since made some changes to this function which i shall summarise here.

Implementation is simple :

  1. ggTableSpec() function is used to specify various elements of the table we wish to create :
    • interpretation of function arguments is straightforward. The only issue here worth mentioning is that the hlt.row / hlt.col arguments are swapped.If one wants to highlight rows (cols) then hlt.col (hlt.row) should be specified. Just a quirk that I cannot be bothered to change at the moment. This function returns a list object.
  2. ggTableDrawer() function uses the output of the ggTableSpec() function to plot the desired table using ggplot2 functions.

The code for both functions is given here :

l

#################################################################################################################
# Specify Elements Of the Table
#################################################################################################################
ggTableSpec <- function(columns.exist,columns.txt,columns.font,columns.col,columns.fill,columns.alpha,
		                    rows.exist,rows.txt,rows.font,rows.col,rows.fill,rows.alpha,
		                    data.obj,data.col,data.title,
		                    hlt.col.exist,hl.col.which,hl.col.fill,hl.col.alpha,
		                    hlt.row.exist,hl.row.which,hl.row.fill,hl.row.alpha
		                    ){
#Construct the Title Layer
	Title.Layer <- list()

		Title.Layer$Columns <- list()
			if(columns.exist){
				Title.Layer$Columns$Exist <- TRUE
				Title.Layer$Columns$Txt  <- columns.txt
  			Title.Layer$Columns$Font <- columns.font
  			Title.Layer$Columns$Col <- columns.col
  			Title.Layer$Columns$Fill <- columns.fill
  			Title.Layer$Columns$Alpha <- columns.alpha
			}else{
				Title.Layer$Columns$Exist <- FALSE

		  }

	  Title.Layer$Rows <- list()
	  	if(rows.exist){
	  		Title.Layer$Rows$Exist <- TRUE
				Title.Layer$Rows$Txt  <- rows.txt
  			Title.Layer$Rows$Font <- rows.font
  			Title.Layer$Rows$Col <- rows.col
  			Title.Layer$Rows$Fill <- rows.fill
  			Title.Layer$Rows$Alpha <- rows.alpha
	  	}else{
	  		Title.Layer$Rows$Exist <- FALSE
	  }

#Construct Data Layer
	Data.Layer <- list()
	 Data.Layer$Txt <- data.obj
	 Data.Layer$Col <- data.col
	 Data.Layer$Title <- data.title

#Construct Highlight Layer
	Highlight.Layer <- list()

		Highlight.Layer$Columns <- list()
			if(hlt.col.exist){
				Highlight.Layer$Columns$Exist <- TRUE
				Highlight.Layer$Columns$Which <- hl.col.which
				Highlight.Layer$Columns$Fill  <- hl.col.fill
				Highlight.Layer$Columns$Alpha <- hl.col.alpha
			}else{
				Highlight.Layer$Columns$Exist <- FALSE
			}

		Highlight.Layer$Rows <- list()
			if(hlt.row.exist){
				Highlight.Layer$Rows$Exist <- TRUE
				Highlight.Layer$Rows$Which <- hl.row.which
				Highlight.Layer$Rows$Fill  <- hl.row.fill
				Highlight.Layer$Rows$Alpha <- hl.row.alpha
			}else{
				Highlight.Layer$Rows$Exist <- FALSE
			}

gg.table.spec <- list(Title.Layer=Title.Layer,Data.Layer=Data.Layer,Highlight.Layer=Highlight.Layer)
return(gg.table.spec)
}
########################################################################################################

#################################################################################################################
# Draw Table
#################################################################################################################

ggTableDrawer <- function(gg.table.spec){
	#Data Coordinates & Dataframe
	  e<- environment()
	  data.obj <- apply(gg.table.spec$Data.Layer$Txt,2,rev)
	  data.col <- gg.table.spec$Data.Layer$Col

		xmin <- 1
	  ymin <- 1

	  xmax <- ncol(gg.table.spec$Data.Layer$Txt)
	  ymax <- nrow(gg.table.spec$Data.Layer$Txt)

	  x.adj <-1
	  y.adj <-1
	 	lab.adj <- 0.05

	 if(gg.table.spec$Title.Layer$Rows$Exist){
	 	txt.temp <- gg.table.spec$Title.Layer$Rows$Txt
	 	max.char <- max(nchar(txt.temp))
	 	empty.adj.min<-0.1*max.char

	 	}else{
	 		empty.adj.min<-0
	 	}

	  empty.layer.adj <- 1
	  temp.seq <- seq(xmin,xmax,length=xmax)

	  DataLayer.df <- data.frame(data.obj,ycoord=1:ymax,stringsAsFactors=F)
			for(i in 1:length(temp.seq)){
				DataLayer.df <- cbind(DataLayer.df,rep(temp.seq[i]+5*lab.adj,ymax))
			}
	  colnames(DataLayer.df)[(xmax+2):(xmax+2+length(temp.seq)-1)] <- paste('xcoord',1:xmax,sep='')

	  parse.temp <- colnames(DataLayer.df)[(xmax+2):(ncol(DataLayer.df))]
	  parse.coord <- paste('c(',paste(parse.temp,collapse=','),')',sep='')

	  parse.temp <- colnames(gg.table.spec$Data.Layer$Txt)
		parse.lbl <- paste('c(',paste(parse.temp,collapse=','),')',sep='')

	  parse.ycoord <- paste('c(',paste(rep('ycoord',xmax),collapse=','),')',sep='')

	  EmptyLayer <- ggplot(data=DataLayer.df)+
		                geom_blank()+
		                xlim(xmin-empty.adj.min,xmax+empty.layer.adj)+
		                labs(title=gg.table.spec$Data.Layer$Title)+
                    theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),panel.border = element_blank(), panel.background = element_blank(), axis.title.x = element_blank(),axis.title.y = element_blank(), axis.text.x = element_blank(), axis.text.y = element_blank(),axis.ticks = element_blank())

	  DataLayer <- geom_text(data=DataLayer.df,aes_string(y=parse.ycoord,x=parse.coord,label=parse.lbl,hjust=0,vjust=1),size=rel(3),colour=data.col)

	#Title coordinates & Dataframe
  	#Columns
			if(!gg.table.spec$Title.Layer$Columns$Exist){
				Title.Column.Layer <- NULL
				Rect.Column.Layer <- NULL
				Rect.Column.df <- data.frame(xmin=xmin,xmax=xmax+empty.layer.adj,ymin=ymax,ymax=ymax+0.5)
			}else{
				col.title.adj <- 0.5
				col.title.xmin <- xmin
				col.title.xmax <- xmax
				col.title.ymin <- ymax
				col.title.ymax <- col.title.ymin+col.title.adj
				col.y <- (col.title.ymax+col.title.ymin)/2

				col.lbls <- gg.table.spec$Title.Layer$Columns$Txt
				col.font <- gg.table.spec$Title.Layer$Columns$Font
				fill <- gg.table.spec$Title.Layer$Columns$Fill
				alpha<- gg.table.spec$Title.Layer$Columns$Alpha
				col.colour <- gg.table.spec$Title.Layer$Columns$Col

			  Title.Column.df <- data.frame(lab.x=seq(col.title.xmin,col.title.xmax,length=length(col.lbls)),lab.y=rep(col.y,length(col.lbls)),Text=col.lbls,Font=col.font)
			  Rect.Column.df <- data.frame(xmin=col.title.xmin,xmax=col.title.xmax+empty.layer.adj,ymin=col.title.ymin,ymax=col.title.ymax,fill=fill,alpha=alpha)

				Title.Column.Layer <- geom_text(data=Title.Column.df,aes(x=lab.x,y=lab.y,label=Text,fontface=Font,hjust=0,vjust=0),size=rel(3),colour=col.colour)
				Rect.Column.Layer <- geom_rect(data=Rect.Column.df,aes(xmin=xmin,xmax=xmax,ymin=ymin,ymax=+Inf),alpha=alpha,fill=fill)

			}

#Rows
			if(!gg.table.spec$Title.Layer$Rows$Exist){
				Title.Row.Layer <- NULL
				Rect.Row.Layer <- NULL
				Rect.Row.df <- data.frame(xmax=1,ymax=Rect.Column.df$ymin)
			}else{
				row.lbls <- rev(gg.table.spec$Title.Layer$Rows$Txt)
				row.font <- gg.table.spec$Title.Layer$Rows$Font
				fill <- gg.table.spec$Title.Layer$Rows$Fill
				alpha<- gg.table.spec$Title.Layer$Rows$Alpha
				row.colour <- gg.table.spec$Title.Layer$Rows$Col

				quo <- 1/15
				row.title.adj <- max(nchar(row.lbls))*quo
				row.title.xmin <- xmin-row.title.adj
				row.title.xmax <- xmin
				row.title.ymin <- ymin-1
				row.title.ymax <- Rect.Column.df$ymin

			  Title.Row.df <- data.frame(lab.y=DataLayer.df$ycoord,lab.x=rep(row.title.xmin,length(row.lbls)),Text=row.lbls,Font=row.font,stringsAsFactors=F)
				Rect.Row.df <- data.frame(xmin=row.title.xmin,xmax=row.title.xmax,ymin=row.title.ymin,ymax=row.title.ymax,fill=fill,alpha=alpha,stringsAsFactors=F)

				Title.Row.Layer <- geom_text(data=Title.Row.df,aes(x=lab.x,y=lab.y,label=Text,fontface=Font,hjust=0,vjust=1),size=rel(3),colour=row.colour)
				Rect.Row.Layer <- geom_rect(data=Rect.Row.df,aes(xmin=xmin,xmax=xmax,ymin=ymin,ymax=ymax),alpha=alpha,fill=fill)
			}

#column highlights
			if(!gg.table.spec$Highlight.Layer$Columns$Exist){
		    Highlight.Column.Layer <- NULL
			}else{
				hl.col.dyn <- 0
			  hl.col.ymin <- ymax-gg.table.spec$Highlight.Layer$Columns$Which+hl.col.dyn
				hl.col.adj <- Rect.Column.df$ymin-ymax

				hl.col.xmin <- Rect.Column.df$xmin
			  hl.col.xmax <- Rect.Column.df$xmax
	      hl.col.ymax <- hl.col.ymin+1

				n.hl <- length(hl.col.ymin)

				fill <- gg.table.spec$Highlight.Layer$Columns$Fill
				alpha<- gg.table.spec$Highlight.Layer$Columns$Alpha

				Highlight.Column.df <- data.frame(xmin=rep(hl.col.xmin,n.hl),xmax=rep(hl.col.xmax,n.hl),ymin=hl.col.ymin,ymax=hl.col.ymax,fill=fill,alpha=alpha,stringsAsFactors=F)
	      Highlight.Column.Layer <- geom_rect(data=Highlight.Column.df,aes(xmin=xmin,xmax=xmax,ymin=ymin,ymax=ymax),fill=fill,alpha=alpha)

			}

#row highlights
			if(!gg.table.spec$Highlight.Layer$Rows$Exist){
		    Highlight.Row.Layer <- NULL
			}else{
				hl.row.adj <- 1-Rect.Row.df$xmax
				hl.row.xmin <- rev(gg.table.spec$Highlight.Layer$Rows$Which)-hl.row.adj
			  hl.row.xmax <- rev(gg.table.spec$Highlight.Layer$Rows$Which)+1-hl.row.adj
				hl.row.ymin <- 0
	      hl.row.ymax <- Rect.Row.df$ymax
				n.hl <- length(hl.row.xmin)

				fill <- rev(gg.table.spec$Highlight.Layer$Rows$Fill)
				alpha<- gg.table.spec$Highlight.Layer$Rows$Alpha

				Highlight.Row.df <- data.frame(xmin=hl.row.xmin,xmax=hl.row.xmax,ymin=hl.row.ymin,ymax=hl.row.ymax,fill=fill,alpha=alpha,stringsAsFactors=F)
	      Highlight.Row.Layer <- geom_rect(data=Highlight.Row.df,aes(xmin=xmin,xmax=xmax,ymin=ymin,ymax=ymax),fill=fill,alpha=alpha)

			}	

ggtbl <- EmptyLayer +
						Rect.Column.Layer +
						Title.Column.Layer +
						Rect.Row.Layer +
						Title.Row.Layer +
						Highlight.Column.Layer +
						Highlight.Row.Layer +
						DataLayer

return(ggtbl)
}
########################################################################################################

h
Example of implementations on the basis of :
l

#######################################################################################################
# Sample Data Frame
#######################################################################################################
Feed.list <- c('caviar','cayenne pepper','celery','cereal','chard','cheddar','cheese','cheesecake','chef','cherry','chew','chicken','chick peas','chili','chips','chives','chocolate','chopsticks','chow','chutney')
debug.df <- data.frame(Feed=Feed.list,Cats=round(runif(20),2),Dogs=round(runif(20),2),Pigs=round(runif(20),2),Smaug_the_dragon=1000*round(runif(20),2))
rownames(debug.df) <- paste('MODEL',1:20,sep='-')
which.idx <- which(debug.df[,'Smaug_the_dragon']>900)
which.idx2 <- which(debug.df[,'Smaug_the_dragon']<=900)
#######################################################################################################

l
Implementation 1 : Rows and Columns exist
l

#######################################################################################################
# Example 1
#######################################################################################################
debug.spec <- ggTableSpec(columns.exist=T,columns.txt=colnames(debug.df),columns.font=rep('bold',5),columns.col='black',columns.fill='grey',columns.alpha=0.7,
		                      rows.exist=T,rows.txt=rownames(debug.df),rows.font=rep('bold',20),rows.col='black',rows.fill='grey',rows.alpha=0.7,
		                      data.obj=debug.df,data.col='black',data.title='The hunger games',
		                    	hlt.col.exist=T,hl.col.which=c(which.idx,which.idx2),hl.col.fill=c(rep('green',length(which.idx)),rep('red',length(which.idx2))),hl.col.alpha=c(rep(0.4,length(which.idx)),rep(0.4,,length(which.idx2))),
		                    	hlt.row.exist=F,hl.row.which=c(5),hl.row.fill=c('red'),hl.row.alpha=c(0.4)
                )
ggTableDrawer(debug.spec)
#######################################################################################################

l
za

Smaug the dragon appears to love his chopsticks. Can’t blame him for that.

 

Implementation 2 : Neither Rows or Columns
l

#######################################################################################################
# Example 2
#######################################################################################################
debug.spec <- ggTableSpec(columns.exist=F,columns.txt=colnames(debug.df),columns.font=rep('bold',5),columns.col='black',columns.fill='grey',columns.alpha=0.7,
		                      rows.exist=F,rows.txt=rownames(debug.df),rows.font=rep('bold',20),rows.col='black',rows.fill='grey',rows.alpha=0.7,
		                      data.obj=debug.df,data.col='black',data.title='The hunger games',
		                    	hlt.col.exist=T,hl.col.which=c(which.idx,which.idx2),hl.col.fill=c(rep('green',length(which.idx)),rep('red',length(which.idx2))),hl.col.alpha=c(rep(0.4,length(which.idx)),rep(0.4,,length(which.idx2))),
		                    	hlt.row.exist=F,hl.row.which=c(5),hl.row.fill=c('red'),hl.row.alpha=c(0.4)
                )
ggTableDrawer(debug.spec)
#######################################################################################################

l
zb

 

Implementation 3 : Highlight Columns instead
l

#######################################################################################################
# Example 3
#######################################################################################################
debug.spec <- ggTableSpec(columns.exist=T,columns.txt=colnames(debug.df),columns.font=rep('bold',5),columns.col='black',columns.fill='grey',columns.alpha=0.7,
		                      rows.exist=T,rows.txt=rownames(debug.df),rows.font=rep('bold',20),rows.col='black',rows.fill='grey',rows.alpha=0.7,
		                      data.obj=debug.df,data.col='black',data.title='The hunger games',
		                    	hlt.col.exist=F,hl.col.which=c(which.idx),hl.col.fill=c(rep('green',length(which.idx))),hl.col.alpha=c(rep(0.4,length(which.idx))),
		                    	hlt.row.exist=T,hl.row.which=c(1,2,3,4,5),hl.row.fill=c('red','blue','green','yellow','purple'),hl.row.alpha=rep(0.4,5)
                )
ggTableDrawer(debug.spec)
#######################################################################################################

l
zc

 

 

Implementation 4 : Highlight Row and Column
l

#######################################################################################################
# Example 4
#######################################################################################################
debug.spec <- ggTableSpec(columns.exist=T,columns.txt=colnames(debug.df),columns.font=rep('bold',5),columns.col='black',columns.fill='grey',columns.alpha=0.7,
		                      rows.exist=T,rows.txt=rownames(debug.df),rows.font=rep('bold',20),rows.col='black',rows.fill='grey',rows.alpha=0.7,
		                      data.obj=debug.df,data.col='black',data.title='The hunger games',
		                    	hlt.col.exist=T,hl.col.which=c(which.idx),hl.col.fill=rep('green',length(which.idx)),hl.col.alpha=c(rep(0.4,length(which.idx))),
		                    	hlt.row.exist=T,hl.row.which=c(5),hl.row.fill=c('red'),hl.row.alpha=c(0.4)
                )
ggTableDrawer(debug.spec)
#######################################################################################################

l
zd

 

So far the table drawer works fine for medium sized data frames. The following pdf file contains further implementations :

ze

 

 

By and large it seems to work fine. Good enough for a quick table here and there.

The previous post ended with the creation of a matrix that contains an unlisted version of the selection criteria for each fund & model combination. From this object let’s create three data frames which we then manipulate and use as data inputs to the ggplot function.

[6] Make a facet plot of Selection Criteria across non-problematic funds

Here we simply make a dodged barchart plot for both AIC and SBC criteria across the 9 remaining non-problematic funds.

#ggplot across all funds
df.1 <- data.frame(Fund=rep(subfund.names,n.models),AIC=as.numeric(temp[,1]),SBC=as.numeric(temp[,2]),Model=temp[,5],stringsAsFactors=T)
df.1<-gather(df.1,Selection.Criteria,values,-c(Fund,Model))
miny <- min(df.1[,'values'])
maxy <- max(df.1[,'values'])

all.plot <- ggplot(data = df.1)+
	 						geom_bar(aes(x=Model,y=values,fill=Selection.Criteria),stat = "identity",position='dodge')+
	 						scale_fill_brewer(palette='Set1')+
	 						theme(axis.text.x = element_text(angle=90, hjust=1),legend.position='bottom',legend.key.size=unit(0.2, "cm"),legend.key.width=unit(2,'cm'),legend.direction='horizontal')+
	 						labs(title='Comparison of Selection Information for Lognormal, Autoregressive, \nand Regime Switching Models across Funds')+
	 						facet_wrap(~Fund,shrink=T)+
	 						coord_cartesian(ylim=c(miny,maxy+1))

l
nnnn

Interpretation of the selection criteria is straightforward : the higher the value the better the model. Although model differences are not very significant within each fund,the values for the selection criteria do show some variation across funds (thankfully). Models seem to do particularly well for the Equity.Market.Neutral and Fixed.Income.Arbitrage cases.

To gain a better view of how individual models performed,let’s take a look at the results for a particular fund & selection criterion combination.

[7] Snap plot function for the chosen fund and selection criterion

The FundSnap() function provides [1] a dodged barchart plot of models ranked according to the selected criterion (AIC,SBC) in descending order as well as [2]  a tabulation of numeric values (Likelihood,degree of freedom,AIC,SBC) with maximum/minimum values highlighted.

########################################################################################
# Fund snap across models
########################################################################################

FundSnap <- function(fund,df.obj,order.by){

	e<-environment()

	tbl.df <- df.obj%>%
		        filter(Fund==fund)%>%
		        select(Model,DF,Likelihood,AIC,SBC)

  temp.sel <- paste('df.obj$',order.by,sep='')

	plot.df <- df.obj%>%
		         arrange(desc(eval(parse(text=temp.sel))))%>%
		         filter(Fund==fund)%>%
		         select(c(Fund,Model,eval(parse(text=order.by))))

#plot
	miny<-min(plot.df[,3])
	maxy <-max(plot.df[,3])

	snap.plot <- ggplot(data = plot.df,environment=e)+theme_tufte()+
	 					 			geom_bar(aes(x=reorder(Model,desc(eval(parse(text=order.by)))),y=eval(parse(text=order.by)),fill=(eval(parse(text=order.by)))),stat = "identity",position='dodge',alpha=0.4)+
	 	              scale_fill_continuous(low="lightgreen", high="darkgreen")+
		              coord_cartesian(ylim=c(miny-1,maxy+1))+
		              theme(legend.position='none',legend.key.size=unit(0.2, "cm"),legend.key.width=unit(2,'cm'),legend.direction='horizontal',axis.text.x = element_text(angle=90, hjust=1),panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
		              labs(title=paste('Snapshot for : ',fund,' & ',order.by,'\nacross models',sep=''))+
		              xlab('Models')+
		              ylab('Values')

#table
	tbl.df$Model <- as.character(tbl.df$Model)
	tbl.df$Likelihood <- round(tbl.df$Likelihood,2)
	tbl.df$AIC <- round(tbl.df$AIC,2)
	tbl.df$SBC <- round(tbl.df$SBC,2)

	parse.str <- paste('tbl.df$',order.by,sep='')
  row.max.idx <- which(eval(parse(text=parse.str))==max(eval(parse(text=parse.str))))
  row.min.idx <- which(eval(parse(text=parse.str))==min(eval(parse(text=parse.str))))

	debug.spec <- ggTableSpec(columns.exist=T,columns.txt=colnames(tbl.df),columns.font=rep('bold',ncol(tbl.df)),columns.col='blue',columns.fill='grey',columns.alpha=0.7,
		                      rows.exist=F,rows.txt=paste('row',1:10,sep=''),rows.font=rep('bold',10),rows.col='green',rows.fill='grey',rows.alpha=0.7,
		                      data.obj=tbl.df,
		                    	hlt.col.exist=T,hl.col.which=c(row.min.idx,row.max.idx),hl.col.fill=c('lightgreen','darkgreen'),hl.col.alpha=c(0.4,0.4),
		                    	hlt.row.exist=F,hl.row.which=row.idx,hl.row.fill='skyblue',hl.row.alpha=0.45
                )
  tbl.plot <- ggTableDrawer(debug.spec)

	layOut(
		     list(snap.plot,1,1),
		     list(tbl.plot,2,1)
	)
}
########################################################################################

ll

Implementation is simple once we have the proper data frame in place:

#ggplot for specific funds
df.2 <- data.frame(Fund=rep(subfund.names,n.models),DF=as.numeric(temp[,3]),Likelihood=as.numeric(temp[,4]),AIC=as.numeric(temp[,1]),SBC=as.numeric(temp[,2]),Model=temp[,5],stringsAsFactors=T)
FundSnap(fund='Emerging.Markets',df.obj=df.2,order.by='SBC')

l
hhhh

Above shows the case of the Emerging.Markets & SBC combination.The AR.GARCH model provides the maximal SBC value while the ILN.Reg3 model is least useful here.

[8] Percentage of funds for which a model has maximal AIC

The final plot is a bar chart plot with text annotations showing how often a particular model is the best performer across the remaining non problematic funds.

#Percentages
df.3 <- data.frame(Fund=rep(subfund.names,n.models),AIC=as.numeric(temp[,1]),Model=temp[,5],stringsAsFactors=T)
df.3 <- spread(df.3,Model,AIC)

col.idx <- apply(df.3,1,function(x) which(x==max(x[-1]),arr.ind=TRUE))
col.names <- colnames(df.3)[col.idx]

df.percentages <- df.3%>%
										mutate(Best.Model=col.names)%>%
                    group_by(Best.Model)%>%
                    summarise(count=n()/length(df.3)*100)

perc.plot <- ggplot(data = df.percentages)+
	 							geom_bar(aes(x=Best.Model,y=count,fill=Best.Model),stat = "identity",position='dodge')+
	 							scale_fill_brewer(palette='Set1')+
	 							theme(axis.text.x = element_blank(),legend.position='bottom',legend.key.size=unit(0.2, "cm"),legend.key.width=unit(2,'cm'),legend.direction='horizontal')+
	 							labs(title='Best performing models')+
                geom_text(aes(x=Best.Model,y=count/2,label=paste(round(count,2),'%',sep=''),fontface=rep('bold',4)),size=rel(4))+
	 							xlab('')+
	 							ylab('Percentage')

ll

gh

It appears that the AR.GARCH and ILN.Reg3 models provide the highest AIC values a third of the time.The simplest model,the independent log normal model fares relatively badly in comparison. Complicated models do seem to work better in our data.

Follow

Get every new post delivered to your Inbox.