This has really challenged my ability to debug R code.
I want to use ddply() to apply the same functions to different columns that are sequentially named, e.g. a, b, c. To do this I intend to repeatedly pass the column name as a string and use eval(parse(text=ColName)) to let the function reference it. I grabbed this technique from another answer.
And this works well, until I put ddply() inside another function. Here is the sample code:
# Required packages:
library(plyr)
myFunction <- function(x, y){
    NewColName = "a"
    z = ddply(x, y, summarize,
              Ave = mean(eval(parse(text=NewColName)), na.rm=TRUE)
    )
    return(z)
}
a = c(1,2,3,4)
b = c(0,0,1,1)
c = c(5,6,7,8)
df = data.frame(a,b,c)
sv = c("b")
#This works.
ColName = "a"
ddply(df, sv, summarize,
      Ave = mean(eval(parse(text=ColName)), na.rm=TRUE)
)
#This doesn't work
#Produces error: "Error in parse(text = NewColName) : object 'NewColName' not found"
myFunction(df,sv)
#Output in both cases should be
#   b Ave
# 1 0 1.5
# 2 1 3.5
Any ideas? NewColName is even defined inside the function!
I thought the answer to this question, loops-to-create-new-variables-in-ddply, might help me but I've done enough head banging for today and it's time to raise my hand and ask for help.
Today's solution to this question is to change summarize into here(summarize), e.g.:
myFunction <- function(x, y){
    NewColName = "a"
    z = ddply(x, y, here(summarize),
              Ave = mean(eval(parse(text=NewColName)), na.rm=TRUE)
    )
    return(z)
}
here(f), added to plyr in Dec 2012, captures the current context.
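As far as I understand it, the original version fails because summarize evaluates its expressions in an environment that does not include the body of myFunction, so NewColName is only found when it happens to be a global. If you would rather avoid eval(parse()) altogether, one alternative (not the here() approach above, just a sketch) is to pass ddply() an anonymous function, which sees the column name through ordinary lexical scoping. The names group_mean, piece and col_name below are made up for illustration:

```r
library(plyr)

# Hypothetical helper (not from the original post): per-group mean of one
# column, with the column chosen by name at call time.
group_mean <- function(data, group_vars, col_name) {
  # The anonymous function is a closure, so col_name is visible inside it
  # without eval(parse()) or here().
  ddply(data, group_vars, function(piece) {
    data.frame(Ave = mean(piece[[col_name]], na.rm = TRUE))
  })
}

# Same sample data as in the question.
df <- data.frame(a = c(1, 2, 3, 4), b = c(0, 0, 1, 1), c = c(5, 6, 7, 8))

# Apply the same summary to several columns, one result per column name.
cols <- c("a", "c")
results <- lapply(cols, function(col) group_mean(df, "b", col))
names(results) <- cols

results$a
#   b Ave
# 1 0 1.5
# 2 1 3.5
```

The trade-off is that you give up the compact summarize syntax, but each piece is a plain data frame, so piece[[col_name]] works the same inside or outside a function.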