Mean by factor by level

Bartek Taciak · Apr 30, 2014 · Viewed 75.8k times

Maybe this is simple, but I can't find an answer on the web. I have a problem calculating the mean by factor level. My data typically looks like this:

factor, value
a,1
a,2
b,1
b,1
b,1
c,1

I want to get a vector A that contains the mean only for level "a". If I type A in the console, I want to get 1.5. The method for calculating the mean must use factors.

Thank you in advance for help.

Answer

JPC · Apr 30, 2014

Take a look at tapply, which lets you split a vector according to one or more factors and apply a function to each subset:

> dat<-data.frame(factor=sample(c("a","b","c"), 10, TRUE), value=rnorm(10))
> r1<-with(dat, tapply(value, factor, mean))
> r1
         a          b          c
 0.3877001 -0.4079463 -1.0837449
> r1[["a"]]
[1] 0.3877001

You can access your results using r1[["a"]] etc.
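
Applied to the sample data from the question, the same tapply call could look like the sketch below. The data frame is typed in by hand from the question's six rows, and the names `means` and `A` are only illustrative.

```r
# The six rows from the question; the column is named `factor` as in the post.
dat <- data.frame(
  factor = factor(c("a", "a", "b", "b", "b", "c")),
  value  = c(1, 2, 1, 1, 1, 1)
)

# Mean of `value` within each level of `factor`
means <- with(dat, tapply(value, factor, mean))

# Keep only the mean for level "a"
A <- means[["a"]]
print(A)  # [1] 1.5
```

Typing A at the console then prints 1.5, which is the value the question asks for.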

Alternatively, the popular plyr package has very nice ways of doing this. Its ddply function splits a data frame by the given variables and returns the summaries as a data frame:

> library(plyr)
> r2<-ddply(dat, .(factor), summarize, mean=mean(value))
> r2
  factor       mean
1      a  0.3877001
2      b -0.4079463
3      c -1.0837449
> subset(r2,factor=="a",select="mean")
       mean
1 0.3877001
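
If installing plyr is not an option, a similar data frame of per-level means can be built in base R. This is a sketch using aggregate on the question's sample data, not something the answer above proposes. The name `agg` is only illustrative.

```r
# The six rows from the question
dat <- data.frame(
  factor = c("a", "a", "b", "b", "b", "c"),
  value  = c(1, 2, 1, 1, 1, 1)
)

# One row per level of `factor`, with the mean of `value` in the `value` column
agg <- aggregate(value ~ factor, data = dat, FUN = mean)

# Pull out the mean for level "a"
A <- agg$value[agg$factor == "a"]
print(A)  # [1] 1.5
```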

You can also use dlply, which takes a data frame and returns a list:

> dlply(dat, .(factor), summarize, mean=mean(value))$a
       mean
1 0.3877001
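
The split-then-apply idea behind a list result can also be sketched in base R with split and lapply. This only illustrates the pattern on the question's sample data and is not how plyr is implemented. The names `pieces` and `means_list` are assumptions.

```r
# The six rows from the question
dat <- data.frame(
  factor = c("a", "a", "b", "b", "b", "c"),
  value  = c(1, 2, 1, 1, 1, 1)
)

# Split the values into one numeric vector per level (a named list)
pieces <- split(dat$value, dat$factor)

# Apply mean() to each piece; the result is again a named list
means_list <- lapply(pieces, mean)

print(means_list$a)  # [1] 1.5
```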