Fastest way to take the weighted sum of the columns of a matrix in R

Anirban · Nov 8, 2012 · Viewed 8.4k times

I need the weighted sum of each column of a matrix.

data <- matrix(1:2e7,1e7,2) # warning: large matrix, will use >100 MB of memory
weights <- 1:1e7/1e5
system.time(colSums(data*weights))
system.time(apply(data,2,function(x) sum(x*weights)))
all.equal(colSums(data*weights), apply(data,2,function(x) sum(x*weights)))

Typically colSums(data*weights) is faster than the apply call.

I do this operation often (on a large matrix), so I am looking for advice on the most efficient implementation. Ideally, it would be great if we could pass weights to colSums (or rowSums).

Thanks, appreciate any insights!

Answer

mnel · Nov 8, 2012

colSums and * are both internal or primitive functions, so colSums(data*weights) will be much faster than the apply approach.
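To see why `data * weights` applies one weight per row, here is a minimal sketch on a tiny matrix. The names `m` and `w` and their values are made up for illustration and are not from the question. It relies on R storing matrices in column-major order and recycling the shorter vector, which lines up with the rows only when `length(w)` equals `nrow(m)`.

```r
# Illustrative values only (not the data from the question).
m <- matrix(1:6, nrow = 3, ncol = 2)   # columns are (1,2,3) and (4,5,6)
w <- c(10, 100, 1000)                  # one weight per row

# `*` recycles w down each column, because the matrix is stored
# column by column and length(w) equals nrow(m).
scaled <- m * w
col_weighted <- colSums(scaled)

# The explicit per-column computation, for comparison.
by_apply <- apply(m, 2, function(x) sum(x * w))

print(col_weighted)
print(by_apply)
```

If the weights were meant per column instead, this recycling would silently give the wrong answer, so checking `length(w) == nrow(m)` before multiplying is a cheap safeguard.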

Another approach you could try is some basic matrix algebra, since what you are looking for is

 weights %*% data

The matrix multiplication method does not appear to be faster, but it avoids creating a temporary object the size of data.

system.time({.y <- colSums(data * weights)})
##  user  system elapsed 
##  0.12    0.03    0.16 


system.time({.x <- weights %*% data})
##   user  system elapsed 
##   0.20    0.05    0.25
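As a quick sanity check, the sketch below confirms on a small made-up matrix that the two approaches agree, and it also tries `crossprod(w, m)`, which computes `t(w) %*% m`. The names `m` and `w` are illustrative, and `crossprod` is an addition that was not timed above, so whether it helps on your data is something to measure yourself. Matrix-product timings can also vary with the BLAS library that R is linked against.

```r
# Illustrative values only (not the data from the question).
m <- matrix(1:6, nrow = 3, ncol = 2)
w <- c(10, 100, 1000)

via_colsums <- colSums(m * w)    # numeric vector of length 2
via_matmul  <- w %*% m           # 1 x 2 matrix; w is treated as a row vector
via_cross   <- crossprod(w, m)   # t(w) %*% m, also a 1 x 2 matrix

# drop() removes the length-one dimension so the results compare as vectors.
print(all.equal(via_colsums, drop(via_matmul)))
print(all.equal(via_colsums, drop(via_cross)))
```

Note that the matrix-product forms return a 1 x 2 matrix rather than a plain vector, so wrap them in `drop()` or `as.vector()` if downstream code expects the same shape as `colSums`.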