I need the weighted sum of each column of a matrix.
data <- matrix(1:2e7,1e7,2) # warning large number, will eat up >100 megs of memory
weights <- 1:1e7/1e5
system.time(colSums(data*weights))
system.time(apply(data,2,function(x) sum(x*weights)))
all.equal(colSums(data*weights), apply(data,2,function(x) sum(x*weights)))
Typically colSums(data*weights) is faster than the apply call.
I do this operation often (on a large matrix), so I am looking for advice on the most efficient implementation. Ideally, it would be great if we could pass weights to colSums (or rowSums).
Thanks, appreciate any insights!
colSums and * are both internal or primitive functions and will be much faster than the apply approach.
Another approach you could try is some basic matrix algebra, since what you are looking for is:
weights %*% data
The matrix multiplication method does not appear to be faster, but it avoids creating the temporary data * weights object, which is the size of data.
system.time({.y <- colSums(data * weights)})
## user system elapsed
## 0.12 0.03 0.16
system.time({.x <- weights %*% data})
## user system elapsed
## 0.20 0.05 0.25
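Note that weights %*% data returns a 1 x 2 matrix rather than a plain vector, so the two results are not directly interchangeable. A minimal sketch of a wrapper follows, assuming you want the same shape that colSums returns. The name weighted_col_sums and the small test data are made up for illustration; crossprod and drop are base R functions.

```r
# Hypothetical helper (not part of base R): weighted column sums via crossprod().
weighted_col_sums <- function(data, weights) {
  stopifnot(length(weights) == nrow(data))
  # crossprod(weights, data) computes t(weights) %*% data, a 1 x ncol(data) matrix;
  # drop() removes the length-one dimension so the result is a plain vector.
  drop(crossprod(weights, data))
}

# Small illustrative inputs so the comparison runs instantly.
small_data <- matrix(1:6, nrow = 3, ncol = 2)
small_weights <- c(0.5, 1, 2)

via_colsums <- colSums(small_data * small_weights)
via_matmul  <- drop(small_weights %*% small_data)
via_helper  <- weighted_col_sums(small_data, small_weights)

all.equal(via_colsums, via_matmul)
all.equal(via_colsums, via_helper)
```

Whether this beats colSums(data * weights) on your machine depends on the BLAS R is linked against, so it is worth timing both on your real data.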