tapply function complains that args are unequal length yet they appear to match

gregbowman · Apr 25, 2014

Here is the failing call, error messages and some displays to show the lengths in question:

it <- tapply(molten, c(molten$Activity, molten$Subject, molten$variable), mean)
# Error in tapply(molten, c(molten$Activity, molten$Subject, molten$variable),  : 
#  arguments must have same length

length(molten$Activity)
# [1] 679734

length(molten$Subject)
# [1] 679734

length(molten$variable)
# [1] 679734

dim(molten)
# [1] 679734      4

str(molten)
# 'data.frame': 679734 obs. of  4 variables:
#  $ Activity: Factor w/ 6 levels "WALKING","WALKING_UPSTAIRS",..: 5 5 5 5 5 5 5 5 5 5 ...
#  $ Subject : Factor w/ 30 levels "1","2","3","4",..: 2 2 2 2 2 2 2 2 2 2 ...
#  $ variable: Factor w/ 66 levels "tBodyAcc-mean()-X",..: 1 1 1 1 1 1 1 1 1 1 ...
#  $ value   : num  0.257 0.286 0.275 0.27 0.275 ...

Answer

Henrik · Apr 25, 2014

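If you look at ?tapply, you will see that X should be "an atomic object, typically a vector". You are passing tapply a data frame (molten), which is not an atomic object; see ?is.atomic and try is.atomic(molten). The lengths that tapply compares are also not the ones you printed: length(molten) is the number of columns (4), and c(molten$Activity, molten$Subject, molten$variable) concatenates the three factors into a single vector of 3 × 679734 = 2039202 elements, so the two do not match. The grouping variables should instead be supplied as a list (see the INDEX argument), and X should be the numeric column, molten$value.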

Something like this works:

tapply(X = warpbreaks$breaks, INDEX = list(warpbreaks$wool, warpbreaks$tension), FUN = mean)
#          L        M        H
# A 44.55556 24.00000 24.55556
# B 28.22222 28.77778 18.77778
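
Applied to the data in the question, the same pattern might look like the sketch below. The six-row molten data frame here is a made-up stand-in with the same column names as the original (the values are hypothetical), so that the example runs on its own; the first two calls show the lengths that tapply actually compares.

```r
# Hypothetical six-row stand-in for the asker's `molten` data frame.
molten <- data.frame(
  Activity = factor(c("WALKING", "WALKING", "WALKING",
                      "SITTING", "SITTING", "SITTING")),
  Subject  = factor(c("1", "1", "2", "1", "2", "2")),
  variable = factor(rep("tBodyAcc-mean()-X", 6)),
  value    = c(0.2, 0.4, 0.6, 0.1, 0.3, 0.5)
)

# The length of a data frame is its number of columns, not rows.
length(molten)
# [1] 4

# c() joins the three factors end to end: 3 * nrow(molten) elements.
length(c(molten$Activity, molten$Subject, molten$variable))
# [1] 18

# Pass the numeric column as X and the grouping factors as a list.
it <- tapply(X = molten$value,
             INDEX = list(molten$Activity, molten$Subject, molten$variable),
             FUN = mean)

# One array dimension per grouping factor.
dim(it)
# [1] 2 2 1
```

The result is a three-dimensional array indexed by Activity, Subject and variable, so a single cell can be read with it["WALKING", "1", "tBodyAcc-mean()-X"]. Combinations that do not occur in the data come back as NA.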