R error "sum not meaningful for factors"

user2245731 picture user2245731 · Aug 4, 2013 · Viewed 131.7k times · Source

I have a file called rRna_RDP_taxonomy_phylum with the following data :

364  "Firmicutes"            39.31
244  "Proteobacteria"        26.35
218  "Actinobacteria"        23.54
65   "Bacteroidetes"         7.02
22   "Fusobacteria"          2.38
6    "Thermotogae"           0.65
3     unclassified_Bacteria  0.32
2    "Spirochaetes"          0.22
1    "Tenericutes"           0.11
1     Cyanobacteria          0.11

And I'm using this code for creating a pie chart in R:

if(file.exists("rRna_RDP_taxonomy_phylum")){
    family <- read.table ("rRna_RDP_taxonomy_phylum", sep="\t")
    piedat <- rbind(family[1:7, ],
                as.data.frame(t(c(sum(family[8:nrow(family),1]),
                                "Others",
                                sum(family[8:nrow(family),3])))))
    png(file="../graph/RDP_phylum_low.png", width=600, height=550, res=75)
    pie(as.numeric(piedat$V3), labels=piedat$V3, clockwise=TRUE, col=graph_col, main="More representative Phyliums")
    legend("topright", legend=piedat$V2, cex=0.8, fill=graph_col)
    dev.off()
    png(file="../graph/RDP_phylm_high.png", width=1300, height=850, res=75)
    pie(as.numeric(piedat$V3), labels=piedat$V3, clockwise=TRUE, col=graph_col, main="More representative Phyliums")
    legend("topright", legend=piedat$V2, cex=0.8, fill=graph_col)
    dev.off()
}

I've been using this code for different datafiles and it works fine, but with the file presented adobe it crash returning the following message:

Error in Summary.factor(c(6L, 2L, 1L), na.rm = FALSE) : 
  sum not meaningful for factors
Calls: rbind -> as.data.frame -> t -> Summary.factor
Execution halted

I need to understand why it crash with this file and if there's any way to prevent this kind of errors.

Thanks!

Answer

Ricardo Saporta picture Ricardo Saporta · Aug 4, 2013

The error comes when you try to call sum(x) and x is a factor.

What that means is that one of your columns, though they look like numbers are actually factors (what you are seeing is the text representation)

simple fix, convert to numeric. However, it needs an intermeidate step of converting to character first. Use the following:

family[, 1] <- as.numeric(as.character( family[, 1] ))
family[, 3] <- as.numeric(as.character( family[, 3] ))

For a detailed explanation of why the intermediate as.character step is needed, take a look at this question: How to convert a factor to integer\numeric without loss of information?