tm_map has parallel::mclapply error in R 3.0.1 on Mac

Dominik · Aug 17, 2013

I am using R 3.0.1 on platform x86_64-apple-darwin10.8.0 (64-bit).

I am trying to use tm_map from the tm package, but when I execute this code:

library(tm)
data('crude')
tm_map(crude, stemDocument)

I get this error:

Warning message:
In parallel::mclapply(x, FUN, ...) :
  all scheduled cores encountered errors in user code

Does anyone know a solution for this?

Answer

nograpes · Aug 17, 2013

I suspect you don't have the SnowballC package installed, which seems to be required. tm_map runs stemDocument on every document using mclapply, and mclapply reports only that generic warning instead of the underlying error. Try running stemDocument on a single document so you can see the actual error:

stemDocument(crude[[1]])

When I ran it, I got this error:

Error in loadNamespace(name) : there is no package called ‘SnowballC’
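
The same idea can be generalized: apply the function to each element serially and report the first error message. The following is only a sketch; first_error is a hypothetical helper written for illustration, not part of tm or parallel.

```r
# Hypothetical helper (not part of tm): call FUN on each element of x
# one at a time and return the first error message encountered,
# or NULL if every call succeeds.
first_error <- function(x, FUN, ...) {
  for (i in seq_along(x)) {
    msg <- tryCatch({
      FUN(x[[i]], ...)
      NULL                      # no error for this element
    }, error = function(e) conditionMessage(e))
    if (!is.null(msg)) return(msg)
  }
  NULL
}

# Usage with the corpus from the question (assumes tm is loaded):
# first_error(crude, stemDocument)
```

Because nothing runs in forked workers here, the message comes back directly rather than being hidden behind the mclapply warning.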

So I just went ahead and installed SnowballC and it worked. Clearly, SnowballC should be a dependency.
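
A minimal sketch of that fix, assuming a CRAN mirror is reachable. ensure_package is a hypothetical helper name introduced here for illustration; it installs a package only if it is not already present.

```r
# Hypothetical helper: install a package only if it is missing, then
# report (invisibly) whether it is now available.
ensure_package <- function(pkg) {
  if (!pkg %in% rownames(installed.packages())) {
    install.packages(pkg)       # may ask for a CRAN mirror
  }
  invisible(pkg %in% rownames(installed.packages()))
}

# Usage for this question (needs network access and the tm package):
# ensure_package("SnowballC")
# library(tm)
# data("crude")
# stemmed <- tm_map(crude, stemDocument)
```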