I'm completely new to R and the tm package, so please excuse my basic question ;-) How can I show the text of a plain text corpus in the R tm package?
I've loaded 323 plain text files into a corpus:
src <- DirSource("Korpora/technologie")
corpus <- Corpus(src)
But when I access a document in the corpus with:
corpus[[1]]
I always get some output like this instead of the corpus text itself:
<<PlainTextDocument>>
Metadata: 7
Content: chars: 144
Content: chars: 141
Content: chars: 224
Content: chars: 75
Content: chars: 105
How can I show the text of the corpus?
Thanks!
UPDATE: Reproducible sample. I've tried it with the built-in sample corpus:
> data("crude")
> crude
<<VCorpus>>
Metadata: corpus specific: 0, document level (indexed): 0
Content: documents: 20
> crude[1]
<<VCorpus>>
Metadata: corpus specific: 0, document level (indexed): 0
Content: documents: 1
> crude[[1]]
<<PlainTextDocument>>
Metadata: 15
Content: chars: 527
How can I print the text of the documents?
UPDATE 2: Session Info:
> sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] tm_0.6-1 NLP_0.1-7
loaded via a namespace (and not attached):
[1] parallel_3.1.3 slam_0.1-32 tools_3.1.3
This works for me, with the latest version of tm, to print the content text:
corpus[[1]]$content
Note: This is more or less what Ricky suggested in the earlier comment. Sorry, I wanted to post this as a comment, but my rep is only 25 (commenting needs a minimum of 50 rep).
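For completeness, here is a small sketch of a few other ways to get at the text, using the built-in crude corpus from the question. It assumes tm 0.6 or later (where each document is a PlainTextDocument and printing it only shows a summary); the variable names doc, first_text, same_text and all_texts are just placeholders for illustration.

```r
library(tm)  # tm depends on NLP, which provides the content() accessor

data("crude")        # built-in sample corpus with 20 documents
doc <- crude[[1]]    # a single PlainTextDocument

# Printing doc only shows a summary, so extract the text explicitly
first_text <- content(doc)       # accessor function, same data as doc$content
same_text  <- as.character(doc)  # character representation of the document

# writeLines() prints the text without quotes or [1] index prefixes
writeLines(first_text)

# Collect the text of every document, one string per document
all_texts <- sapply(crude, function(d) paste(as.character(d), collapse = "\n"))
```

Using content() or as.character() instead of $content avoids relying on how the document object is stored internally, which may matter if a future tm version changes that layout.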