Using the tidytext package, I want to transform my tibble into the one-token-per-document-per-row format. I converted the text column of my tibble from factor to character, but I still get the error shown below.
text_df <- tibble(line = 1:3069, text = text)
My tibble looks like this, with the text column stored as character:
# A tibble: 3,069 x 2
line text$text
<int> <chr>
However, when I try to apply unnest_tokens:
text_df %>%
unnest_tokens(word, text$text)
I always get the same error:
Error in check_input(x) : Input must be a character vector of any length or a list of character vectors, each of which has a length of 1.
What is the issue in my code?
PS: I've looked at different posts on the topic but no luck.
Thank you
At least part of the problem is that the variable name contains a "$". Your code is effectively trying to extract the element "text" from the object "text", which is likely the function graphics::text and is not subsettable.
Change the name of "text$text" or wrap it in backticks:
text_df %>%
unnest_tokens(word, `text$text`)
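If you would rather rename the column than keep typing backticks, here is a minimal sketch. The two example sentences and the name tidy_df are made up for illustration, and the sketch assumes the column really is named "text$text":

```r
library(dplyr)
library(tibble)
library(tidytext)

# Toy data standing in for the real tibble (assumption: the column
# is literally named "text$text", as in the printed output).
text_df <- tibble(
  line = 1:2,
  `text$text` = c("the quick brown fox", "jumps over the lazy dog")
)

# Rename the awkward column once, then tokenize without backticks.
tidy_df <- text_df %>%
  rename(text = `text$text`) %>%
  unnest_tokens(word, text)

print(tidy_df)
```

After the rename, every later step can refer to the column as plain text.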
In general, you should avoid special characters in variable names, because they lead to errors like this one.
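To illustrate the point about special characters, here is a small base R sketch. The data frame df and its contents are made up; make.names() is the base R helper that turns arbitrary strings into syntactically valid names:

```r
# A data frame with a non-syntactic column name (check.names = FALSE
# stops data.frame() from repairing the name automatically).
df <- data.frame(
  line = 1:2,
  `text$text` = c("a b", "c d"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# The column can only be reached with backticks or a quoted string.
with_backticks <- df$`text$text`
with_string <- df[["text$text"]]

# make.names() replaces invalid characters with ".", giving names
# that can be used without any quoting.
names(df) <- make.names(names(df))
print(names(df))
```

Once the names are syntactic, df$text.text works without any quoting.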
If your problem persists, please provide a minimal reproducible example (see "How to make a great R reproducible example").
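One more thing worth checking before posting an example, offered as a guess rather than a diagnosis: a header like text$text can also appear when a whole data frame has been stored in a single column, in which case no column is actually named "text$text". The sketch below uses base R only, and the objects text, nested_df and flat_df are hypothetical stand-ins for your data:

```r
# Hypothetical stand-in for the original `text` object: a data frame
# with a single character column that is also called text.
text <- data.frame(text = c("a b", "c d"), stringsAsFactors = FALSE)

# Storing the whole data frame in one column gives a nested column.
nested_df <- data.frame(line = 1:2)
nested_df$text <- text

print(names(nested_df))                # the column itself is named "text"
print(is.data.frame(nested_df$text))   # the column holds a data frame

# Passing the character vector instead gives a plain character column.
flat_df <- data.frame(line = 1:2, text = text$text, stringsAsFactors = FALSE)
print(is.character(flat_df$text))
```

If names(text_df) shows "text" rather than "text$text", building the tibble from text$text instead of text may be the simpler fix.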