I have a ton of short stories, each about 500 words long, and I want to sort each one into one of, let's say, 20 categories.
I can hand-classify a bunch of them, but eventually I want to use machine learning to guess the categories. What's the best way to approach this? Is there a standard machine learning approach I should be using? I don't think a decision tree would work well since it's text data... I'm completely new to this field.
Any help would be appreciated, thanks!
A naive Bayes classifier will most likely work for you. The method works like this:
Training:
Decision:
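To make the two phases concrete, here is a minimal sketch of one common variant: multinomial naive Bayes over word counts with add-one (Laplace) smoothing. The smoothing, the whitespace tokenizer, and the names `tokenize`, `train`, and `classify` are my own assumptions for illustration, not necessarily the exact steps meant above.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real tokenizer would
    # also strip punctuation and possibly drop stop words.
    return text.lower().split()


def train(labeled_stories):
    """Training: count stories per category and word occurrences per category.

    labeled_stories is an iterable of (text, category) pairs, i.e. the
    hand-classified examples.
    """
    doc_counts = Counter()              # category -> number of stories
    word_counts = defaultdict(Counter)  # category -> word -> count
    vocab = set()                       # every word seen in training
    for text, category in labeled_stories:
        doc_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab


def classify(text, model):
    """Decision: return the category with the highest log-posterior score."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    words = [w for w in tokenize(text) if w in vocab]  # skip unseen words
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        # log P(category), estimated from the labeled stories
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in words:
            # log P(word | category) with add-one smoothing, so a word
            # never seen in this category does not zero out the score
            count = word_counts[category][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category
```

Usage would look like `model = train(hand_labeled_pairs)` followed by `classify(new_story_text, model)`. Summing logs instead of multiplying raw probabilities avoids numeric underflow, which matters when each story has around 500 words.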