Related questions
Python: tf-idf-cosine: to find document similarity
I was following a tutorial which was available at Part 1 & Part 2. Unfortunately the author didn't have the time for the final section which involved using cosine similarity to actually find the distance between two documents. I followed the examples …
Python's NLTK vs. related Java Libraries?
I've used LingPipe, Stanford's NER, RiTa and various sentence similarity libraries for my previous Java projects that focused on text (pre)processing (indexing, XML tagging, topic detection, etc.) of large amounts of English text (around 10,000 documents summing to > 1 GB …
re.sub erroring with "Expected string or bytes-like object"
I have read multiple posts regarding this error, but I still can't figure it out. When I try to loop through my function:
def fix_Plan(location):
    letters_only = re.sub("[^a-zA-Z]",  # Search for all non-letters
                          " ",  # Replace all non-letters with …