Top "Text-processing" questions

Mechanizing the creation or manipulation of electronic text.

How to find text files not containing text on Linux?

How do I find files not containing some text on Linux? Basically I'm looking for the inverse of the following …

linux find text-processing
Reading text values into MATLAB variables from ASCII files

Consider the following file:
var1 var2 variable3
1 2 3
11 22 33
I would like to load the numbers into a matrix, and the column …

matlab text file-io text-files text-processing
Expanding English language contractions in Python

The English language has a couple of contractions. For instance: you've -> you have, he's -> he is …

python nlp text-processing
Algorithms to detect phrases and keywords from text

I have around 100 megabytes of text, without any markup, divided into approximately 10,000 entries. I would like to automatically generate a …

algorithm nlp text-processing
How does uʍop-ǝpᴉsdn text work?

Here's a website I found that will produce upside-down versions of any English text. How does it work? Does …

unicode text-processing
Summarize text or simplify text

Is there any library, preferably in Python but at least open source, that can summarize and/or simplify natural-language text?

python nlp text-processing
Is there still any reason to learn AWK?

I am constantly learning new tools, even old-fashioned ones, because I like to use the right solution for the …

awk text-processing
Python: Best way to remove duplicate characters from a string

How can I remove duplicate characters from a string using Python? For example, let's say I have a string: foo = "…

python string text-processing
Using SQL to determine word count stats of a text field

I've recently been working on some database search functionality and wanted to get some information like the average words per …

mysql sql text-processing word-count
NLTK for Named Entity Recognition

I am trying to use the NLTK toolkit to extract place, date, and time from text messages. I just installed …

machine-learning nlp nltk text-processing named-entity-recognition