Top "Text-processing" questions

Mechanizing the creation or manipulation of electronic text.

BLEU score implementation for sentence similarity detection

I need to calculate the BLEU score to identify whether two sentences are similar or not. I have read some articles …

java algorithm nlp text-processing machine-translation
How to get Git log with short stat in one line?

The following command outputs the following lines of text on the console: git log --pretty=format:"%h;%ai;%s" --shortstat ed6e0ab;2014…

git shell text-processing git-log text-manipulation
Which function should I use to read unstructured text file into R?

This is my first ever question here and I'm new to R, trying to figure out my first step in …

r text-processing file-read readlines
Python: How to loop through blocks of lines

How to go through blocks of lines separated by an empty line? The file looks like the following: ID: 1 Name: …

python text-processing
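
One possible approach to the question above, as a minimal sketch: group consecutive non-empty lines with itertools.groupby. The helper name iter_blocks and the sample records are hypothetical, since the excerpt only shows the start of the file ("ID: 1 Name: …"); the sketch assumes each record is a run of non-empty lines and that blank lines act only as separators.

```python
from itertools import groupby


def iter_blocks(lines):
    """Yield each run of consecutive non-empty lines as a list of strings."""
    stripped = (line.strip() for line in lines)
    # bool("") is False, so groupby alternates between content runs and blank runs.
    for has_content, group in groupby(stripped, key=bool):
        if has_content:
            yield list(group)


# Hypothetical sample data; the real file layout is only partly visible in the excerpt.
sample = "ID: 1\nName: X\n\nID: 2\nName: Y\n"
blocks = list(iter_blocks(sample.splitlines()))
```

For a real file, the same generator can be fed an open file object directly, for example `for block in iter_blocks(open(path)):`, where `path` is whatever file is being read.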
How can I sum values in column based on the value in another column?

I have a text file which is: ABC 50 DEF 70 XYZ 20 DEF 100 MNP 60 ABC 30 I want an output which sums up …

scripting text-processing
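
A minimal sketch of one way to do this, assuming the file holds one "KEY VALUE" pair per line (the excerpt shows the pairs run together, so the line layout is an assumption) and that the values are integers. The function name sum_by_key is hypothetical.

```python
def sum_by_key(lines):
    """Sum the second column, grouped by the value in the first column."""
    totals = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        totals[key] = totals.get(key, 0) + int(value)
    return totals


# The pairs shown in the question excerpt, assumed to be one per line.
data = ["ABC 50", "DEF 70", "XYZ 20", "DEF 100", "MNP 60", "ABC 30"]
totals = sum_by_key(data)
```

A plain dict keeps the keys in first-seen order, so printing `totals.items()` lists the sums in the order the keys first appear in the file.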
Java text classification problem

I have a set of Book objects; class Book is defined as follows: class Book{ String title; ArrayList<tags&…

java machine-learning nlp text-processing classification
Running a macro till the end of text file in Emacs

I have a text file with some sample content as shown here: Sno = 1p Sno = 2p Sno = 3p What I …

emacs macros text-processing
Extract text between two strings repeatedly using sed or awk?

I have a file called 'plainlinks' that looks like this: 13080. ftp://ftp3.ncdc.noaa.gov/pub/data/noaa/999999-94092-2012.…

linux sed awk grep text-processing
User Warning: Your stop_words may be inconsistent with your preprocessing

I am following this document clustering tutorial. As an input I give a txt file which can be downloaded here. …

vectorization text-processing tf-idf stop-words stemming