I need to calculate a BLEU score to determine whether two sentences are similar. The articles I have read are mostly about using BLEU to measure machine translation accuracy, but I need a BLEU score that measures the similarity between two sentences in the same language (both sentences are in English). Thanks in anticipation.
For sentence level comparisons, use smoothed BLEU
The standard BLEU score used for machine translation evaluation (BLEU:4) is only really meaningful at the corpus level, since any sentence that does not have at least one 4-gram match will be given a score of 0.
This happens because, at its core, BLEU is really just the geometric mean of n-gram precisions that is scaled by a brevity penalty to prevent very short sentences with some matching material from being given inappropriately high scores. Since the geometric mean is calculated by multiplying together all the terms to be included in the mean, having a zero for any of the n-gram counts results in the entire score being zero.
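To make the zero-score effect concrete, here is a minimal sketch of the combination step only: a geometric mean of precisions scaled by the brevity penalty. The class name, method names, and the example precision values are illustrative assumptions, not taken from any particular library.

```java
public class BleuFromPrecisions {
    // Brevity penalty: 1 when the candidate is longer than the reference,
    // otherwise exp(1 - referenceLen / candidateLen).
    public static double brevityPenalty(int candidateLen, int referenceLen) {
        if (candidateLen == 0) {
            return 0.0;
        }
        if (candidateLen > referenceLen) {
            return 1.0;
        }
        return Math.exp(1.0 - (double) referenceLen / candidateLen);
    }

    // Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    public static double bleu(double[] precisions, int candidateLen, int referenceLen) {
        double logSum = 0.0;
        for (double p : precisions) {
            if (p == 0.0) {
                return 0.0; // a single zero precision zeroes the whole product
            }
            logSum += Math.log(p);
        }
        return brevityPenalty(candidateLen, referenceLen) * Math.exp(logSum / precisions.length);
    }

    public static void main(String[] args) {
        // Hypothetical 1- to 4-gram precisions for a 6-word candidate and a 6-word reference.
        double[] noFourGramMatch = {5.0 / 6, 3.0 / 5, 1.0 / 4, 0.0};
        double[] oneFourGramMatch = {5.0 / 6, 3.0 / 5, 2.0 / 4, 1.0 / 3};
        System.out.println(bleu(noFourGramMatch, 6, 6));  // collapses to zero
        System.out.println(bleu(oneFourGramMatch, 6, 6)); // non-zero
    }
}
```

The first call returns zero even though most unigrams and bigrams match, which is why the unsmoothed score is a poor fit for single sentences.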
If you want to apply BLEU to individual sentences, you're better off using smoothed BLEU (Lin and Och 2004 - see sec. 4), whereby you add 1 to each of the n-gram counts before you calculate the n-gram precisions. This will prevent any of the n-gram precisions from being zero, and thus will result in non-zero values even when there are not any 4-gram matches.
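As a rough illustration of the add-one idea, the sketch below computes a sentence-level score from two whitespace-tokenized strings. It is a simplified reading of the description above, not a reproduction of the paper or of Phrasal: it assumes a single reference, lowercases the input, and adds 1 to both the matched count and the total count for every n-gram order. The class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class SmoothedBleu {
    // Count every n-gram of the given order; the key is the tokens joined by spaces.
    public static Map<String, Integer> ngramCounts(String[] tokens, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + n <= tokens.length; i++) {
            String gram = String.join(" ", Arrays.copyOfRange(tokens, i, i + n));
            counts.merge(gram, 1, Integer::sum);
        }
        return counts;
    }

    // Sentence-level BLEU up to maxN-grams; smooth=true applies add-one smoothing.
    public static double score(String candidate, String reference, int maxN, boolean smooth) {
        String[] cand = candidate.trim().toLowerCase().split("\\s+");
        String[] ref = reference.trim().toLowerCase().split("\\s+");
        double logSum = 0.0;
        for (int n = 1; n <= maxN; n++) {
            Map<String, Integer> refCounts = ngramCounts(ref, n);
            int matched = 0;
            int total = 0;
            for (Map.Entry<String, Integer> e : ngramCounts(cand, n).entrySet()) {
                total += e.getValue();
                // Clip each candidate n-gram count by its count in the reference.
                matched += Math.min(e.getValue(), refCounts.getOrDefault(e.getKey(), 0));
            }
            double numerator = smooth ? matched + 1 : matched;
            double denominator = smooth ? total + 1 : total;
            if (numerator == 0.0 || denominator == 0.0) {
                return 0.0; // unsmoothed score collapses when an order has no match
            }
            logSum += Math.log(numerator / denominator);
        }
        double bp = cand.length > ref.length
                ? 1.0
                : Math.exp(1.0 - (double) ref.length / cand.length);
        return bp * Math.exp(logSum / maxN);
    }

    public static void main(String[] args) {
        String a = "the cat sat on the mat";
        String b = "the cat is on the mat";
        System.out.println(score(a, b, 4, false)); // no shared 4-gram, so zero
        System.out.println(score(a, b, 4, true));  // smoothed, so non-zero
    }
}
```

With these two sentences there is no shared 4-gram, so the unsmoothed call returns zero while the smoothed call returns a usable value between 0 and 1.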
Java Implementation
You'll find a Java implementation of both BLEU and smoothed BLEU in the Stanford machine translation package Phrasal.
Alternatives
As Andreas already mentioned, you might want to use an alternative scoring metric such as Levenshtein's string edit distance. However, one problem with using the traditional Levenshtein string edit distance to compare sentences is that it operates on characters and so isn't explicitly aware of word boundaries.
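One way around the word-boundary issue, offered here as an assumption rather than something prescribed above, is to run the same dynamic program over word tokens instead of characters. The sketch below splits on whitespace and counts word insertions, deletions, and substitutions; the class and method names are hypothetical.

```java
public class WordEditDistance {
    // Levenshtein distance computed over whitespace-separated words.
    public static int distance(String a, String b) {
        String[] x = a.trim().split("\\s+");
        String[] y = b.trim().split("\\s+");
        int[] prev = new int[y.length + 1];
        int[] curr = new int[y.length + 1];
        for (int j = 0; j <= y.length; j++) {
            prev[j] = j; // turning an empty prefix into j words takes j insertions
        }
        for (int i = 1; i <= x.length; i++) {
            curr[0] = i; // turning i words into an empty prefix takes i deletions
            for (int j = 1; j <= y.length; j++) {
                int substitution = prev[j - 1] + (x[i - 1].equals(y[j - 1]) ? 0 : 1);
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                curr[j] = Math.min(substitution, Math.min(insertion, deletion));
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[y.length];
    }

    public static void main(String[] args) {
        // One word differs ("sat" vs "is"), so the word-level distance is 1.
        System.out.println(distance("the cat sat on the mat", "the cat is on the mat"));
    }
}
```

Dividing the result by the length of the longer sentence gives a rough normalized dissimilarity, if a value between 0 and 1 is more convenient.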
Other alternatives include: