Computing F-measure for clustering

mahesh cs · Oct 4, 2012 · Viewed 13.4k times

Can anyone help me calculate a single, collective F-measure? I know how to calculate recall and precision, but I don't know how to calculate one F-measure value for a given algorithm.

As an example, suppose my algorithm creates m clusters, but I know there are n clusters for the same data (as created by another benchmark algorithm).

I found a PDF ("F Measure explained"), but it did not help, because the collective value I get from it is greater than 1. I have also read research papers in which the authors compare two algorithms on the basis of F-measure, and their collective values lie between 0 and 1. Read carefully, the formula in the PDF is

F(C,K) = Σ_i (|c_i| / N) · max_j F(c_i, k_j)

where c_i is a reference cluster and k_j is a cluster created by the other algorithm, with i running from 1 to n and j running from 1 to m. Say |c_1| = 218 and, as per the PDF, N = m*n with m = 12 and n = 10, and the maximum of F(c_1, k_j) occurs at j = 2. F(c_1, k_2) is certainly between 0 and 1, but the value computed by the formula above comes out greater than 1.

Answer

Has QUIT--Anony-Mousse · Oct 4, 2012

The term f-measure itself is underspecified. It's the harmonic mean, usually of precision and recall. Actually you should even say F1-score if you mean the unweighted version, because you can put different weight on the two input values. But without saying which two values are averaged (not in the sense of the arithmetic mean!) this doesn't say much.

https://en.wikipedia.org/wiki/F1_score
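To make the weighting concrete, here is a minimal sketch of the weighted harmonic mean (F-beta), where beta = 1 gives the unweighted F1-score. The function name `f_beta` and the convention of returning 0 when both inputs are 0 are assumptions made for this illustration, not something taken from the question or the linked page.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (beta=1 is F1).

    beta > 1 puts more weight on recall, beta < 1 on precision.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    for name, value in (("precision", precision), ("recall", recall)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    if precision == 0.0 and recall == 0.0:
        return 0.0  # assumed convention to avoid dividing 0 by 0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)
```

For instance, `f_beta(1.0, 0.5)` is about 0.667, which is lower than the arithmetic mean 0.75 because the harmonic mean is pulled toward the smaller input.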

Note that the values must be in the 0-1 value range. Otherwise, you have an error earlier on.
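One place where such an error can creep in is the normalizer N in the weighted formula from the question, F(C,K) = Σ_i (|c_i| / N) · max_j F(c_i, k_j). With the question's numbers, N = m*n = 120 and |c_1| = 218, the weight of the first cluster alone is 218/120 ≈ 1.8, so the sum is not bounded by 1. The sketch below assumes instead that N is the total number of data points, so the weights |c_i| / N sum to 1 and the result stays in the 0-1 range. The PDF is not available here, so that reading of N is an assumption, and the function name `set_matching_f` is made up for this example.

```python
from collections import Counter

def set_matching_f(reference, found):
    """Weighted best-match F1 between a reference and a found clustering.

    `reference` and `found` are equal-length sequences of cluster labels,
    one label per data point. N is ASSUMED to be the number of data points.
    """
    if len(reference) != len(found):
        raise ValueError("both labelings must cover the same points")
    n_points = len(reference)
    ref_sizes = Counter(reference)            # |c_i|
    found_sizes = Counter(found)              # |k_j|
    overlap = Counter(zip(reference, found))  # |c_i ∩ k_j| for pairs that overlap

    best = {c: 0.0 for c in ref_sizes}        # max_j F1(c_i, k_j)
    for (c, k), n_ck in overlap.items():
        precision = n_ck / found_sizes[k]
        recall = n_ck / ref_sizes[c]
        f1 = 2 * precision * recall / (precision + recall)
        best[c] = max(best[c], f1)

    # Weights |c_i| / N sum to 1, so the total cannot exceed 1.
    return sum(ref_sizes[c] / n_points * best[c] for c in ref_sizes)
```

With `reference = [0, 0, 0, 1, 1, 1]` and `found = [0, 0, 1, 1, 2, 2]`, each reference cluster has a best match with F1 = 0.8, and the weighted total is about 0.8. The number of clusters in the two labelings does not need to agree.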

In cluster analysis, the common approach is to apply the F1-Measure to the precision and recall of pairs, often referred to as "pair counting f-measure". But you could compute the same mean on other values, too.

Pair-counting has the nice property that it doesn't directly compare clusters, so the result is well defined when one result has m clusters and the other has n clusters. However, pair counting needs strict partitions. When elements are not clustered or are assigned to more than one cluster, the pair-counting measures can easily go out of the range 0-1.
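A minimal sketch of the pair-counting F-measure, assuming both results are strict partitions given as one label per data point. The function name `pair_counting_f1`, the choice of treating the first labeling as the reference, and the convention of returning 0 when no pair is grouped together in both results are assumptions made for this illustration.

```python
from itertools import combinations

def pair_counting_f1(reference, found):
    """F1 over pairs of points: a pair is 'positive' if both points share a cluster."""
    if len(reference) != len(found):
        raise ValueError("both labelings must cover the same points")
    tp = fp = fn = 0
    for i, j in combinations(range(len(reference)), 2):
        same_ref = reference[i] == reference[j]
        same_found = found[i] == found[j]
        if same_ref and same_found:
            tp += 1   # paired in both results
        elif same_found:
            fp += 1   # paired only in the found clustering
        elif same_ref:
            fn += 1   # paired only in the reference
    if tp == 0:
        return 0.0    # assumed convention when precision or recall is undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With `reference = [0, 0, 0, 1, 1, 1]` and `found = [0, 0, 1, 1, 2, 2]`, there are 2 pairs grouped in both results, 1 grouped only in `found`, and 4 grouped only in `reference`, giving precision 2/3, recall 1/3, and F1 = 4/9. The loop visits every pair, so it is quadratic in the number of points; for large data sets the same counts can be derived from a contingency table instead.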

Discusses some of these metrics (including Rand index and such) and gives a simple explanation of the "pair counting F-measure".