I'm looking for a very simple Java implementation of user-based collaborative filtering. I would like to evaluate the precision and recall of this CF on the MovieLens dataset. I've seen that the performance (F1) should be around 20 to 30% (with Pearson similarity and KNN).
Does such a simple framework exist, including code to evaluate precision and recall?
Apache Mahout does everything you mention here. It is Java-based, and supports user-based collaborative filtering (among others) with GenericUserBasedRecommender. It is a k-nearest-neighbor algorithm, into which you can plug similarity implementations like PearsonCorrelationSimilarity and others.
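To make the k-nearest-neighbor idea concrete, here is a minimal plain-Java sketch of Pearson similarity over co-rated items and of picking the k most similar users. It is not Mahout's code: the class UserKnnSketch, its method names, and the nested-map rating layout (user ID to item ID to rating) are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Mahout.
class UserKnnSketch {

    /** Pearson correlation over the items both users rated; 0.0 when it is undefined. */
    static double pearson(Map<Integer, Double> a, Map<Integer, Double> b) {
        List<Integer> common = new ArrayList<>();
        for (Integer item : a.keySet()) {
            if (b.containsKey(item)) {
                common.add(item);
            }
        }
        int n = common.size();
        if (n < 2) {
            return 0.0; // too little overlap to correlate
        }
        double meanA = 0.0;
        double meanB = 0.0;
        for (Integer item : common) {
            meanA += a.get(item);
            meanB += b.get(item);
        }
        meanA /= n;
        meanB /= n;
        double cov = 0.0;
        double varA = 0.0;
        double varB = 0.0;
        for (Integer item : common) {
            double da = a.get(item) - meanA;
            double db = b.get(item) - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        if (varA == 0.0 || varB == 0.0) {
            return 0.0; // one user gave identical ratings to all co-rated items
        }
        return cov / Math.sqrt(varA * varB);
    }

    /** IDs of the (at most) k users most similar to the target, most similar first. */
    static List<Integer> nearestNeighbors(int target, int k,
                                          Map<Integer, Map<Integer, Double>> ratings) {
        Map<Integer, Double> targetRatings = ratings.get(target);
        Map<Integer, Double> sims = new HashMap<>();
        for (Map.Entry<Integer, Map<Integer, Double>> e : ratings.entrySet()) {
            if (e.getKey() != target) {
                sims.put(e.getKey(), pearson(targetRatings, e.getValue()));
            }
        }
        List<Integer> ids = new ArrayList<>(sims.keySet());
        ids.sort(Comparator.comparingDouble((Integer id) -> sims.get(id)).reversed());
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }
}
```

A recommender built on this would then score unseen items from the neighbors' ratings, typically weighted by similarity. Mahout handles that step, plus many edge cases this sketch ignores.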
Look at the org.apache.mahout.cf.taste package and subpackages. In the .impl.eval subpackage find GenericRecommenderIRStatsEvaluator. This will run a test that reports precision, recall and F1.
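As a rough sketch of how these pieces can be wired together, something like the following should work with the Mahout Taste classes on the classpath. The file path ml-100k/u.data, the neighborhood size of 50, the cutoff of 10 recommendations, and the class name MovieLensIRStats are all assumptions chosen for illustration, not recommended settings.

```java
import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.IRStatistics;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

// Hypothetical class name; requires the Mahout Taste library on the classpath.
public class MovieLensIRStats {

    public static void main(String[] args) throws Exception {
        // Assumption: a local copy of the MovieLens 100K ratings file, with one
        // "userID itemID rating timestamp" record per line (tab-delimited).
        DataModel model = new FileDataModel(new File("ml-100k/u.data"));

        // The evaluator rebuilds the recommender on its own training data,
        // so it takes a builder rather than a ready-made recommender.
        RecommenderBuilder builder = new RecommenderBuilder() {
            @Override
            public Recommender buildRecommender(DataModel dataModel) throws TasteException {
                UserSimilarity similarity = new PearsonCorrelationSimilarity(dataModel);
                // 50 nearest neighbors is an arbitrary illustrative choice.
                UserNeighborhood neighborhood =
                        new NearestNUserNeighborhood(50, similarity, dataModel);
                return new GenericUserBasedRecommender(dataModel, neighborhood, similarity);
            }
        };

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();
        // Precision and recall "at 10"; CHOOSE_THRESHOLD lets the evaluator pick a
        // per-user relevance threshold; 1.0 means all users are evaluated.
        IRStatistics stats = evaluator.evaluate(
                builder, null, model, null, 10,
                GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);

        System.out.println("Precision: " + stats.getPrecision());
        System.out.println("Recall:    " + stats.getRecall());
        System.out.println("F1:        " + stats.getF1Measure());
    }
}
```

Evaluating every user can be slow on larger datasets. Lowering the last argument (for example to 0.1) evaluates only a sample of users.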
Finally, there are already some working examples based on GroupLens in mahout-examples.