I want to compute the similarity (distance) between two vectors:
v1 <- c(1, 0.5, 0, 0.1)
v2 <- c(0.7, 1, 0.2, 0.1)
Is there an R package that implements a range of well-known similarity (distance) measures, for example "Resnik", "Lin", "Rel", "Jiang", ...?
Implementing these methods myself would not be hard, but I expect some R package already provides them.
After some googling I found the package "GOSemSim", which contains most of these measures, but it is specific to biomedical applications, so I can't use it to compute the similarity between two arbitrary vectors.
"proxy" is a general library for distance and similarity measures. The following methods are supported:
"Jaccard" "Kulczynski1" "Kulczynski2" "Mountford" "Fager" "Russel" "simple matching" "Hamman" "Faith"
"Tanimoto" "Dice" "Phi" "Stiles" "Michael" "Mozley" "Yule" "Yule2" "Ochiai"
"Simpson" "Braun-Blanquet" "cosine" "eJaccard" "fJaccard" "correlation" "Chi-squared" "Phi-squared" "Tschuprow"
"Cramer" "Pearson" "Gower" "Euclidean" "Mahalanobis" "Bhjattacharyya" "Manhattan" "supremum" "Minkowski"
"Canberra" "Wave" "divergence" "Kullback" "Bray" "Soergel" "Levenshtein" "Podani" "Chord"
"Geodesic" "Whittaker" "Hellinger"
Check the following example:
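library(proxy)  # provides simil() and dist()
x <- c(1, 2, 3, 4, 5)
y <- c(4, 5, 6, 7, 8)
l <- list(x, y)  # each list element is treated as one observation
simil(l, method = "cosine")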
The output is a similarity matrix between the elements of "l":
         1
2 0.978232
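As a sanity check, the cosine value can be reproduced in base R without any package. This is only a minimal sketch: the helper name cosine_sim is made up for illustration and is not part of "proxy". It also applies the same formula to the v1 and v2 vectors from the question.

```r
# Cosine similarity written out by hand (hypothetical helper, base R only):
# <a, b> / (|a| * |b|)
cosine_sim <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

x <- c(1, 2, 3, 4, 5)
y <- c(4, 5, 6, 7, 8)
cosine_sim(x, y)  # about 0.978232, the value reported by simil() above

# The vectors from the question
v1 <- c(1, 0.5, 0, 0.1)
v2 <- c(0.7, 1, 0.2, 0.1)
cosine_sim(v1, v2)        # cosine similarity
sqrt(sum((v1 - v2)^2))    # Euclidean distance, for comparison
```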
The only problem I have is that some methods (such as "Jaccard") fail with the following error:
simil(l, method="Jaccard")
Error in n - d : 'n' is missing
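"Jaccard" is a measure for binary (presence/absence) data, which may be related to why it fails on a list of numeric vectors; I have not confirmed the exact cause. Until that is sorted out, the measure is easy to write by hand. The sketch below is base R only, and both function names are hypothetical helpers, not "proxy" functions. The second one is the real-valued (extended, or Tanimoto) form, which I assume is what "eJaccard" in the list above refers to.

```r
# Binary Jaccard: treat every non-zero entry as "present".
# |A and B| / |A or B|
jaccard_binary <- function(a, b) {
  a <- a != 0
  b <- b != 0
  sum(a & b) / sum(a | b)
}

# Extended (real-valued) Jaccard, also known as Tanimoto:
# <a, b> / (|a|^2 + |b|^2 - <a, b>)
jaccard_extended <- function(a, b) {
  ab <- sum(a * b)
  ab / (sum(a^2) + sum(b^2) - ab)
}

v1 <- c(1, 0.5, 0, 0.1)
v2 <- c(0.7, 1, 0.2, 0.1)
jaccard_binary(v1, v2)    # 3 shared non-zero positions out of 4, so 0.75
jaccard_extended(v1, v2)  # 1.21 / 1.59
```

Both helpers return NaN when the two vectors are entirely zero, so guard for that case if it can occur in your data.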