I need to use WordNet in a Java-based app. I want to:
search synsets
find similarity/relatedness between synsets
My app uses RDF graphs, and I know there are SPARQL endpoints that expose WordNet, but I think a local copy of the dataset would be better, since it's not too big.
I've found the following jars:
What would you recommend for my app?
Is it possible to use a Perl library from a Java app via some bindings?
Thanks! Mulone
I use JAWS for general WordNet tasks because it's easy to use. For similarity metrics, though, I use the library located here. For it to work, you'll also need to download this folder, which contains pre-processed WordNet and corpus data. Assuming you placed that folder inside another folder called "lib" in your project folder, the code can be used like this:
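import java.util.Map.Entry;
import java.util.TreeMap;
// "./lib" is the folder holding the WordNet and corpus data; "3.0" is the WordNet version
JWS ws = new JWS("./lib", "3.0");
Resnik res = ws.getResnik();
// word1, word2 and partOfSpeech are Strings, e.g. "hobby", "gardening" and "n"
TreeMap<String, Double> scores1 = res.res(word1, word2, partOfSpeech);
for (Entry<String, Double> e : scores1.entrySet())
    System.out.println(e.getKey() + "\t" + e.getValue());
System.out.println("\nhighest score\t=\t" + res.max(word1, word2, partOfSpeech) + "\n\n\n");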
This will print something like the following, showing the similarity score between each possible combination of synsets represented by the words to be compared:
hobby#n#1,gardening#n#1 2.6043996588901104
hobby#n#2,gardening#n#1 -0.0
hobby#n#3,gardening#n#1 -0.0
highest score = 2.6043996588901104
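If you want the best-matching sense pair itself rather than only the top score, you can walk the returned map yourself. The following is a minimal sketch, not part of the library: the class name BestSensePair and its two helper methods are made up for illustration, the map is filled by hand with the values from the sample output above instead of calling the similarity library, and it assumes every key has the form word#pos#sense,word#pos#sense as in that output.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class (not part of the similarity library).
public class BestSensePair {

    /** Returns the entry with the largest score, or null if the map is empty. */
    public static Map.Entry<String, Double> best(Map<String, Double> scores) {
        Map.Entry<String, Double> best = null;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            if (best == null || e.getValue() > best.getValue()) {
                best = e;
            }
        }
        return best;
    }

    /**
     * Splits a key such as "hobby#n#1,gardening#n#1" into one
     * {word, pos, sense} array per side. Assumes the key format
     * seen in the sample output; other formats are not handled.
     */
    public static String[][] parseKey(String key) {
        String[] sides = key.split(",");
        String[][] parsed = new String[sides.length][];
        for (int i = 0; i < sides.length; i++) {
            parsed[i] = sides[i].split("#");
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Stand-in for the map returned by res.res(word1, word2, partOfSpeech);
        // the values are copied from the sample output.
        TreeMap<String, Double> scores = new TreeMap<>();
        scores.put("hobby#n#1,gardening#n#1", 2.6043996588901104);
        scores.put("hobby#n#2,gardening#n#1", -0.0);
        scores.put("hobby#n#3,gardening#n#1", -0.0);

        Map.Entry<String, Double> top = best(scores);
        String[][] senses = parseKey(top.getKey());
        System.out.println(senses[0][0] + " sense " + senses[0][2] + " / "
                + senses[1][0] + " sense " + senses[1][2] + " -> " + top.getValue());
    }
}
```

In a real app you would pass the map returned by res.res(...) to best(...) instead of building it by hand.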
There are also methods that let you specify which sense of either or both words to compare: res(String word1, int senseNum1, String word2, partOfSpeech), etc. Unfortunately, the source documentation is not JavaDoc, so you'll need to inspect it manually. The source can be downloaded here.
The available algorithms are:
JWSRandom random = new JWSRandom(ws.getDictionary(), true, 16.0); // random number for baseline
Resnik res = ws.getResnik();
LeacockAndChodorow lch = ws.getLeacockAndChodorow();
AdaptedLesk adLesk = ws.getAdaptedLesk();
AdaptedLeskTanimoto alt = ws.getAdaptedLeskTanimoto();
AdaptedLeskTanimotoNoHyponyms altnh = ws.getAdaptedLeskTanimotoNoHyponyms();
HirstAndStOnge hso = ws.getHirstAndStOnge();
JiangAndConrath jcn = ws.getJiangAndConrath();
Lin lin = ws.getLin();
WuAndPalmer wup = ws.getWuAndPalmer();
It also requires the jar file for MIT's JWI to be on your classpath.