Training your own model in OpenNLP

user1482228 · Jun 26, 2012 · Viewed 18k times

I am finding it difficult to create my own OpenNLP model. Can anyone tell me how to build my own model and how the training should be done?

What should the input be, and where will the output model file be stored?

Answer

andrew.butkus · Oct 28, 2013

https://opennlp.apache.org/docs/1.5.3/manual/opennlp.html

This manual is very useful. It shows, both in code and with the OpenNLP command-line application, how to train models of all the different types, such as entity extraction and part-of-speech tagging.

I could give you some code examples here, but the page is very clear.

Theory-wise:

Essentially, you create a file that lists the examples you want to train on, e.g.:

Sport [whitespace] this is a page about football, rugby and stuff

Politics [whitespace] this is a page about Tony Blair being prime minister.
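As a rough sketch of producing such a file, the snippet below writes one sample per line: a category name, a single space, then the text. It uses only the Java standard library. The class name `DoccatTrainingFile`, the helper `toTrainingLine` and the file name `en-doccat.train` are made up for illustration, and it assumes the category-then-text layout shown above; other model types expect other formats.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class DoccatTrainingFile {

    // Builds one training line: "<category> <text>".
    // The category must be a single token, and the text is squeezed onto one
    // line so that each sample occupies exactly one line of the file.
    static String toTrainingLine(String category, String text) {
        boolean hasWhitespace = category.chars().anyMatch(Character::isWhitespace);
        if (category.isEmpty() || hasWhitespace) {
            throw new IllegalArgumentException("category must be one non-empty token");
        }
        String singleLine = text.trim().replaceAll("\\s+", " ");
        return category + " " + singleLine;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = new ArrayList<>();
        lines.add(toTrainingLine("Sport", "this is a page about football, rugby and stuff"));
        lines.add(toTrainingLine("Politics", "this is a page about Tony Blair being prime minister."));

        // Hypothetical output name; any path works.
        Path out = Paths.get("en-doccat.train");
        Files.write(out, lines, StandardCharsets.UTF_8);
        System.out.println("Wrote " + lines.size() + " samples to " + out.toAbsolutePath());
    }
}
```

Two lines are only enough to show the layout. A usable model needs many samples per category.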

The format is described on the page above (each model type expects a different format). Once you have created this file, you run it through either the API or the OpenNLP application (via the command line), and it generates a .bin file. Once you have this .bin file, you can load it into a model and start using it (as per the API described in the manual above).
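A minimal sketch of the API route for the document categorizer, assuming the OpenNLP 1.5.3 jar from the linked manual is on the classpath. Later releases changed some of these signatures (for example how `PlainTextByLineStream` is constructed and what `train` and `categorize` accept), so check the manual for your version. The file names `en-doccat.train` and `en-doccat.bin` and the class name `TrainDoccat` are placeholders.

```java
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import opennlp.tools.doccat.DoccatModel;
import opennlp.tools.doccat.DocumentCategorizerME;
import opennlp.tools.doccat.DocumentSample;
import opennlp.tools.doccat.DocumentSampleStream;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;

public class TrainDoccat {
    public static void main(String[] args) throws IOException {
        // Placeholder paths: the input training file and the output model file.
        String trainingFile = "en-doccat.train";
        String modelFile = "en-doccat.bin";

        // 1. Train: read one sample per line and build a model in memory.
        DoccatModel model;
        try (InputStream dataIn = new FileInputStream(trainingFile)) {
            ObjectStream<String> lines = new PlainTextByLineStream(dataIn, "UTF-8");
            ObjectStream<DocumentSample> samples = new DocumentSampleStream(lines);
            model = DocumentCategorizerME.train("en", samples);
        }

        // 2. Save: the .bin file is written wherever this stream points.
        try (OutputStream modelOut = new BufferedOutputStream(new FileOutputStream(modelFile))) {
            model.serialize(modelOut);
        }

        // 3. Load the .bin file back and classify a new piece of text.
        DoccatModel loaded;
        try (InputStream modelIn = new FileInputStream(modelFile)) {
            loaded = new DoccatModel(modelIn);
        }
        DocumentCategorizerME categorizer = new DocumentCategorizerME(loaded);
        double[] outcomes = categorizer.categorize("a page about football and rugby");
        System.out.println(categorizer.getBestCategory(outcomes));
    }
}
```

So the input is the training file you prepared, and the model ends up at whatever path you hand to the output stream. Nothing is written to a default location.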