Is/are there existing C++ NLP API(s) out there? The closest thing I have found is CLucene
, a port of Lucene
. However, it seems a bit obsolete and the documentation is far from complete.
Ideally, this/these API(s) would permit tokenization, stemming and PoS tagging.
Freeling is written in C++ too, although most people just use their binaries to run the tools: http://devel.cpl.upc.edu/freeling/downloads?order=time&desc=1
Try something like DyNet, it's a generic neural net framework but most of its processes are focusing on NLP because the maintainers are creators of the NLP community.
Or perhaps Marian-NMT, it was designed for sequence-to-sequence model machine translation but potentially many NLP tasks can be structured as a sequence-to-sequence task.
Maybe you can try Ellogon http://www.ellogon.org/ , they have GUI support and also C/C++ API for NLP too.