Free Tagged Corpus for Named Entity Recognition

DantheMan · Jul 25, 2010

I am looking for a free tagged corpus to train a Named Entity Recognition system on. Most of the ones I find (like the New York Times one) are expensive and not open. Can anyone help?

Answer

Tom Morris · Jul 12, 2012

There's a list of corpora at http://www.cs.technion.ac.il/~gabr/resources/data/ne_datasets.html

The CoNLL 2003 corpus, which is on that list, is free and is available from http://www.cnts.ua.ac.be/conll2003/ner/ (annotations) and NIST (text).
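Once you have the files, reading them is straightforward. Here is a minimal sketch, assuming the usual four-column CoNLL 2003 layout (token, POS tag, chunk tag, entity tag, separated by whitespace), blank lines between sentences, and `-DOCSTART-` lines marking document boundaries. The function name `read_conll` and the `SAMPLE` text are made up for illustration and are not taken from the corpus distribution, so check the layout against the files you actually download.

```python
from typing import Iterable, Iterator, List, Tuple

# One sentence is a list of (token, entity_tag) pairs.
Sentence = List[Tuple[str, str]]


def read_conll(lines: Iterable[str]) -> Iterator[Sentence]:
    """Yield sentences from CoNLL-style lines (hypothetical helper).

    Assumes the token is the first column and the entity tag is the last.
    """
    sentence: Sentence = []
    for raw in lines:
        line = raw.strip()
        # A blank line or a document marker ends the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if sentence:
                yield sentence
                sentence = []
            continue
        fields = line.split()
        sentence.append((fields[0], fields[-1]))
    # Flush the last sentence if the input has no trailing blank line.
    if sentence:
        yield sentence


# Illustrative sample in the assumed format, not copied from the corpus.
SAMPLE = """\
-DOCSTART- -X- O O

EU NNP I-NP I-ORG
rejects VBZ I-VP O
German JJ I-NP I-MISC
call NN I-NP O

Peter NNP I-NP I-PER
Blackburn NNP I-NP I-PER
"""

sentences = list(read_conll(SAMPLE.splitlines()))
for sent in sentences:
    print(sent)
```

For a real file you would pass an open file handle instead of `SAMPLE.splitlines()`, since the function only needs an iterable of lines.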