Python Arabic NLP

Bassem · Sep 12, 2011

I'm assessing NLTK's ability to process Arabic text for a research project on analyzing and extracting sentiment.

My questions are as follows:

  1. Is NLTK capable of handling and analyzing Arabic text?
  2. Is Python capable of manipulating and tokenizing Arabic text?
  3. Will I be able to parse and store Arabic text using Python?

If Python and NLTK aren't the right tools for this job, what tools would you recommend (if any exist)?

Thank you.


Based on research:

  1. NLTK's Arabic-specific support is limited to stemming: Link
  2. Python can handle Arabic text because it supports Unicode, including the UTF-8 encoding: Link
  3. Parsing and lemmatization of Arabic text can be done with the Stanford Natural Language Processing Group's statistical parser: Link
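
To illustrate point 2, here is a minimal sketch using only the Python 3 standard library, so it says nothing about what NLTK itself provides. The helper names (`normalize`, `tokenize`, `round_trip`), the sample sentence, and the choice to strip diacritics before tokenizing are my own assumptions for illustration, not a recommended pipeline for sentiment analysis.

```python
import re
import tempfile
import unicodedata
from pathlib import Path

# Arabic diacritics (tashkeel, U+064B..U+0652) and the tatweel (U+0640).
# Stripping them is an illustrative choice; some tasks need them kept.
_DIACRITICS = re.compile(r"[\u0640\u064B-\u0652]")

# In Python 3, \w is Unicode-aware for str patterns, so it matches Arabic
# letters. It does not match combining marks, which is one reason the
# diacritics are removed before tokenizing.
_TOKEN = re.compile(r"\w+")


def normalize(text: str) -> str:
    """Apply NFC normalization, then drop diacritics and tatweel."""
    return _DIACRITICS.sub("", unicodedata.normalize("NFC", text))


def tokenize(text: str) -> list[str]:
    """Split normalized text into runs of word characters."""
    return _TOKEN.findall(normalize(text))


def round_trip(text: str, path: Path) -> str:
    """Write text to disk as UTF-8 and read it back."""
    path.write_text(text, encoding="utf-8")
    return path.read_text(encoding="utf-8")


sample = "مرحبا بالعالم، هذا نص عربي."
tokens = tokenize(sample)

with tempfile.TemporaryDirectory() as tmp:
    restored = round_trip(sample, Path(tmp) / "sample.txt")
```

Passing `encoding="utf-8"` explicitly avoids depending on the platform's default encoding, which is where most problems with storing Arabic text tend to come from. A regex tokenizer like this is only a starting point, since it does not separate clitics such as the conjunction or the definite article from the word they attach to.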
