Data sets for emotion detection in text

ekka · Jun 8, 2015 · Viewed 22.3k times

I'm implementing a system that detects human emotion in text. Are there any manually annotated data sets available for supervised learning and testing?

Here are some interesting datasets: https://dataturks.com/projects/trending

Answer

buechel · Jan 18, 2016

The field of textual emotion detection is still very new, and the literature is fragmented across many journals in different fields. It's really hard to get a good overview of what's out there.

Note that there are several emotion theories in psychology. Hence there are different ways of modeling and representing emotions in computing. Most of the time, "emotion" refers to a phenomenon such as anger, fear or joy. Other theories state that all emotions can be represented as points in a multi-dimensional space (so there is an infinite number of them).
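To make the difference concrete, here is a minimal Python sketch of the two representation styles. All names (`BasicEmotion`, `CategoricalAnnotation`, `DimensionalAnnotation`) are hypothetical, the three-label set is only an illustration, and the example scores are invented; the Valence/Arousal/Dominance axes are just one common choice for a dimensional model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class BasicEmotion(Enum):
    """Illustrative label set only; real category schemes differ by theory."""
    ANGER = "anger"
    FEAR = "fear"
    JOY = "joy"


@dataclass(frozen=True)
class CategoricalAnnotation:
    """Categorical view: each text gets one label from a finite set."""
    text: str
    label: BasicEmotion


@dataclass(frozen=True)
class DimensionalAnnotation:
    """Dimensional view: each text is a point in a continuous space."""
    text: str
    valence: float    # unpleasant .. pleasant
    arousal: float    # calm .. excited
    dominance: float  # controlled .. in control

    def as_vector(self) -> Tuple[float, float, float]:
        return (self.valence, self.arousal, self.dominance)


sentence = "I can't believe they cancelled it!"

# The scores below are made up for illustration, not taken from any data set.
categorical = CategoricalAnnotation(sentence, BasicEmotion.ANGER)
dimensional = DimensionalAnnotation(sentence, valence=1.8, arousal=4.1, dominance=3.2)

print(categorical.label.value)
print(dimensional.as_vector())
```

In practice, the first style leads to a classification problem and the second to a regression problem, so it is worth deciding which one you need before picking a data set.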

Here are some (publicly available) data sets I know of (updated):

  1. EmoBank. 10k sentences annotated with Valence, Arousal and Dominance values (disclosure: I am one of the authors). https://github.com/JULIELab/EmoBank

  2. The "Emotion Intensity in Tweets" data set from the WASSA 2017 shared task. http://saifmohammad.com/WebPages/EmotionIntensity-SharedTask.html

  3. The Valence and Arousal Facebook Posts by Preotiuc-Pietro and others: http://wwbp.org/downloads/public_data/dataset-fb-valence-arousal-anon.csv

  4. The Affect data by Cecilia Ovesdotter Alm: http://people.rc.rit.edu/~coagla/affectdata/index.html

  5. The Emotion in Text data set by CrowdFlower: https://www.crowdflower.com/wp-content/uploads/2016/07/text_emotion.csv

  6. ISEAR: http://emotion-research.net/toolbox/toolboxdatabase.2006-10-13.2581092615

  7. Test Corpus of SemEval 2007 (Task on Affective Text): http://web.eecs.umich.edu/~mihalcea/downloads.html

  8. A reannotation of the SemEval Stance data with emotions: http://www.ims.uni-stuttgart.de/data/ssec
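Since the question is about supervised learning and testing, here is a rough Python sketch of how a CSV-style data set with dimensional scores might be read and split into training and test portions. The column names (`text`, `V`, `A`, `D`), the sample rows, and the helper names `load_rows` and `split_rows` are assumptions for illustration only; check the header of whichever file you actually download, and prefer an official split if the data set ships with one.

```python
import csv
import io
import random
from typing import Dict, List, Tuple

# Stand-in for a downloaded file; rows and scores are invented for illustration.
SAMPLE_CSV = """text,V,A,D
"What a wonderful day!",4.2,3.6,3.4
"I am terrified of the exam.",1.7,4.0,2.1
"The meeting is at noon.",3.0,2.4,3.0
"They ruined everything.",1.5,3.9,2.8
"We finally won the match!",4.5,4.3,3.9
"""


def load_rows(handle) -> List[Dict[str, object]]:
    """Read rows from a CSV handle, assuming columns text, V, A and D."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            "text": record["text"],
            "target": (float(record["V"]), float(record["A"]), float(record["D"])),
        })
    return rows


def split_rows(rows: List[dict], test_fraction: float = 0.2,
               seed: int = 0) -> Tuple[List[dict], List[dict]]:
    """Shuffle reproducibly and hold out a fraction of the rows for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = load_rows(io.StringIO(SAMPLE_CSV))
train, test = split_rows(rows, test_fraction=0.2, seed=0)
print(len(train), "training rows,", len(test), "test rows")
```

With a real file, replace `io.StringIO(SAMPLE_CSV)` with something like `open(path, newline="", encoding="utf-8")`, where `path` points to your local copy.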

If you want to go deeper into the topic, here are some surveys I recommend (disclosure: I authored the first one).

  1. Buechel, S., & Hahn, U. (2016). Emotion Analysis as a Regression Problem — Dimensional Models and Their Implications on Emotion Representation and Metrical Evaluation. In ECAI 2016: 22nd European Conference on Artificial Intelligence (pp. 1114–1122). The Hague, Netherlands (available: http://ebooks.iospress.nl/volumearticle/44864).

  2. Canales, L., & Martínez-Barco, P. (2014). Emotion Detection from text: A Survey. In Proceedings of the Workshop on Natural Language Processing in the 5th Information Systems Research Working Days (JISIC 2014), 37 (available: http://www.aclweb.org/anthology/W14-6905).