Sentiment Analysis and Emotion Detection: commonalities and differences
At first glance, especially for those without a scientific background, Sentiment Analysis (SA) and Emotion Detection (ED) might seem to be the same concept. They are not synonymous, so what is the difference between sentiment analysis and emotion detection?
University of Pisa, technical partner of the Me-Mind project, offers an in-depth article and practical tips to avoid misunderstandings about these techniques, which can be used by organisations that collect insights by distributing questionnaires to their audience. For this reason, the article includes the activities performed at Internet Festival 2021 to better understand the audience and thus make data-driven decisions.
What is sentiment analysis? And what is emotion detection?
Sentiment Analysis aims at obtaining meaningful information and semantics from a text by using natural language processing techniques to determine the writer’s attitude, whether positive, negative, or neutral. In other words, sentiment analysis determines a text’s polarity and classifies it as positive or negative. Classes can also include neutral/objective and can be further broken down into finer scales, e.g. a 5-point scale: strongly disagree, disagree, neutral, agree, or strongly agree.
Similarly to SA, Emotion Detection, also known as Emotion Recognition (ER), aims at identifying the emotions expressed in texts, e.g. joy, anger, and sadness. Beyond the so-called primary psychological conditions, i.e. happiness, sadness, and anger, ED also allows for scales of six or eight emotions, depending on the psychological theory and emotion model adopted. Emotion models and theories are broadly classified into two categories, namely dimensional and categorical. Dimensional emotion models represent emotions based on valence, arousal, and dominance features, where valence means polarity, arousal relates to the excitation of a feeling, and dominance refers to the degree of control over the emotion. Categorical models, in contrast, define emotions as discrete classes such as joy, anger, or sadness.
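To make the dimensional view more concrete, the sketch below represents an emotion as a point on the valence, arousal, and dominance axes and derives a three-class polarity label from valence alone. The class name `VAD`, the function `polarity_label`, the [-1, 1] value range, and the `neutral_band` threshold are all illustrative assumptions and do not come from any specific emotion model or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VAD:
    """A point in a dimensional emotion model (axes assumed to lie in [-1, 1])."""
    valence: float    # polarity: negative to positive
    arousal: float    # excitation of the feeling: calm to excited
    dominance: float  # degree of control over the emotion


def polarity_label(valence: float, neutral_band: float = 0.1) -> str:
    """Map a valence score to a three-class sentiment label.

    neutral_band is an illustrative threshold, not a value from the article.
    """
    if valence > neutral_band:
        return "positive"
    if valence < -neutral_band:
        return "negative"
    return "neutral"


# Hypothetical coordinates, chosen only to show the call.
example = VAD(valence=0.6, arousal=0.7, dominance=0.2)
label = polarity_label(example.valence)
```

Sentiment analysis, in this picture, looks only at the valence axis, while emotion detection also needs the remaining information (or a categorical label) to tell, say, anger from sadness.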
How can sentiment analysis and emotion detection be applied to questionnaires?
Sentiment analysis, which recognises polarity in texts, can be used to assess whether the audience and stakeholders have a negative, positive, or neutral attitude toward the event and specific aspects of it, e.g. its impact. Emotion detection, which determines an individual’s emotional state, allows a deeper study of the specific emotions aroused by events.
In general, two main strategies can be employed to perform SA and ED tasks on questionnaires.
The first strategy is for questionnaires to include so-called “open questions”, which yield (long) texts to which SA and ED techniques can be applied. This strategy lets users express their opinions and feelings completely freely. On the other hand, it raises several issues that must be considered to obtain meaningful results and enhance the extracted knowledge: the most crucial concerns relate to the number and quality of the answers provided by users. For instance, even when most of the audience answers, part of the answers could carry little or no meaning, while a small percentage of respondents skips open questions completely. Even making questions mandatory might not produce meaningful answers because of the audience’s motivation: respondents are often unwilling to spend much time or attention completing questionnaires, and the risk of receiving hasty and meaningless answers is real.
The second strategy is to provide predefined answers to ensure their meaningfulness. Unlike the first strategy, the second one does not allow completely free texts, which may affect the results. An example is capturing the impact of, and opinion on, the event and its activities by having respondents choose terms from a predefined set, e.g. adjectives tagged with a polarity score and a primary emotion. The answers will necessarily be meaningful, although the provided terms might not perfectly represent the respondents’ real opinions. Allowing for an “escape” solution, e.g. the possibility to answer with free text, could help to include all opinions, but at the risk of introducing noise: we cannot ensure that the term given is meaningful, nor that it meets the requirements, e.g. being an adjective.
Both strategies can handle SA and ED with minor changes. When dealing with long free texts, a wide range of methods can be employed to perform sentiment and emotion-related analysis. These are typically grouped into three types: lexicon-based, machine learning-based, and deep learning-based, each with its own benefits and drawbacks. In addition to the models described in the literature, several free and ready-to-use libraries are available today.
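As a rough illustration of the simplest of these families, the lexicon-based approach, the sketch below sums the polarity scores of the words found in a text. The lexicon is a toy one: its entries loosely follow the illustrative scores used in the practical example later in this article and are not an export of SenticNet, VADER, or any other real resource. The names `TOY_LEXICON` and `lexicon_polarity` are hypothetical.

```python
import re
from typing import Mapping

# Toy polarity lexicon (assumption): word -> score in [-1, 1].
TOY_LEXICON = {
    "satisfied": 1.0,
    "amazed": 1.0,
    "dissatisfied": -1.0,
    "boring": -1.0,
}


def lexicon_polarity(text: str, lexicon: Mapping[str, float] = TOY_LEXICON) -> str:
    """Classify a text by summing the scores of the lexicon words it contains."""
    tokens = re.findall(r"\w+", text.lower())
    score = sum(lexicon.get(token, 0.0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A scorer this simple shows the typical drawbacks of the family: words missing from the lexicon contribute nothing, and negation is ignored, so "not boring" would still be scored as negative. Machine learning and deep learning models address these limits at the cost of needing training data.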
What we have done so far at Internet Festival
During the preliminary analysis of Me-Mind, we employed FEEL-IT, an open-source Python library that provides trained models to infer the sentiment and emotions of Italian texts. The models return a sentiment tag (positive or negative) and a label for one of four basic emotions (anger, fear, joy, and sadness).
The library returns a result even for empty and non-Italian fields (e.g. English), although its SA and ED models are pre-trained only for the Italian language. Thus, an evaluation phase or controls (rules) could be necessary to improve data reliability. For example, empty texts may be excluded a priori, or their results may be disregarded. The same applies to English texts. However, to avoid a loss of information, a) another library that supports English can be used; b) human annotators can further check the results of FEEL-IT (for small datasets); c) the results of multiple libraries, including FEEL-IT, can be cross-checked.
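Two of the controls mentioned above, excluding empty texts and cross-checking the results of more than one library, can be sketched as follows. This is a minimal illustration and not the pipeline used in the Me-Mind analysis: the function `classify_with_controls` and its return format are hypothetical, and the classifiers are passed in as plain callables so that the example does not depend on any particular library.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# A classifier takes a list of texts and returns one label per text.
Classifier = Callable[[List[str]], List[str]]


def classify_with_controls(
    texts: Iterable[Optional[str]],
    primary: Classifier,
    secondary: Optional[Classifier] = None,
) -> List[Tuple[str, str, bool]]:
    """Return (text, label, needs_review) for every non-empty answer.

    Empty or whitespace-only answers are excluded before classification.
    When a secondary classifier is given, answers on which the two
    classifiers disagree are flagged for human review.
    """
    cleaned = [t.strip() for t in texts if t and t.strip()]
    if not cleaned:
        return []
    labels = primary(cleaned)
    if secondary is None:
        return [(text, label, False) for text, label in zip(cleaned, labels)]
    other_labels = secondary(cleaned)
    return [
        (text, label, label != other)
        for text, label, other in zip(cleaned, labels, other_labels)
    ]
```

With FEEL-IT installed, `primary` could be the `predict` method of a `SentimentClassifier` instance from the `feel_it` package, which, as far as its README describes, takes a list of strings and returns a list of labels; that interface should be checked against the installed version. Detecting non-Italian answers would need an additional language-identification step, which is left out here.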
What we can experiment with at Internet Festival in the future
A set of predefined questions may be added to the open ones to bypass the limits observed during the analysis. However, a controlled environment requires pre-identifying a set of terms, which have to be tagged with polarity scores or emotions.
These sets can contain single terms, forming a proper bag-of-words, or whole sentences. In this context, the most crucial concern relates to the choice of the dictionaries to be adopted. In particular, concerns relate to lexicon availability, scales, and language coverage, i.e. the number of languages supported and how up to date the linguistic register is. The most popular lexicons for dealing with SA include SentiWordNet*, SenticNet, and Valence Aware Dictionary and Sentiment Reasoner (VADER). Further, WordNet-Affect and the NRC lexicon are categorical lexicons providing emotion labels**.
The example below shows how a multiple-choice question can be structured for SA and ED tasks. Each response includes a key term, given in parentheses with its tags, that refers to a specific feeling (positive vs negative) or emotion (anger, anticipation, disgust, fear, joy, sadness, surprise, and trust) in a lexicon (e.g. SenticNet and NRC).
[* The dataset is licensed under a Creative Commons Attribution 3.0 Unported License]
[** In particular, each NRC tag is composed of one emotion and two sentiments. However, these traditional lexicons ignore the intensity of emotions.]
A practical example
Q: Are you satisfied with the impact of the Internet Festival on your business?
Examples of SA-related answers:
- I am generally satisfied/happy (satisfaction: 1 [SenticNet], negative 0, positive 1 [NRC])
- I feel dissatisfied (dissatisfied: -1 [SenticNet], negative 1, positive 0 [NRC])
- I think the event was boring (boring: -1 [SenticNet], negative 1, positive 0 [NRC])
- I was pleasantly amazed (amaze: 1 [SenticNet], negative 0, positive 1 [NRC])
Examples of ED-related answers:
- I’m generally unhappy (unhappy: anger 1, anticipation 0, disgust 1, fear 0, joy 0, sadness 1, surprise 0, trust 0 [NRC])
- I’m generally happy (happy: anger 0, anticipation 1, disgust 0, fear 0, joy 1, sadness 0, surprise 0, trust 1 [NRC])
- The events amazed me (amaze: anger 0, anticipation 0, disgust 0, fear 0, joy 0, sadness 0, surprise 1, trust 0 [NRC])
- I think the events were engaging (engaging: anger 0, anticipation 0, disgust 0, fear 0, joy 1, sadness 0, surprise 0, trust 1 [NRC])
- I found the events very dull (dull: anger 0, anticipation 0, disgust 0, fear 0, joy 0, sadness 1, surprise 0, trust 0 [NRC])
- I’m generally bored by the event (boring: anger 0, anticipation 0, disgust 0, fear 0, joy 0, sadness 0, surprise 0, trust 0 [NRC])
- I am disappointed because I think I have wasted time. (disappointed: anger 1, anticipation 0, disgust 1, fear 0, joy 0, sadness 1, surprise 0, trust 0 [NRC])
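Once respondents have picked their answers, the tags attached to each term can be aggregated into an overall emotion profile. The sketch below does this for the ED-related answers, using the NRC-style tags exactly as listed in the example above. The names `ANSWER_EMOTIONS` and `emotion_profile` are hypothetical, and the counting scheme (one point per associated emotion per selected answer) is an assumption, not a method described in this article.

```python
from collections import Counter
from typing import Iterable

# Emotions tagged with 1 for each key term in the example answers above.
ANSWER_EMOTIONS = {
    "unhappy": {"anger", "disgust", "sadness"},
    "happy": {"anticipation", "joy", "trust"},
    "amaze": {"surprise"},
    "engaging": {"joy", "trust"},
    "dull": {"sadness"},
    "boring": set(),  # all eight emotion tags are 0 in the example
    "disappointed": {"anger", "disgust", "sadness"},
}


def emotion_profile(selected_terms: Iterable[str]) -> Counter:
    """Count how often each emotion is associated with the selected answers."""
    counts: Counter = Counter()
    for term in selected_terms:
        # Terms outside the predefined set contribute nothing.
        counts.update(ANSWER_EMOTIONS.get(term, ()))
    return counts


# Three respondents chose "happy", "engaging", and "dull" respectively.
profile = emotion_profile(["happy", "engaging", "dull"])
```

Note that a term such as "boring", which carries no emotion tag in the example, leaves the profile unchanged. This is one reason to check how well a lexicon covers the chosen terms before fixing the answer set.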
Lessons learned
As is well known, both sentiment analysis and emotion detection can provide valuable input for understanding and interpreting feedback from the audience and stakeholders. Moreover, opinions and feelings can be used to improve and enhance the event’s offer and its activities.
The study performed on the audience during Internet Festival 2021 through questionnaires shows that a variety of valuable information can be obtained by applying sentiment analysis and emotion detection techniques. Moreover, we can be confident that the opinions provided are spontaneous since:
- the answers are not mandatory
- and the texts are free.
_____________________________________
References
- Nandwani, Pansy, and Rupali Verma. “A review on sentiment analysis and emotion detection from text.” Social network analysis and mining vol. 11,1 (2021): 81. doi:10.1007/s13278-021-00776-6
- Ye, Qiang, Ziqiong Zhang, and Rob Law. “Sentiment classification of online reviews to travel destinations by supervised machine learning approaches.” Expert systems with applications 36.3 (2009): 6527-6535.
- Gräbner D, Zanker M, Fliedl G, Fuchs M, et al. (2012) Classification of customer reviews based on sentiment analysis. In: ENTER, Citeseer, pp 460–470
- Prabowo, Rudy, and Mike Thelwall. “Sentiment analysis: A combined approach.” Journal of Informetrics 3.2 (2009): 143-157.
- Onyenwe, Ikechukwu, et al. “The impact of political party/candidate on the election results from a sentiment analysis perspective using #AnambraDecides2017 tweets.” Social Network Analysis and Mining 10.1 (2020): 1-17.
- Bakker I, Van Der Voordt T, Vink P, De Boon J. Pleasure, arousal, dominance: Mehrabian and Russell revisited. Curr Psychol. 2014;33(3):405–421. doi: 10.1007/s12144-014-9219-4.
- Bianchi, Federico, Debora Nozza, and Dirk Hovy. “Feel-it: Emotion and sentiment classification for the Italian language.” Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. 2021.
- Strapparava C, Valitutti A, et al. (2004) WordNet-Affect: an affective extension of WordNet. In: LREC, Citeseer, vol 4, pp 1083–1086
- Esuli A, Sebastiani F. SentiWordNet: a publicly available lexical resource for opinion mining. LREC, Citeseer. 2006;6:417–422.
- Hutto C, Gilbert E (2014) VADER: a parsimonious rule-based model for sentiment analysis of social media text. In: Proceedings of the international AAAI conference on web and social media, vol 8
- Mohammad SM, Turney PD. Crowdsourcing a word-emotion association lexicon. Comput Intell. 2013;29(3):436–465. doi: 10.1111/j.1467-8640.2012.00460.x
- Cambria, E., Speer, R., Havasi, C., Hussain, A. 2010. SenticNet: A publicly available semantic resource for opinion mining. In Proceedings of AAAI CSK, pp. 14-18, Arlington.