How To Calculate Sentiment Scores - A Layman’s Guide
A sentiment score measures the emotion reflected in a piece of text. Quantifying the sentiment in customer reviews or feedback in terms of positive and negative polarity helps us learn how customers feel. Brands use this information for business intelligence: they look for patterns in data across the customer-facing functions of the business and use them to make data-backed growth decisions. We have created this easy-to-understand layman’s guide showing how to calculate sentiment scores so that brands like yours can have a broad view of what goes on under the hood of a sentiment analysis engine.
What is Customer Sentiment Analysis?
Customer sentiment analysis is the processing of customer reviews and feedback for emotion mining, to find out what customers think about a brand or a particular product or service. Data for this kind of sentiment analysis can come from voice of the customer sources such as surveys, call centre logs, video interviews, user-generated videos, product reviews, and the like.
AI-based sentiment analysis platforms use machine learning tasks such as natural language processing (NLP) and named entity recognition (NER) to give an in-depth analysis of customer sentiments. NLP processes the text (for videos, audio is first transcribed into text) through part-of-speech tagging, lemmatization, prior-polarity calculation, negation handling, and semantic clustering. After this, a sentiment score ranging from -1 (entirely negative) to 1 (entirely positive) is assigned to each of the extracted topics. A neutral score means no sentiment or emotion was expressed. The results are then presented on a sentiment analysis dashboard.
Advancements in Calculating Sentiment Score
User-generated content is qualitative in nature, i.e. it can take the form of comments, videos, images, audio, and so on. To calculate a sentiment score for this customer feedback, we first need to quantify it, that is, assign numeric values to it. In social media listening analytics, metrics can be defined as the number of likes, the amount of user-generated content a brand receives (volume), the general sentiment of the content (positive or negative), and the number of shares the content receives (virality).
As data scientists explore new ways to get sentiment scores that are more precise and accurate, there have been vast advancements in calculating sentiment scores. This is a very important step ahead for marketers and professionals in many fields such as finance and banking. This has affected the way we calculate social sentiment scores as well.
- Traditional method
The traditional way of assessing the effectiveness of a brand’s marketing strategy is to quantify consumer actions such as the number of times a video is viewed, or the number of shares, likes, etc. It is assumed that numerical metrics such as ad spend and the number of shares or posts, when compared to each other, indicate the effectiveness of a marketing campaign.
Limitations - Unfortunately, this is a flawed thought process. If a comment or a review video is posted among engagement groups on a social channel, or if it is paid content, the platform’s algorithm will treat the content as high quality, promote it, and so increase its visibility. In such a case, the video has very high click rates simply because it was posted in a “group” or because it was paid for by a brand.
If, as a marketer, you read these engagement numbers as positive sentiment for your content and invest further in similar campaigns, you are walking into unpredictable territory and a risky investment of time and resources. By this logic, irrelevant and inaccurate content can also be inferred to be popular and relevant just because of its click-through metric. This is detrimental to deriving an accurate social sentiment score.
- Content-based sentiment score
To be able to truly understand the effectiveness of an advert campaign or user-generated content for your brand on social platforms, you need to calculate its sentiment score through social media sentiment analysis. If it is surveys or emails, you will need text analysis, and if it’s videos, then a video content analysis platform will fulfill this requirement. But in any case, there is a vital need to assign a numerical emotion score to the content itself.
Let us consider an example for a social sentiment score
Below are two reviews in the hospitality industry by guests of the same hotel.
Customer A writes, “The hotel was great, the staff was friendly and room prices were quite reasonable. I surely recommend them!” ★★★★★ (5/5 stars)
Customer B writes, “My room was cold and we had to wait for hours for the hotel staff to adjust the thermostat, even though the hotel seemed empty. When we tried to call the reception to enquire, they seemed impatient and rude.” ★☆☆☆☆ (1/5 stars)
Content analysis to figure out how to calculate sentiment scores can be done through multiple approaches. Let’s see them in detail.
1. Star Rating-based sentiment score
The simplest way to assign a positive or negative score to the content is by star rating. But when you take millions of reviews, a star rating alone cannot tell you why a brand received a particular rating. This might not be very useful to a discerning customer.
2. Text-based sentiment score
The simplest analysis of text for emotion mining is to train an AI-based natural language processing platform to assign scores to the content based on keywords. The 1st review will be marked as positive because of keywords such as “great”, “friendly”, and “surely recommend”. The 2nd review will be marked as negative based on the words “impatient” and “rude”. Since the ratio of positive to negative reviews is 1:1, the aggregate sentiment score of the hotel will be neutral, as they cancel each other out (50% positive − 50% negative = 0).
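As a rough illustration of the keyword approach, here is a minimal Python sketch. The keyword sets and the function names `label_review` and `aggregate_score` are assumptions made for this example; a production platform would learn its vocabulary from training data instead of using a hand-written list.

```python
import re

# Hypothetical keyword lists, built from the words singled out in the text.
POSITIVE_KEYWORDS = {"great", "friendly", "recommend"}
NEGATIVE_KEYWORDS = {"impatient", "rude"}


def label_review(text):
    """Label a review by comparing positive and negative keyword hits."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE_KEYWORDS for word in words)
    negative_hits = sum(word in NEGATIVE_KEYWORDS for word in words)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return "neutral"


def aggregate_score(labels):
    """Share of positive reviews minus share of negative reviews."""
    if not labels:
        return 0.0
    positive_share = labels.count("positive") / len(labels)
    negative_share = labels.count("negative") / len(labels)
    return positive_share - negative_share


review_a = ("The hotel was great, the staff was friendly and room prices "
            "were quite reasonable. I surely recommend them!")
review_b = ("My room was cold and we had to wait for hours for the hotel staff "
            "to adjust the thermostat, even though the hotel seemed empty. "
            "When we tried to call the reception to enquire, they seemed "
            "impatient and rude.")

labels = [label_review(review_a), label_review(review_b)]
print(labels, aggregate_score(labels))  # ['positive', 'negative'] 0.0
```

One positive and one negative review give 0.5 − 0.5 = 0, which is the neutral aggregate described above.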
However, this approach, too, does not capture the context behind the positive and negative sentiment, that is, the reason why a brand received a particular score. There is no semantic consideration either.
3. Aspect-based sentiment score
For a more detailed and accurate analysis of sentiment to arrive at a precise sentiment score, the content itself needs to be broken down into smaller pieces for scrutiny using aspect-based sentiment analysis. For example, in the 1st review, customer A talks about the staff and the price, while in the 2nd review, customer B talks about the room and the staff. Customer B does not mention the price at all. In both these cases, the algorithm will consider the aspects of price, room, and staff while calculating the aggregated sentiment score.
Here too, machine learning platforms with a sentiment analysis API need to have a semantic understanding of the content, not just a lexical one, to calculate the aggregate social sentiment score. This is important because in many cases, a post needs to be understood in the context it was written in. That’s why companies, many a time, ask reviewers to post photos or ask them to assign star ratings. All these different methods, when combined, offer a more efficient way of calculating sentiment scores.
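To show what the aggregation step of aspect-based scoring might look like, here is a small Python sketch. It assumes that an upstream model has already extracted (aspect, polarity) pairs from each review; the pairs below are hand-written from the two hotel reviews for illustration only, and the function name `aggregate_by_aspect` is hypothetical.

```python
from collections import defaultdict

# Hypothetical output of an upstream aspect extractor: (aspect, polarity)
# pairs, with +1 for a positive mention and -1 for a negative one.
review_a_mentions = [("staff", 1), ("price", 1)]
review_b_mentions = [("room", -1), ("staff", -1)]


def aggregate_by_aspect(reviews):
    """Average the polarity of every mention of each aspect."""
    polarities = defaultdict(list)
    for mentions in reviews:
        for aspect, polarity in mentions:
            polarities[aspect].append(polarity)
    return {aspect: sum(values) / len(values)
            for aspect, values in polarities.items()}


aspect_scores = aggregate_by_aspect([review_a_mentions, review_b_mentions])
print(aspect_scores)  # {'staff': 0.0, 'price': 1.0, 'room': -1.0}
```

Even though the two reviews cancel out overall, the per-aspect view suggests that opinions on staff are mixed, price is seen positively, and the room is seen negatively.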
How To Calculate Sentiment Scores
As a part of sentiment analysis steps, there are many methods that data scientists use to calculate sentiment scores, but before that, they need to clean up the text, a step called data pre-processing, so that it can be analyzed by the engine.
Data pre-processing
Customer reviews and comments usually comprise a sentence or two, which naturally contain punctuation, common function words such as articles (which we call stop words), numbers, and special characters like smileys or ampersands. These need to be removed so that the comment only has what’s necessary for the algorithm to understand. Therefore, we will tokenize and stem the entire sentence or paragraph, as well as eliminate numbers and stop words.
Let’s use customer B’s review from the section above, with a short positive opening added.
“The stay was nice but my room was cold and we had to wait for hours for the hotel staff to adjust the thermostat, even though the hotel seemed empty. When we tried to call the reception to enquire, they seemed impatient and rude.”
- Tokenization - separating into words
“The stay was nice but my room was cold and we had to wait for hours for the hotel staff to adjust the thermostat , even though the hotel seemed empty . When we tried to call the reception to enquire , they seemed impatient and rude .”
- Text normalization - removing non-text items
“The stay was nice but my room was cold and we had to wait for hours for the hotel staff to adjust the thermostat even though the hotel seemed empty When we tried to call the reception to enquire they seemed impatient and rude”
- Word stemming - reducing a word to its root
The stay was nice but my room was cold and we had to wait for hour for the hotel staff to adjust the thermostat even though the hotel seem empty When we tried to call the reception to enquire they seem impatient and rude
- Stop-word removal - eliminating superfluous words
The stay was nice My room cold and we had to wait for hour for the hotel staff to adjust the thermostat even though the hotel seem empty When we tried to call the reception to enquire they seem impatient and rude
Thus, after all the above tasks have been carried out, the pre-processed customer comment will be:
nice room cold wait hour hotel staff reception impatient rude
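The four pre-processing steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the stop-word list is a tiny hand-picked sample, and `crude_stem` is a hypothetical suffix-stripping stand-in for a real stemmer, so the result will not match the hand-worked list above word for word.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"the", "was", "but", "my", "and", "we", "had", "to", "for",
              "even", "though", "when", "they"}


def tokenize(text):
    """Split text into word tokens and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def normalize(tokens):
    """Lowercase the tokens and drop punctuation and numbers."""
    return [token.lower() for token in tokens if token.isalpha()]


def crude_stem(word):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, normalize, stem, then remove stop words."""
    stems = [crude_stem(token) for token in normalize(tokenize(text))]
    return [stem for stem in stems if stem not in STOP_WORDS]


review = ("The stay was nice but my room was cold and we had to wait for "
          "hours for the hotel staff to adjust the thermostat, even though "
          "the hotel seemed empty. When we tried to call the reception to "
          "enquire, they seemed impatient and rude.")

print(preprocess(review))
```

The sketch keeps more words than the ten-word list above (for example "stay" and "thermostat"), because that list appears to retain only the words most relevant to sentiment.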
Now, we count and map the total number of positive and negative word occurrences, along with the intensity of emotion.
Positive (1 word) - nice (+0.07)
Negative (2 words) - impatient (-0.5), rude (-0.7)
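The counting and mapping step can be sketched like this. The intensities are copied from the example above, but the `LEXICON` dictionary and the function `map_sentiment_words` are assumptions for illustration; a real opinion lexicon would contain thousands of entries.

```python
# Intensities copied from the example above; the lexicon itself is a
# hypothetical three-word sample (note that "cold" is not scored here).
LEXICON = {"nice": 0.07, "impatient": -0.5, "rude": -0.7}

tokens = ["nice", "room", "cold", "wait", "hour", "hotel", "staff",
          "reception", "impatient", "rude"]


def map_sentiment_words(tokens, lexicon):
    """Return the positive and negative lexicon hits found in the tokens."""
    hits = [(token, lexicon[token]) for token in tokens if token in lexicon]
    positive = [(word, score) for word, score in hits if score > 0]
    negative = [(word, score) for word, score in hits if score < 0]
    return positive, negative


positive, negative = map_sentiment_words(tokens, LEXICON)
print(len(positive), positive)  # 1 [('nice', 0.07)]
print(len(negative), negative)  # 2 [('impatient', -0.5), ('rude', -0.7)]
```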
Calculating the Sentiment Score
As mentioned earlier, there are many approaches that can be used for calculating sentiment scores. Let’s look at them.
1. Word count method
Based on the opinion lexicon we used in the above example, a simple way to calculate the sentiment score would be to subtract the number of negative occurrences from the number of positive occurrences, as below.
1 - 2 = -1
Thus, the sentiment score of the above example is -1. This is a simplistic way of calculating the sentiment score. For longer, more complex sentences, we would use a different method.
2. Deducing sentiment score with the length of the sentence
We subtract the number of negative words from the number of positive words and divide the result by the total number of words in the review sentence.
(1 - 2) / 42 = -0.0238095
Although this system is better for longer reviews, the resulting score can be very small and so difficult to tell apart when done at scale. For this reason, many data scientists choose to multiply the scores by a single-digit factor to get larger values that make comparison easier.
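As a worked example of the length-based method, the short Python sketch below reproduces the calculation with the counts used in the text (1 positive word, 2 negative words, 42 words in total). The function name and the optional `scale` parameter are assumptions for illustration.

```python
def length_normalized_score(positive_count, negative_count, total_words,
                            scale=1):
    """(positive - negative) / total words, optionally scaled up."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return scale * (positive_count - negative_count) / total_words


score = length_normalized_score(1, 2, 42)
print(round(score, 7))  # -0.0238095

# Multiplying by a constant factor makes small scores easier to compare.
print(length_normalized_score(1, 2, 42, scale=10))
```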
3. Ratio of +ve and -ve words counts
In this method, we divide the total number of positive words by the total number of negative words plus one.
1 / (2 + 1) = 0.33333
In this method, the ratio goes some way towards normalizing for the total length of the text. This is useful because the longer the review is, the higher the counts of positive and negative words tend to be. A score around 1 is treated as neutral. Data scientists consider this method of calculating sentiment scores to be more balanced.
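The ratio method can be checked with a few lines of Python. The function name `ratio_score` is an assumption for this sketch; the counts are the ones from the example review.

```python
def ratio_score(positive_count, negative_count):
    """Positive word count divided by (negative word count + 1)."""
    # Adding 1 to the denominator also avoids division by zero when a
    # review contains no negative words at all.
    return positive_count / (negative_count + 1)


print(round(ratio_score(1, 2), 5))  # 0.33333
```

A score well below 1, as here, points to a review that leans negative.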
How Do Sentiment Scores Help Brands?
There’s a lot of theory in this article and a lot of detail as to how sentiment scores are calculated. But for us to truly comprehend how these mathematical algorithms help businesses in everyday life, we need to look at some real-world examples of how sentiment scores have helped different industries.
See how sentiment scores have helped different companies.
We hope this non-technical guide has been useful to you and your team as you embark on your mission to scout the market for a highly efficient, accurate, and precise sentiment analysis platform.