There are few intellectual exercises more demanding (or more burdensome…) than studying variation in language. Most modern English dictionaries estimate the language to contain no fewer than 170,000 words, a figure that continues to evolve as some terms fall into obsolescence while others are continually introduced (often to the dismay of those who find Generation Alpha brain rot slang less than inspiring…). Matters become even more complicated when we consult a thesaurus and recognize that nearly every word possesses multiple synonyms, each differing in appropriateness depending on meaning, tense, register, and broader syntactic context.
For the moment, we will set aside many of these complexities until we turn to vectorized word embeddings in high-dimensional spaces. For now, let us assume that words – irrespective of tense or usage – possess localized and singular meanings. Consider the following sentence:
\[ \text{I loved the new restaurant; the food was delicious and the service was fantastic} \] After reducing complexity using the same approach as our previous class, we are left with love new restaurant food delicious service fantastic. While we might have been able to infer from just reading the original sentence that the implied rhetorical sentiment (i.e., the attitudinal or affective quality of text) was certainly positive (as opposed to neutral, negative, or some alternative classification), the reduced sentence leaves us with three words (love, delicious, & fantastic) that still reflect a positive sentiment, while restaurant, food, and service do not relay a perceivable sentiment. In its most basic form, we can simply assert that because there are (3) positive words and (0) negative words, the sentence is positive!
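The counting logic above can be sketched directly in base R. The two word lists below are a toy, hand-labeled lexicon for illustration only, not a real dictionary:

```r
# Toy lexicon: hand-labeled positive and negative terms (illustrative only)
positive <- c("love", "delicious", "fantastic")
negative <- c("disappointment", "confuse", "poor")

# The reduced restaurant sentence from above
tokens <- c("love", "new", "restaurant", "food", "delicious", "service", "fantastic")

n_pos <- sum(tokens %in% positive)  # 3 matched positive terms
n_neg <- sum(tokens %in% negative)  # 0 matched negative terms

# Classify by simple majority of matched terms
label <- if (n_pos > n_neg) "positive" else if (n_neg > n_pos) "negative" else "neutral"
label  # "positive"
```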
Now let’s consider another sentence:
\[ \text{The movie was a disappointment; the plot was confusing and the acting was poor} \] Much like the first sentence, we could probably infer the negative intent, but removing complexity again further reinforces our belief (movie disappointment plot confuse act poor). The terms disappointment, confuse, and poor offer a preponderance of evidence that the sentence’s implied sentiment is negative.
But what about a sentence like this:
\[ \text{The concert had amazing visuals and energy, but the sound quality was inconsistent} \] which reduces to concert amaze visual energy sound quality inconsistent. Suddenly our ability to assess sentiment at face value is more challenging – surely we could argue that the author enjoyed the concert, but it is not a full-throated endorsement given the inconsistent sound quality. Assessing sentiment at the word level now requires greater diligence: the sentence contains at least one positive term (amaze) and one negative term (inconsistent), and whether and how we assign meaning to the remaining terms with debatable classifications (energy & quality) could have an important impact on our ability to classify this sentence correctly, as well as any other sentence where we employ that approach.
This approach is at the core of dictionary methods – perhaps the best introductory approach to understanding how we can derive meaning (e.g., rhetorical sentiment) from text. In short, dictionary methods rely on a pre-defined lexicon where each word (or n-gram) is assigned to one or more categories, such as positive or negative sentiment. The strength of this approach lies in having a large, carefully curated vocabulary whose classifications are both theoretically grounded and defensible (i.e., terms that you define as positive or negative truly are such). By counting the frequency of words in each category, we can estimate the presence and intensity of particular meanings or sentiments across sentences, passages, and documents.
Before I demonstrate a handful of the existing dictionary methods in R, I want to emphasize (4) key points about dictionary classifiers (which I already touch on briefly in the previous note):
Dictionary classifiers, while rather simple and intuitive, can nevertheless provide robust classification power. Especially among documents that retain authentic discursive properties – i.e., where authors refrain from literary tools like misdirection, sarcasm, or other inauthentic speech – a dictionary strategy provides considerable upside for relatively little technical effort.
However, much of that validity relies on the ability of researchers to construct dense lexicons. Much like anything in statistical analysis, small-N problems easily permeate dictionary approaches when the size or variety of the lexicon is insufficient to capture the variance of the documents themselves.
Moreover, while a dense dictionary is certainly important, it is paramount to construct categorical classifications (e.g., positive or negative) that are both authentic and defensible. For example, terms like happy and joyous are (almost always) associated with positive sentiments, while terms like depressing or hateful are certainly more negative. Misclassifying these terms will surely impact your ability to make valid classifications.
Finally, and in a similar vein, dictionary methods are predictably plagued by the same concerns as the bag of words more generally. In particular, words have associated meaning based on tense and conditional usage. For instance, happy is certainly positive – but what if the sentence actually reads I was not happy at all? Suddenly that very negative sentence may be misclassified as positive based solely on the classification label for happy. Many of these concerns (as with other areas concerning the bag of words) can be mitigated with more robust methods, including the incorporation of negators (not) and n-grams.
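The negation fix described above can be sketched in base R by looking one token back before scoring a sentiment word. The word lists here are illustrative, not a real lexicon:

```r
# Toy negation handling: flip the score of a sentiment word preceded by a negator
positive <- c("happy")
negators <- c("not", "never", "no")

score_tokens <- function(tokens) {
  score <- 0
  for (i in seq_along(tokens)) {
    if (tokens[i] %in% positive) {
      s <- 1
      # If the preceding token is a negator, flip the sign
      if (i > 1 && tokens[i - 1] %in% negators) s <- -1
      score <- score + s
    }
  }
  score
}

score_tokens(c("i", "was", "happy"))         # scores +1
score_tokens(c("i", "was", "not", "happy"))  # scores -1
```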
Below I provide a handful of sandboxed dictionary methods in R – all of which will rely on the same collection of strings. In particular, I will focus on just three methods: Bing (Word \(\rightarrow\) Class), AFINN (Word \(\rightarrow\) Scalar), and sentimentr (Sentiment = Polarity \(\times\) Valence).
library(dplyr) # Pipes, tibble(), and verbs

strings <- c('This decision is excellent, fair, and clearly the right outcome',
             'The opinion is good and persuasive, even if it is not perfect',
             'The ruling has some good points but also several serious flaws',
             'The decision is bad and poorly reasoned',
             'This opinion is terrible, deeply unfair, and completely wrong')

strings <- tibble(
  doc_id = seq_along(strings), # Document IDs
  text = strings)              # Convert to tibble

strings_tokens <- strings %>%
  tidytext::unnest_tokens(word, text) # One token (word) per row
head(strings_tokens)
## # A tibble: 6 × 2
## doc_id word
## <int> <chr>
## 1 1 this
## 2 1 decision
## 3 1 is
## 4 1 excellent
## 5 1 fair
## 6 1 and
The Bing Dictionary is a sentiment lexicon widely used for binary sentiment classification. It assigns words to one of two categories (positive or negative) and contains roughly 6,800 sentiment-labeled terms (approx. 2,000 positive and 4,800 negative).
bing_dictionary <- tidytext::get_sentiments("bing")
bing_dictionary %>%
  group_by(sentiment) %>%
  slice_sample(n = 3) %>%
  ungroup() %>%
  arrange(sentiment) # Sample BING
## # A tibble: 6 × 2
## word sentiment
## <chr> <chr>
## 1 disapprove negative
## 2 forbidden negative
## 3 egotistical negative
## 4 complements positive
## 5 galore positive
## 6 appreciate positive
strings_tokens %>%
  inner_join(tidytext::get_sentiments(lexicon = 'bing'), by = 'word') %>%
  count(doc_id, sentiment) %>%
  left_join(strings, by = 'doc_id') %>%
  select(text, sentiment, n)
## # A tibble: 7 × 3
## text sentiment n
## <chr> <chr> <int>
## 1 This decision is excellent, fair, and clearly the right outco… positive 4
## 2 The opinion is good and persuasive, even if it is not perfect positive 2
## 3 The ruling has some good points but also several serious flaws negative 1
## 4 The ruling has some good points but also several serious flaws positive 1
## 5 The decision is bad and poorly reasoned negative 2
## 6 The decision is bad and poorly reasoned positive 1
## 7 This opinion is terrible, deeply unfair, and completely wrong negative 2
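To collapse these per-category counts into a single label per document, we can pivot the positive/negative counts into columns and take their difference. The net-score and labeling step here is my own aggregation on top of the Bing counts, not something the lexicon itself provides:

```r
library(dplyr)
library(tidyr)

bing_net <- strings_tokens %>%
  inner_join(tidytext::get_sentiments("bing"), by = "word") %>%
  count(doc_id, sentiment) %>%
  # One column per sentiment; documents lacking a category get 0
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(net = positive - negative,  # positive minus negative counts
         label = case_when(net > 0 ~ "positive",
                           net < 0 ~ "negative",
                           TRUE    ~ "neutral"))
bing_net
```

Note that the third sentence (one good point, one flaw) nets to zero and lands in the neutral bucket – exactly the ambiguity the concert example raised earlier.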
The AFINN lexicon is designed to capture both the direction and intensity of sentiment. Unlike binary lexicons such as Bing, AFINN assigns each word a numeric sentiment score, typically ranging from –5 (strongly negative) to +5 (strongly positive). The dictionary contains roughly 2,500–3,400 terms.
afinn_dictionary <- tidytext::get_sentiments("afinn") # Afinn Dictionary
afinn_dictionary %>%
  group_by(value) %>%
  slice_sample(n = 1) %>%
  ungroup() %>%
  arrange(value) %>%
  print(n = 11) # Sample Words (Value -5 to 5)
## # A tibble: 11 × 2
## word value
## <chr> <dbl>
## 1 bastards -5
## 2 pissed -4
## 3 catastrophe -3
## 4 touting -2
## 5 overweight -1
## 6 some kind 0
## 7 attracted 1
## 8 inspires 2
## 9 gallant 3
## 10 masterpiece 4
## 11 superb 5
strings_tokens %>%
  inner_join(tidytext::get_sentiments(lexicon = 'afinn'), by = 'word') %>%
  count(doc_id, value) %>%
  left_join(strings, by = 'doc_id') %>%
  select(text, value, n)
## # A tibble: 8 × 3
## text value n
## <chr> <dbl> <int>
## 1 This decision is excellent, fair, and clearly the right outcome 1 1
## 2 This decision is excellent, fair, and clearly the right outcome 2 1
## 3 This decision is excellent, fair, and clearly the right outcome 3 1
## 4 The opinion is good and persuasive, even if it is not perfect 3 2
## 5 The ruling has some good points but also several serious flaws 3 1
## 6 The decision is bad and poorly reasoned -3 1
## 7 This opinion is terrible, deeply unfair, and completely wrong -3 1
## 8 This opinion is terrible, deeply unfair, and completely wrong -2 2
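Because AFINN is scalar, a document-level score is simply the value-weighted sum of matched terms. The summing step is my own aggregation, and note that `get_sentiments("afinn")` prompts a one-time download via the textdata package on first use:

```r
library(dplyr)

afinn_scores <- strings_tokens %>%
  inner_join(tidytext::get_sentiments("afinn"), by = "word") %>%
  group_by(doc_id) %>%
  summarise(afinn_score = sum(value)) %>%  # value-weighted sum per document
  left_join(strings, by = "doc_id") %>%
  select(doc_id, text, afinn_score)
afinn_scores
```

From the counts above, the first sentence scores 1 + 2 + 3 = 6, while the last scores (-3) + 2(-2) = -7 – intensity, not just direction.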
The sentimentr package improves on simple dictionary classifiers by explicitly modeling context and valence shifters. Rather than relying solely on raw word counts, sentimentr computes sentiment at the sentence level, accounting for how nearby words modify sentiment expression. At its core, it uses a Bing-style polarity lexicon (where words have a singular classification) combined with rule-based adjustments for negators (e.g., not, never), amplifiers (e.g., very, extremely), and adversative conjunctions (e.g., but). The result is the base polarity (a Bing-style score) times a valence multiplier that adjusts for sentence heuristics (e.g., good = 1x, not good = -1x, barely good = 0.5x) – negation doesn’t invariably flip the sign; rather, it scales the intensity.
strings %>%
  mutate(sentiment = sentimentr::sentiment_by(text)$ave_sentiment)
## # A tibble: 5 × 3
## doc_id text sentiment
## <int> <chr> <dbl>
## 1 1 This decision is excellent, fair, and clearly the right outc… 0.885
## 2 2 The opinion is good and persuasive, even if it is not perfect 0
## 3 3 The ruling has some good points but also several serious fla… -0.460
## 4 4 The decision is bad and poorly reasoned -0.340
## 5 5 This opinion is terrible, deeply unfair, and completely wrong -0.667
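The valence-shifter behavior is easiest to see on toy inputs of my own construction. Exact magnitudes depend on sentimentr's lexicon version, so the signs and relative ordering are the takeaway here, not the specific values:

```r
library(sentimentr)

# Base, negated, and amplified versions of the same sentiment word
s <- sentiment_by(c("The food was good",
                    "The food was not good",
                    "The food was very good"))$ave_sentiment
s
```

Negation pushes the score below zero, while the amplifier pushes it above the unmodified baseline – the polarity-times-valence logic described above.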