---
title: "Comparing Sentiment Analysis Models in R"
author: "Alex Crest & Aybuke Atalay"
date: "`r Sys.Date()`"
output: html_document
editor_options:
  markdown:
    wrap: 72
---
# Introduction
Hello World! I hope you are all ready to get stuck in with sentiment.
(positive: 0.071, neutral: 0.857, negative: 0) Today, we are looking to
build our knowledge, and to (positive: 0, neutral: 1, negative: 0)
provide you with the tools to conduct sentiment analysis in your own
research. (positive: 0, neutral: 1, negative: 0)
This is the R Markdown file for Comparing Sentiment Analysis in R. It is
designed to be used as part of a workshop, along with a presentation.
Please refer to the associated presentation for introduction and initial
guidance.
Before the session starts, make sure you have R 4.5.3 installed and
running!
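```{r}
# Check which version of R you are running
R.version.string
# Optional: installr is designed for Windows and can update R from within R
install.packages("installr")
library(installr)
# updateR()  # uncomment to update R
```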
# Schedule
Session start: 14:05

- 14:05 - 14:10: 1) Housekeeping
- 14:10 - 14:20: 2) Introduction to Sentiment Analysis: What, Why, How?
- 14:20 - 15:50: 3) A Comparative Look at Three Examples (with a 10-minute break)
- 15:50 - 16:00: 4) Final Q and A

## 1) Housekeeping and Set Up
### Aims
- Understanding what sentiment analysis is and its utility
- Processing data
- Understanding three packages: sentimentr, syuzhet, and quanteda
- Building skills in R
- Summarising and comparing sentiment results with other variables
- Saving data, including to plots, .csv, and .json

### Setting
The work will be completed between this markdown file and your own local
R environment.

1. Open RStudio.
2. Go to File → New Project → Version Control → Git.
3. Paste the repo URL from Teams.
4. Choose a folder.
5. Click Create Project.

We're ready to go!

## 2) Introduction to Sentiment Analysis: What, Why, How?
Focus on the presentation, nothing needed here just now!
## 3) Practical: A Comparative look at Three Examples
Before any text analysis, your data needs to be preprocessed. Today we
will be working with three datasets:

- A collection of book reviews from Amazon (courtesy of Hou et al. 2024).
- The Gettysburg Address, courtesy of Abraham Lincoln.
- The inaugural addresses of US presidents from 1900 onwards, already
  present in the quanteda package.

We will be using three different tools to assess these:

- sentimentr: the Amazon reviews, lots of short texts
- syuzhet: the Gettysburg Address, a single narrative
- quanteda: the inaugural addresses, political speeches with multiple
  subjects

The `tidyverse` gives us a variety of very useful tools for tidying
data, some of which are essential before sentiment analysis can begin.
From within this collection we are mostly going to use `purrr`, `readr`,
and `ggplot2`.

We will be using the following packages today. Some of these are already
installed in Noteable, so run the cell below only if you are working
locally and have never used any of these packages before:
```{r}
install.packages(c("tidyverse", "sentimentr", "jsonlite", "syuzhet", "quanteda", "tokenizers", "textdata"))
```
Run this if you are working on Noteable
```{r}
install.packages(c("sentimentr","syuzhet","textdata"))
```
```{r}
library(tidyverse)
library(sentimentr)
library(syuzhet)
library(jsonlite)
library(quanteda)
library(textdata)
library(tokenizers)
```
If you have any issues installing these, let us know now, and we can fix
the issue promptly.
## Accessing the data
Now that we have our environment, we need to load the data into our
workspace. Usually, we would have to find it in an online repository, or
collect or build it ourselves. However, today, the Amazon reviews and
the Gettysburg Address are already in the project directory. We can also
load the presidential addresses.

When you cloned the repository, two datasets should have come with the
rest of the files. The Amazon dataset is called 'Data_Books_Lite.json'
and contains reviews from Amazon on different books.
```{r}
Amazon_Reviews <- fromJSON("Data_Books_Lite.json")
```
The Gettysburg Address is stored in 'Gettysburg.txt':
```{r}
Gettysburg_Address <- tibble(text = read_lines("Gettysburg.txt"))
```
The inaugural addresses are a dataset that already exists in quanteda!
We just need to select the relevant data.
```{r}
Inaugural_Address_Post_1900 <- corpus_subset(
  data_corpus_inaugural, Year >= 1900
)
```
Now convert the object to a data frame with its associated metadata:
```{r}
Inaugural_Addresses <- data.frame(
  Year = docvars(Inaugural_Address_Post_1900, "Year"),
  President = docvars(Inaugural_Address_Post_1900, "President"),
  Text = as.character(Inaugural_Address_Post_1900)
)
```
You will notice that each medium has different characteristics (long
speech, short phrases, punctuation etc.) that need to be considered for
the type of sentiment analysis.
It is best practice to also create a backup of your data in case any
step goes wrong.
```{r}
Backup_Inaugural_Addresses <- Inaugural_Addresses
```
```{r}
Backup_Amazon_Reviews <- Amazon_Reviews
```
Create backups of your other datasets in the same way and run these
locally.

To reload a backup, simply reverse the assignment:
```{r}
Amazon_Reviews <- Backup_Amazon_Reviews
```
There are three steps to sentiment analysis: text preprocessing,
applying the analysis, and interpreting the results.

Thankfully, all of our datasets are structured as data frames, so our
preprocessing is going to be text-specific. They are also clean, so this
is a relatively painless process. We need to transform raw text into a
more consistent and analysable form before applying sentiment analysis.
We are going to go through steps 1 - 3 for each dataset on its own, and
then compare in more detail.
# Amazon Reviews
This is a big dataset, larger than a single researcher could reasonably
qualitatively review. Note that the methods today can be scaled up for
larger datasets still!
First, let's **inspect** the data to understand its composition
```{r}
head(Amazon_Reviews)
colnames(Amazon_Reviews)
sapply(Amazon_Reviews, class)
```
We will keep all the columns for now, as they provide useful variables
for comparison later on. The column we are concerned with for sentiment
is 'text', containing the raw text data.

In the following chunk we will **preprocess** the data: reduce all text
to lowercase, and remove extra spaces. We are going to use `mutate()`,
which creates a new column from the instructions that follow it. The
pipe, `%>%`, passes the result of one step on to the next.
```{r}
Amazon_Reviews <- Amazon_Reviews %>%
  mutate(
    Text_Processed = text %>%
      str_to_lower() %>%
      str_squish()
  )
```
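To see what these two steps do, here is a minimal base-R sketch on a made-up review. The string `raw_review` is invented for illustration, and `tolower()`, `gsub()`, and `trimws()` stand in for `str_to_lower()` and `str_squish()`; it approximates the same behaviour rather than reproducing the stringr internals.

```{r}
# Hypothetical example text (not taken from the Amazon dataset)
raw_review <- "  This   book was\n GREAT!  "

# Lowercase, collapse runs of whitespace into single spaces, trim the ends
clean_review <- trimws(gsub("\\s+", " ", tolower(raw_review)))

print(clean_review)
```

Lowercasing matters because most sentiment lexicons store their terms in lowercase, so "GREAT" and "great" should be treated as the same word.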
Now we calculate the sentiment score column.
```{r}
review_sentiment <- sentiment_by(Amazon_Reviews$Text_Processed)
```
sentimentr calculates sentiment by sentence automatically, and then
combines these scores into an average for the document.
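The idea of "score each sentence, then average" can be sketched in a few lines of base R. Everything here is an assumption made for illustration: `toy_lexicon`, `score_sentence()`, and the example sentences are hypothetical, and the scoring rule is far simpler than the one sentimentr applies, which also accounts for valence shifters such as negators and amplifiers.

```{r}
# Hypothetical toy lexicon: NOT the lexicon that sentimentr uses
toy_lexicon <- c(love = 1, great = 1, good = 1, boring = -1, bad = -1)

# Score one sentence: sum of matched word scores divided by the word count
score_sentence <- function(sentence) {
  cleaned <- tolower(gsub("[^A-Za-z ]", "", sentence))
  words <- strsplit(cleaned, " +")[[1]]
  words <- words[words != ""]
  if (length(words) == 0) return(0)
  matched <- toy_lexicon[words]  # NA for words missing from the lexicon
  sum(matched, na.rm = TRUE) / length(words)
}

# One made-up review, already split into sentences
review_sentences <- c(
  "I love great books.",
  "The ending was boring.",
  "It arrived on Tuesday."
)

sentence_scores <- vapply(review_sentences, score_sentence, numeric(1))
review_score <- mean(sentence_scores)  # document score = average of sentences

print(sentence_scores)
print(review_score)
```

The positive first sentence outweighs the negative second one, and the neutral third sentence pulls the average towards zero.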
We need to add those averages back into our data:
```{r}
Amazon_Reviews <- Amazon_Reviews %>%
  mutate(
    sentimentr_score = review_sentiment$ave_sentiment,
    sentiment_label = case_when(
      sentimentr_score > 0 ~ "positive",
      sentimentr_score < 0 ~ "negative",
      TRUE ~ "neutral"
    )
  )
```
You have results for each row! This produces a sentiment score for each
review, along with a categorical label indicating whether the review is
positive, negative, or neutral.
To begin **interpreting** the data, find the mean across all reviews:
```{r}
sentiment_mean <- Amazon_Reviews %>%
  summarise(
    mean_sentiment = mean(sentimentr_score, na.rm = TRUE)
  )
```
```{r}
print(sentiment_mean)
```
This indicates the overall polarity of the review corpus. Values above
zero suggest a generally positive set of reviews, while values below
zero suggest a more negative corpus.
We can now calculate the average sentiment for each book by grouping
reviews according to `meta_title`.
```{r}
book_sentiment <- Amazon_Reviews %>%
  group_by(meta_title) %>%
  summarise(
    mean_net_sentiment = mean(sentimentr_score, na.rm = TRUE),
    review_count = n()
  ) %>%
  ungroup()
```
You can manually sift through the results for each book. This dataset
already contains precisely 10 reviews per book.

Now, we are going to investigate the extremes of our data: the books
with the highest and lowest sentiment.
```{r}
# Three books with the highest mean sentiment
top_3 <- book_sentiment %>%
  arrange(desc(mean_net_sentiment)) %>%
  slice_head(n = 3)
# Three books with the lowest mean sentiment
bottom_3 <- book_sentiment %>%
  arrange(mean_net_sentiment) %>%
  slice_head(n = 3)
plot_data <- bind_rows(top_3, bottom_3)
```
If you have worked in R before, you may have used ggplot or ggplot2 to
plot your data. Sentiment can be plotted much as any other variable, but
here we will guide you through best practices and appropriate plots for
the models we have used.
Below, we will create a bar chart of the sentiment scores
```{r}
ggplot(plot_data,
       aes(y = reorder(str_c(word(meta_title, 1, 3), "..."),
                       mean_net_sentiment),
           x = mean_net_sentiment)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Extreme Net Sentiment by Book",
    x = "Mean Net Sentiment",
    y = "Book"
  )
```
Now **save and export** the graph
```{r}
ggsave("Extreme_Net_Sentiment_by_Book.png")
```
# The Gettysburg Address
This is a much simpler operation, as we have one row and one object that
we are going to analyse. This time we are going to divide the text up by
sentence, to track sentiment as it develops through Lincoln's words.
```{r}
# Both sentimentr and syuzhet export get_sentences(), so name the package explicitly
sentences <- syuzhet::get_sentences(Gettysburg_Address$text)
sentiment_values <- get_sentiment(sentences, method = "syuzhet")
sentiment_df <- data.frame(
  sentence = sentences,
  sentiment = sentiment_values,
  index = seq_along(sentences)
)
```
Now the trajectory of sentiment through the text can be plotted
```{r}
ggplot(sentiment_df, aes(x = index, y = sentiment)) +
  geom_line() +
  geom_point() +
  labs(
    title = "Sentiment Trajectory of the Gettysburg Address",
    x = "Sentence Number",
    y = "Sentiment"
  ) +
  theme_minimal()
ggsave("gettysburg_sentiment_plot.png")
```
It is so much simpler! Moreover, this can be combined with other
analyses and compared with other speeches! We can also assess sentiment
by theme, by position in the speech, and by other qualitative variables.
# Inaugural Presidential Addresses
For political speeches with multiple subjects, we will be using the
quanteda package. This time, our analysis is going to be more complex.

We need to convert the texts into a corpus first, so our package can
handle them:
```{r}
corp <- corpus(Inaugural_Addresses, text_field = "Text")
```
Then tokenise the corpus, remove extraneous data, and create a
document-feature matrix:
```{r}
toks <- tokens(
  corp,
  remove_punct = TRUE,
  remove_numbers = TRUE,
  remove_symbols = TRUE
)
dfmat <- dfm(toks)
```
Count total words in each speech
```{r}
word_counts <- ntoken(toks)
```
Apply the built-in LSD2015 dictionary (the Lexicoder Sentiment Dictionary) to the data:
```{r}
sent_dfm <- dfm_lookup(dfmat, dictionary = data_dictionary_LSD2015)
```
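Conceptually, a dictionary lookup counts how many tokens in each document fall under each dictionary category. The base-R sketch below shows that idea with a hypothetical two-category dictionary and two invented documents; neither comes from LSD2015 or the inaugural corpus, and `dfm_lookup()` itself does more, such as matching wildcard patterns.

```{r}
# Hypothetical mini-dictionary (an assumption, not LSD2015)
toy_dictionary <- list(
  positive = c("hope", "freedom", "peace"),
  negative = c("war", "fear", "crisis")
)

# Two invented, already-cleaned documents
toy_docs <- c(
  speech_a = "we hope for peace and freedom",
  speech_b = "war and crisis bring fear but also hope"
)

# Split each document into word tokens
toy_tokens <- strsplit(toy_docs, " ", fixed = TRUE)

# For each document, count the tokens that appear in each category
toy_counts <- t(sapply(toy_tokens, function(tokens) {
  sapply(toy_dictionary, function(words) sum(tokens %in% words))
}))

print(toy_counts)
```

The result has one row per document and one column per category, which is the same shape that the lookup above produces for the speeches.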
## Understanding polarity
Option 1. Raw Sentiment
```{r}
sentiment_results <- convert(sent_dfm, to = "data.frame") %>%
  mutate(
    Year = docvars(corp, "Year"),
    President = docvars(corp, "President"),
    net_sentiment = positive - negative
  )
```
Option 2. Normalised
```{r}
sentiment_results <- convert(sent_dfm, to = "data.frame") %>%
  mutate(
    Year = docvars(corp, "Year"),
    President = docvars(corp, "President"),
    word_count = word_counts,
    net_sentiment = positive - negative,
    normalized_sentiment = net_sentiment / word_count
  )
```
These scores have been **normalised** by dividing them by the total
tokens in each speech/document. This is a crucial step in providing
comparable data across a corpus. Otherwise, the presidents who spoke
more would simply have larger raw scores. That difference in length is
itself a finding.
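A small worked example shows why this matters. The counts below are invented for two hypothetical speeches (they are not real inaugural data); the column names simply mirror the ones used above.

```{r}
# Invented counts for two hypothetical speeches
toy_speeches <- data.frame(
  speech = c("short_speech", "long_speech"),
  positive = c(30, 90),
  negative = c(10, 60),
  word_count = c(500, 3000)
)

# Raw net sentiment: the longer speech looks more positive (30 vs 20)
toy_speeches$net_sentiment <- toy_speeches$positive - toy_speeches$negative

# Normalised: net sentiment per word reverses the ordering (0.01 vs 0.04)
toy_speeches$normalised_sentiment <-
  toy_speeches$net_sentiment / toy_speeches$word_count

print(toy_speeches)
```

On raw scores the long speech ranks first, but per word the short speech is four times as positive.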
```{r}
ggplot(sentiment_results,
       aes(x = Year, y = normalized_sentiment)) +
  geom_col() +
  theme_minimal() +
  labs(
    title = "Normalised Net Sentiment in Post-1900 Inaugural Addresses",
    x = "Year",
    y = "Net Sentiment per Word"
  )
```
# Comparison
We now have three different datasets, analysed in three slightly
different ways by virtue of the different packages we applied. Each has
different results. Now we are going to return to our Amazon reviews and
compare the results when all three packages work on just one dataset.
## Run quanteda
Develop the Corpus
```{r}
corp_amazon <- corpus(Amazon_Reviews, text_field = "Text_Processed")
```
Great, now tokenize.
```{r}
toks_amazon <- tokens(
  corp_amazon,
  remove_punct = TRUE,
  remove_numbers = TRUE,
  remove_symbols = TRUE
)
dfmat_amazon <- dfm(toks_amazon)
```
Run the analysis, and combine with the original dataset
```{r}
sent_dfm_amazon <- dfm_lookup(dfmat_amazon, dictionary = data_dictionary_LSD2015)
sentiment_quanteda <- convert(sent_dfm_amazon, to = "data.frame") %>%
  mutate(
    net_sentiment_q = positive - negative,
    normalised_sentiment_q = net_sentiment_q / ntoken(toks_amazon)
  )
Amazon_Reviews <- bind_cols(
  Amazon_Reviews,
  sentiment_quanteda %>% select(net_sentiment_q, normalised_sentiment_q)
)
```
## Calculate the correlation between the two scores
```{r}
cor(
  Amazon_Reviews$sentimentr_score,
  Amazon_Reviews$normalised_sentiment_q,
  use = "complete.obs"
)
```
What is the utility of doing this?
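One possible answer: the two packages score on different scales, so their raw values cannot be compared directly, but a Pearson correlation is unaffected by rescaling. The sketch below uses two invented score vectors (`score_a` and `score_b` are hypothetical, not output from either package) to show that multiplying one of them by 100 leaves the correlation unchanged.

```{r}
# Invented scores from two hypothetical methods on the same five reviews
score_a <- c(0.10, -0.20, 0.35, 0.00, 0.50)
score_b <- c(0.02, -0.01, 0.03, 0.01, 0.04)

# Pearson correlation on the original scales
r_original <- cor(score_a, score_b)

# Rescale one method by a factor of 100 and correlate again
r_rescaled <- cor(score_a, score_b * 100)

# TRUE: the correlation depends on the pattern, not on the units
print(isTRUE(all.equal(r_original, r_rescaled)))
```

This is why correlation is a reasonable first check of whether two methods agree about which reviews are more positive, even when their numbers look very different.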
## Visualise them
```{r}
ggplot(Amazon_Reviews,
       aes(x = sentimentr_score, y = normalised_sentiment_q)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm") +
  labs(
    title = "Comparison of Sentiment Methods",
    x = "Sentimentr Score",
    y = "Quanteda Normalized Sentiment"
  )
```
## Apply syuzhet to complete the triad
```{r}
Amazon_Reviews <- Amazon_Reviews %>%
  mutate(
    syuzhet_score = get_sentiment(Text_Processed, method = "syuzhet")
  )
```
Visualise the results
```{r}
ggplot(Amazon_Reviews) +
  geom_density(aes(x = sentimentr_score, linetype = "sentimentr")) +
  geom_density(aes(x = normalised_sentiment_q, linetype = "quanteda")) +
  geom_density(aes(x = syuzhet_score, linetype = "syuzhet")) +
  coord_cartesian(xlim = c(-1, 1)) +
  labs(
    title = "Distribution of Sentiment Scores Across Methods",
    x = "Sentiment Score",
    y = "Density",
    linetype = "Method"
  ) +
  theme_minimal()
```
## Compare them all
```{r}
comparison_books <- Amazon_Reviews %>%
  group_by(meta_title) %>%
  summarise(
    sentimentr_mean = mean(sentimentr_score, na.rm = TRUE),
    quanteda_mean = mean(normalised_sentiment_q, na.rm = TRUE),
    syuzhet_mean = mean(syuzhet_score, na.rm = TRUE),
    review_count = n()
  ) %>%
  ungroup()
```
You can inspect it with:
```{r}
print(comparison_books)
```
Or plot one method against another:
```{r}
ggplot(comparison_books, aes(x = sentimentr_mean, y = quanteda_mean)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(
    title = "Book-Level Sentiment: Sentimentr vs Quanteda",
    x = "Mean Sentimentr Score",
    y = "Mean Quanteda Score"
  ) +
  theme_minimal()
```
And similarly for syuzhet:
```{r}
ggplot(comparison_books, aes(x = sentimentr_mean, y = syuzhet_mean)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(
    title = "Book-Level Sentiment: Sentimentr vs Syuzhet",
    x = "Mean Sentimentr Score",
    y = "Mean Syuzhet Score"
  ) +
  theme_minimal()
```
### Compare rankings of books across methods
```{r}
top_3_sentimentr <- comparison_books %>%
  arrange(desc(sentimentr_mean)) %>%
  slice_head(n = 3)
bottom_3_sentimentr <- comparison_books %>%
  arrange(sentimentr_mean) %>%
  slice_head(n = 3)
```
```{r}
top_3_quanteda <- comparison_books %>%
  arrange(desc(quanteda_mean)) %>%
  slice_head(n = 3)
bottom_3_quanteda <- comparison_books %>%
  arrange(quanteda_mean) %>%
  slice_head(n = 3)
```
```{r}
top_3_syuzhet <- comparison_books %>%
  arrange(desc(syuzhet_mean)) %>%
  slice_head(n = 3)
bottom_3_syuzhet <- comparison_books %>%
  arrange(syuzhet_mean) %>%
  slice_head(n = 3)
```
```{r}
book_rankings <- comparison_books %>%
  mutate(
    rank_sentimentr = rank(-sentimentr_mean, ties.method = "first"),
    rank_quanteda = rank(-quanteda_mean, ties.method = "first"),
    rank_syuzhet = rank(-syuzhet_mean, ties.method = "first")
  ) %>%
  select(meta_title, rank_sentimentr, rank_quanteda, rank_syuzhet)
print(book_rankings)
```
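One way to summarise how closely two rankings agree is a Spearman rank correlation, where 1 means identical order and -1 means reversed order. This is a suggested extension rather than part of the workshop code, and the sketch below uses invented book-level means (`toy_means` is hypothetical) with the same `rank()` call as above.

```{r}
# Invented book-level means from two hypothetical methods
toy_means <- data.frame(
  book = c("A", "B", "C", "D"),
  method_1 = c(0.40, 0.10, -0.20, 0.25),
  method_2 = c(0.05, 0.03, -0.02, 0.01)
)

# Rank 1 = most positive book, as in the rankings above
rank_1 <- rank(-toy_means$method_1, ties.method = "first")
rank_2 <- rank(-toy_means$method_2, ties.method = "first")

# Spearman correlation between the two rankings
rho <- cor(rank_1, rank_2, method = "spearman")

print(data.frame(book = toy_means$book, rank_1, rank_2))
print(rho)
```

Here the two methods agree on the best and worst book but swap the middle two, giving a correlation of 0.8.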
## Optional: We can also use sentiment analysis to model emotions in text
## If instructed, search for 'End of Session'
tidytext is already installed in Noteable, so only run the cell below if you are working locally; otherwise, go directly to the next cell.
```{r}
install.packages("tidytext")
```
```{r}
library(tidytext)
```
Through tidytext, we can access a sentiment lexicon called NRC (Mohammad
& Turney, 2013): a dictionary of thousands of terms that have been
tagged as positive or negative, and also given an emotional tag: anger,
fear, joy, sadness, trust, disgust, surprise, or anticipation.

Below is one method for running this analysis:
```{r}
NRC_Amazon_Reviews <- Amazon_Reviews %>%
  mutate(
    Text_Processed = text %>%  # creates a new column
      str_to_lower() %>%       # lowercase
      # Optional: remove punctuation and numbers, which some sentiment
      # tools struggle with. Uncomment the next line to apply it.
      # str_replace_all("[^a-z\\s]", "") %>%
      str_squish()             # remove extra spaces and line breaks
  )
```
We have created a new column and preserved the raw data. For the method
we are using, we need to tokenise, that is, divide up the sentences into
lists of words:
```{r}
NRC_Amazon_Reviews <- NRC_Amazon_Reviews %>%
  mutate(tokens = tokenize_words(Text_Processed))
```
A note of caution: text preprocessing often includes removing stopwords,
but words such as 'no', 'yes', 'he', and 'here' are meaning-laden and
their removal will affect sentiment scores. We will leave them in, for now.
If you have any issues, reload the backup. By now, you should have a
dataframe with an additional column of "Text_Processed".
```{r}
colnames(NRC_Amazon_Reviews)
```
Load the lexicon. The first time you run this, you will be asked to
confirm the download:
```{r}
nrc <- get_sentiments("nrc")
```
Now create a reference list of NRC words grouped by sentiment category
```{r}
nrc_lookup <- split(nrc$word, nrc$sentiment)
```
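To see what `split()` produces, here is a minimal sketch on a tiny hand-made table. `toy_nrc` is a hypothetical stand-in with the same two columns, `word` and `sentiment`; its rows are invented and are not taken from the real lexicon.

```{r}
# Hypothetical mini-lexicon: one row per word-category pair
toy_nrc <- data.frame(
  word = c("happy", "happy", "afraid", "afraid", "trust"),
  sentiment = c("joy", "positive", "fear", "negative", "trust")
)

# One character vector of words per category, named by category
toy_lookup <- split(toy_nrc$word, toy_nrc$sentiment)
print(toy_lookup)

# A word listed under several categories is counted once in each of them
toy_tokens <- c("i", "am", "happy", "not", "afraid")
toy_counts <- sapply(toy_lookup, function(words) sum(toy_tokens %in% words))
print(toy_counts)
```

Because a single word can carry more than one tag, the category counts for a review can add up to more than the number of matched words.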
Apply it, and assign scores to each review. We are also going to nest
the results for each row in a new column called sentiment_scores, for
tidiness:
```{r}
NRC_Amazon_Reviews <- NRC_Amazon_Reviews %>%  # start with the reviews dataset that already contains token lists
  mutate(
    sentiment_scores = purrr::map(tokens, \(x) {  # for each review, work through its tokens
      tibble::as_tibble_row(
        sapply(nrc_lookup, \(words) sum(x %in% words))  # count how many tokens match each NRC sentiment category
      )
    })
  )
```
To make sense of this, unnest the sentiment scores.
```{r}
NRC_Amazon_Reviews <- NRC_Amazon_Reviews %>%
  unnest(sentiment_scores) %>%
  rename(
    nrc_anger = anger,
    nrc_anticipation = anticipation,
    nrc_disgust = disgust,
    nrc_fear = fear,
    nrc_joy = joy,
    nrc_sadness = sadness,
    nrc_surprise = surprise,
    nrc_trust = trust,
    nrc_positive = positive,
    nrc_negative = negative
  )
```
For ease of processing, we are going to convert the negative sentiment
to negative numbers
```{r}
NRC_Amazon_Reviews <- NRC_Amazon_Reviews %>%
  mutate(
    nrc_negative = -nrc_negative
  )
```
You can now aggregate these emotions according to different variables
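As a sketch of what such an aggregation could look like, the chunk below averages two emotion columns per book using base R's `aggregate()`. The data frame `toy_nrc_reviews` and its values are invented for illustration; with the real data you would swap in `NRC_Amazon_Reviews` and whichever `nrc_` columns and grouping variable interest you.

```{r}
# Invented emotion counts for three hypothetical reviews of two books
toy_nrc_reviews <- data.frame(
  meta_title = c("Book A", "Book A", "Book B"),
  nrc_joy = c(2, 4, 0),
  nrc_fear = c(0, 1, 3)
)

# Mean joy and fear counts per book
emotion_by_book <- aggregate(
  cbind(nrc_joy, nrc_fear) ~ meta_title,
  data = toy_nrc_reviews,
  FUN = mean
)

print(emotion_by_book)
```

The same pattern works for any grouping variable in the dataset, not only the book title.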
## End of Session
This workshop applied three sentiment analysis approaches to different
textual datasets. `sentimentr` provided context-sensitive polarity
scoring for Amazon reviews, `syuzhet` enabled sentence-level tracking
across the Gettysburg Address, and `quanteda` supported scalable
dictionary-based analysis of inaugural addresses. Comparing these
methods showed that sentiment results vary depending on the package and
level of analysis, highlighting the importance of choosing a method
appropriate to the research question.
We only have one session today, but if you would like more tailored
feedback or assistance, please book a data surgery with us.
<https://www.cdcs.ed.ac.uk/training/data-surgery>
# Further Reading
For more on the utility of sentiment analysis and its critiques, read
the following papers.
Devika, M. D., Sunitha, C., & Ganesh, A. (2016). Sentiment analysis: a
comparative study on different approaches. Procedia Computer Science,
87, 44-49.
Gonçalves, P., Araújo, M., Benevenuto, F., & Cha, M. (2013, October).
Comparing and combining sentiment analysis methods. In Proceedings of
the first ACM conference on Online social networks (pp. 27-38).
Shiha, M., & Ayvaz, S. (2017). The effects of emoji in sentiment
analysis. Int. J. Comput. Electr. Eng.(IJCEE.), 9(1), 360-369.
Wankhade, M., Rao, A. C. S., & Kulkarni, C. (2022). A survey on
sentiment analysis methods, applications, and challenges. Artificial
intelligence review, 55(7), 5731-5780.
# A Tutorial for Advanced Sentiment in Python
This is a useful tutorial for more advanced learning:
<https://github.com/DCS-training/SentimentAnalysis>
# References:
Hou, Y., Li, J., He, Z., Yan, A., Chen, X., & McAuley, J. (2024).
Bridging language and items for retrieval and recommendation. arXiv
Preprint arXiv:2403.03952.
Mohammad, S. M., & Turney, P. D. (2013). Crowdsourcing a Word–Emotion
Association Lexicon. Computational Intelligence, 29(3), 436–465.