Data Science Asked on February 27, 2021
I have used Vader, a sentiment analysis tool for social media, on a database of movie reviews. These two confusion matrices differ in the vader.py algorithm used: the first one is from nltk.
The second one is derived from Vader's original code on GitHub and includes fixes to negation words, etc.
I was wondering how I could properly compare the two, as I'm not really able to read them. It seems there is not a big difference between them, and I don't understand what the sources of the errors could be here.
First, about interpreting these confusion matrices: the sum of every row is 1, which means that every value is a conditional probability p(predicted label | true label), i.e. the probability that an instance with a given true label is assigned a particular predicted label. Example: the top-left cell in both matrices is 0.01, which means that when the true label is 5, the probability that the system predicts label 1 is 1%.
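As a minimal sketch of this interpretation (the labels and predictions below are made up, not from your data), scikit-learn can produce such a row-normalized matrix directly:

```python
# Hypothetical 5-star ratings; normalize="true" divides each row by its sum,
# so row i gives the distribution of predicted labels when the true label is i,
# i.e. p(predicted label | true label).
from sklearn.metrics import confusion_matrix

y_true = [5, 5, 4, 3, 1, 2, 5, 4, 1, 3]   # hypothetical true ratings
y_pred = [4, 5, 4, 2, 1, 1, 5, 3, 2, 3]   # hypothetical predicted ratings

cm = confusion_matrix(y_true, y_pred, labels=[1, 2, 3, 4, 5], normalize="true")
print(cm.round(2))
```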
The two confusion matrices show the results of the predictions by two different systems. This means that, in general, neither one is simply correct while the other makes errors; they are just two different ways of predicting labels, which may lead to different types of errors.
In order to quantitatively summarize and compare the performance of the two systems, confusion matrices are too detailed. Typically one would use a single evaluation measure, for instance micro or macro F-score (classification evaluation) or Mean Absolute Error (regression evaluation), as in the sketch below.
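A short sketch of such a comparison, again with hypothetical predictions standing in for the outputs of your two Vader variants:

```python
# Compare two systems on the same gold labels with single-number metrics
# instead of reading the full confusion matrices.
from sklearn.metrics import f1_score, mean_absolute_error

y_true      = [5, 5, 4, 3, 1, 2, 5, 4, 1, 3]   # hypothetical gold ratings
y_pred_sys1 = [4, 5, 4, 2, 1, 1, 5, 3, 2, 3]   # e.g. nltk Vader (made up)
y_pred_sys2 = [5, 5, 3, 3, 1, 2, 4, 4, 1, 2]   # e.g. patched Vader (made up)

for name, y_pred in [("system 1", y_pred_sys1), ("system 2", y_pred_sys2)]:
    print(name,
          "macro F1 =", round(f1_score(y_true, y_pred, average="macro"), 3),
          "micro F1 =", round(f1_score(y_true, y_pred, average="micro"), 3),
          "MAE =", round(mean_absolute_error(y_true, y_pred), 3))
```

Whichever measure you pick, use the same one for both systems; MAE is a reasonable choice here because the labels are ordinal ratings, so predicting 4 instead of 5 should count as a smaller error than predicting 1.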
Correct answer by Erwan on February 27, 2021