TransWikia.com

Usage of KL divergence to improve BOW model

Data Science Asked by Balocre on June 2, 2021

For a university project I chose to do sentiment analysis on a Google Play Store reviews dataset. I have obtained decent results classifying the data using a bag-of-words (BOW) model and an ADALINE classifier.

I want to improve my model by incorporating bigrams relevant to the topic (Negative or Positive) into my feature set, and I have found this paper, which uses KL divergence to measure the relevance of unigrams/bigrams relative to a topic.

However, I have trouble understanding what C refers to in equation (2.2). Does it refer to the unique words associated with topic C, the set of documents on a topic, or the words in a document?
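To make the question concrete, here is a minimal sketch of one common way to score n-grams against a topic with KL divergence. It is not taken from the paper: it assumes C is the set of all documents labeled with one topic, and it scores each term by its pointwise contribution p(t|C) * log(p(t|C) / p(t)), where p(t) is estimated from the whole corpus. The function names (`kl_scores`, `term_counts`), the whitespace tokenizer, the add-alpha smoothing, and the toy reviews are all illustrative assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def term_counts(docs, n):
    """Count n-gram occurrences over a list of raw text documents."""
    counts = Counter()
    for doc in docs:
        counts.update(ngrams(doc.lower().split(), n))
    return counts


def kl_scores(class_docs, all_docs, n=2, alpha=1.0):
    """Pointwise KL contribution of each n-gram for one class.

    Assumption: C is read as "all documents labeled with the class",
    and the background distribution comes from the whole corpus.
    Add-alpha smoothing keeps every probability strictly positive.
    """
    class_counts = term_counts(class_docs, n)
    background_counts = term_counts(all_docs, n)
    vocab = set(class_counts) | set(background_counts)
    class_total = sum(class_counts.values()) + alpha * len(vocab)
    background_total = sum(background_counts.values()) + alpha * len(vocab)

    scores = {}
    for term in vocab:
        p_class = (class_counts[term] + alpha) / class_total
        p_background = (background_counts[term] + alpha) / background_total
        scores[term] = p_class * math.log(p_class / p_background)
    return scores


# Made-up toy reviews, not taken from the Play Store dataset.
positive = ["great app love it", "love it great features"]
negative = ["bad app crashes", "crashes a lot bad app"]

scores = kl_scores(positive, positive + negative, n=2)
top_bigrams = sorted(scores, key=scores.get, reverse=True)[:3]
print(top_bigrams)
```

Under this reading, the highest-scoring bigrams for each class would be the candidates to add to the BOW feature set. Summing the scores over the vocabulary gives the KL divergence between the class distribution and the background distribution. If the paper defines C differently, only the counting step would change.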
