Data Science Asked on July 28, 2021
I am working on a project where I would like to be able to specifically analyze the level of subjectivity in a given text phrase using machine learning. Essentially, I would like to be able to classify a given text based on whether it is written from a subjective or objective point of view. Ideally, I would like to be able to classify on the sentence level (i.e. assign each sentence a classification of being written from a subjective or objective perspective). This will most likely be applied to short form news articles and/or tweets.
My question is: what sort of NLP methodologies are best for this use case?
I have explored a few papers on the topic and it seems that this falls under the umbrella of sentiment analysis. Many papers report that machine learning methods such as SVM and neural network architectures perform well on the problem, given features such as a lexicon containing mappings of words to sentiment and POS tags. I have briefly tested out the TextBlob library’s subjectivity score with some mixed results.
As I am new to NLP, I am looking for suggestions as to which methodology to apply. Specifically, I would like to ask:
Is the TextBlob library well suited to this task (are there any other libraries you can recommend)? I have looked into the implementation and it seems to be computing subjectivity based on the presence of modifiers and the number of words belonging to certain POS tags.
What sort of features would you recommend utilizing to train a model for this task? As I mentioned, most papers seem to use POS tag counts and the sum of the polarity scores of the tokens in the text to determine subjectivity, but I wonder if there are other methods that experienced practitioners might think to apply.
Are there any architectures that are particularly well suited to this task? The papers I have read report the best results for subjectivity classification using SVM and perceptron architectures but I am interested in exploring other architectures that may be able to analyze the text differently. I have looked briefly into sequence-to-sequence and attention models but based on what I have read they seem to be suited for a different range of tasks than this one – though intuitively I suspect that an attention model may be able to extract some latent representation of the text which could assist in subjectivity classification.
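To make the feature-based setup concrete, here is a minimal sketch of the kind of baseline I have in mind: a handful of lexicon-style features per sentence fed to a perceptron. Everything in it is an assumption for illustration only. The word lists, the function names (`extract_features`, `train_perceptron`, `predict`) and the six training sentences are made up, and the intensifier and first-person word lists are crude stand-ins for real POS tag counts, which would need an actual tagger. It is not the method from either paper, and it is not how TextBlob works internally.

```python
import re

# Hypothetical toy lexicon: word -> subjectivity strength in [0, 1].
# A real system would load a published subjectivity lexicon instead.
SUBJECTIVITY_LEXICON = {
    "terrible": 1.0, "wonderful": 1.0, "awful": 1.0, "amazing": 1.0,
    "love": 0.9, "hate": 0.9, "worst": 0.9, "best": 0.8,
    "think": 0.6, "believe": 0.6,
}
# Crude stand-ins for POS-based features (no tagger is used here).
INTENSIFIERS = {"very", "really", "extremely", "so", "absolutely"}
FIRST_PERSON = {"i", "my", "me", "we", "our"}


def extract_features(sentence):
    """Map a sentence to [bias, lexicon score, intensifier rate,
    first-person rate, exclamation flag]."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    n = max(len(tokens), 1)  # avoid division by zero on empty input
    lexicon_score = sum(SUBJECTIVITY_LEXICON.get(t, 0.0) for t in tokens) / n
    intensifier_rate = sum(t in INTENSIFIERS for t in tokens) / n
    first_person_rate = sum(t in FIRST_PERSON for t in tokens) / n
    has_exclamation = 1.0 if "!" in sentence else 0.0
    return [1.0, lexicon_score, intensifier_rate,
            first_person_rate, has_exclamation]


def predict(weights, sentence):
    """Return 1 (subjective) if the weighted feature sum is positive, else 0."""
    features = extract_features(sentence)
    score = sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def train_perceptron(examples, epochs=20):
    """Classic perceptron rule: adjust weights only on misclassified examples."""
    weights = [0.0] * 5
    for _ in range(epochs):
        for sentence, label in examples:
            error = label - predict(weights, sentence)
            if error != 0:
                features = extract_features(sentence)
                weights = [w + error * x for w, x in zip(weights, features)]
    return weights


# Made-up training sentences: 1 = subjective, 0 = objective.
TRAINING_DATA = [
    ("I really love this phone!", 1),
    ("The meeting starts at noon.", 0),
    ("That was the worst movie ever.", 1),
    ("The company reported revenue of two billion dollars.", 0),
    ("I think the new policy is terrible.", 1),
    ("The bridge opened in 1995.", 0),
]

weights = train_perceptron(TRAINING_DATA)
for text in ["I believe this is amazing", "Trains depart every hour."]:
    print(text, "->", "subjective" if predict(weights, text) else "objective")
```

With so few features and examples this only shows the shape of the pipeline. The same feature vectors could be handed to an SVM instead of the perceptron, and the hand-written word lists could be replaced by real POS tag counts and a proper lexicon.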
I would really appreciate any help to set me off in the right direction. I hope that my questions are clear. For reference I’ve included links to some of the papers I have been looking at below.
https://www.sciencedirect.com/science/article/pii/S0950705114002068
https://www.researchgate.net/publication/322512000_Subjectivity_Detection_in_Nuclear_Energy_Tweets