Data Science Asked by Samir Ahmane on February 11, 2021
I have two datasets from two different texts, each representing lexical density as a proportion based on a corpus. Both datasets are shown in the images below. Now, let's suppose I want to know which text has more uncommon vocabulary. How should I proceed? What statistics should I use? Should it be a Student's t-test or a Wilcoxon signed-rank test? I'm lost on this one, and I don't want to apply inference blindly. I am using the Python library wordfreq to get word frequency data.
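To make the comparison concrete, here is a minimal sketch of one possible approach, not a definitive answer. It assumes each text is reduced to a list of per-word frequency scores, for example Zipf-scale values where lower means rarer, and that the words of the two texts are not paired. The Wilcoxon signed-rank test needs paired observations, so the sketch uses its unpaired counterpart, the Mann-Whitney U (rank-sum) test, with a normal approximation and no continuity correction. The names `average_ranks`, `mann_whitney_u`, and `toy_zipf` are hypothetical, and the scores in `toy_zipf` are made-up placeholders. With wordfreq installed, the lookup could instead be `lambda w: zipf_frequency(w, "en")`.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for x, two-sided p-value) using a normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    # U for x counts pairs (xi, yj) with xi > yj, ties counted as one half.
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Variance of U with the standard correction for tied values.
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0  # every value is identical, so there is no evidence
    z = (u1 - mean_u) / sqrt(var_u)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u1, p_value


# Made-up placeholder scores standing in for a real frequency lookup.
toy_zipf = {"the": 7.0, "cat": 5.0, "sat": 4.5, "ephemeral": 3.0,
            "quixotic": 2.5, "mat": 4.0, "obfuscate": 2.0}

text_a = ["ephemeral", "quixotic", "obfuscate", "cat"]
text_b = ["the", "cat", "sat", "mat"]

scores_a = [toy_zipf.get(w, 0.0) for w in text_a]
scores_b = [toy_zipf.get(w, 0.0) for w in text_b]

u_a, p = mann_whitney_u(scores_a, scores_b)
# u_a / (len(a) * len(b)) estimates P(word from A is more common than word from B).
print("U for text A:", u_a, "share:", u_a / (len(scores_a) * len(scores_b)))
print("two-sided p-value:", p)
```

A share well below 0.5 would suggest that text A tends to use rarer words than text B. For small samples an exact test is preferable to the normal approximation used here, and word tokens within one text are not truly independent, so the p-value should be read with caution.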