Data Science Asked by One_Curious_User on February 24, 2021
Let’s say I have a dataframe where some of the columns have lists of strings as values. I would like to use ML algorithms on this dataframe.
In this case, I can:
So I ask:
I think a good starting point is what you have mentioned: for every distinct element that appears in the lists, create a feature for that element. If the element is present in the list for a data point, it is denoted as a 1 in that data point's feature vector; if it is not present, it is denoted by 0. This is very much a bag-of-words way to create features. You could limit the number of features by taking only the top k most frequently occurring elements, where you choose k. Another variant is to use counts instead of 0/1 indicators when an element can appear multiple times in a list, but it doesn't sound like that is the case here.
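As a rough illustration of the above, here is a minimal sketch in plain Python. The function names (`top_k_vocabulary`, `encode`) and the sample data are made up for this example, and I am assuming "top k" means the k elements that occur in the most rows; adjust that if you rank differently.

```python
from collections import Counter


def top_k_vocabulary(rows, k=None):
    """Return the k elements found in the most rows (all elements if k is None)."""
    # Count each element once per row, so the ranking reflects how many
    # data points contain it (an assumption about what "top k" means).
    row_freq = Counter(item for row in rows for item in set(row))
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    ranked = sorted(row_freq, key=lambda item: (-row_freq[item], item))
    return ranked if k is None else ranked[:k]


def encode(rows, vocabulary, binary=True):
    """Turn each list of strings into one feature vector over the vocabulary."""
    encoded = []
    for row in rows:
        counts = Counter(row)
        if binary:
            # 1 if the element is present in this row, 0 otherwise.
            encoded.append([1 if item in counts else 0 for item in vocabulary])
        else:
            # Count variant: how many times the element appears in this row.
            encoded.append([counts[item] for item in vocabulary])
    return encoded


# Hypothetical column values: one list of strings per data point.
rows = [
    ["red", "blue"],
    ["blue"],
    ["green", "blue", "red"],
    ["red", "red", "yellow"],
]

vocab = top_k_vocabulary(rows, k=2)
binary_features = encode(rows, vocab)
count_features = encode(rows, vocab, binary=False)
```

With a pandas dataframe you could pass something like `df["some_column"].tolist()` as `rows` (the column name is hypothetical). If you already use scikit-learn, `MultiLabelBinarizer` produces the same kind of 0/1 encoding.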
Answered by Wes on February 24, 2021