Data Science Asked by NeR0 on September 10, 2020
I have a dataset that I need to mine with association rule techniques. It has about 90 variables, many of which are ordinal. The problem is that the data is already coded with numbers instead of strings (e.g. bread = 4 instead of "bread"), and some numeric variables have been re-scaled into bins (e.g. 1 = "1%-10%").
What I have so far:
from apyori import apriori

# Convert dataframe to a list of rows, each row a list of strings
val_list = []
for row in range(1, 5530):
    val_list.append([str(data.values[row, column]) for column in range(0, 90)])
    print('Row ', row, ' ok')

apr = apriori(val_list, min_support=0.1, min_confidence=0.2, min_lift=2)
result = list(apr)
Still, this way I don’t get the feature names in the frequent "baskets" so it’s not much use, since I have baskets like [33, 1, 8, 8, 1, 1] with no idea what the numbers might be referring to. What can I do and/or how do I prepare the data for association rule mining?
Create a dictionary that contains the coded variables as keys and the item names as values.
So it would look like:
dicty = {4: "bread", 7: "milk", 9: "toothpaste"}
Constructing dictionaries in Python is easy if you already have the codes and names in a table or Excel spreadsheet:
dicty = {i:j for i,j in zip(coded_list,normal_list)}
where coded_list is the list of variables in numbers, and normal_list is the list of variables in their categorical names.
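If the code book lives in a spreadsheet, one way to get there is to export it as CSV and read it with the standard library. This is only a sketch: the column headers "code" and "name" and the inline sample table are made up for illustration, so swap in your own file and headers.

```python
import csv
import io

# Hypothetical lookup table. In practice this would be a CSV file exported
# from the spreadsheet, opened with open("your_file.csv", newline="").
lookup_csv = io.StringIO("code,name\n4,bread\n7,milk\n9,toothpaste\n")

# Each CSV row becomes one key/value pair: numeric code -> readable name.
reader = csv.DictReader(lookup_csv)
dicty = {int(row["code"]): row["name"] for row in reader}

print(dicty)
```

The codes are converted with int() so that they match numeric values in the data; if your data holds the codes as strings, drop the conversion.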
Once you have a dictionary you can simply convert them like this:
name = dicty[9]
and name should then hold "toothpaste".
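With around 90 coded columns, the same number probably means different things in different columns, so a single dictionary may not be enough. A possible extension, not something from the question itself, is to keep one dictionary per column and turn every cell into a "column=label" string before mining. The column names, code books and rows below are hypothetical placeholders.

```python
# Hypothetical code books: one dictionary per coded column.
codebooks = {
    "product": {4: "bread", 7: "milk", 9: "toothpaste"},
    "discount": {1: "1%-10%", 2: "11%-20%"},
}

# Hypothetical coded rows, e.g. obtained from a dataframe as a list of dicts.
rows = [
    {"product": 4, "discount": 1},
    {"product": 7, "discount": 2},
]

def label_row(row, codebooks):
    """Turn one coded row into a list of 'column=label' items."""
    items = []
    for column, code in row.items():
        # Fall back to the raw code when no label is known for it.
        label = codebooks.get(column, {}).get(code, code)
        items.append(f"{column}={label}")
    return items

transactions = [label_row(row, codebooks) for row in rows]
print(transactions)
```

A list like transactions can be passed to apriori in place of val_list, so the resulting itemsets carry the feature name along with the value.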
Answered by Amar Srivastava on September 10, 2020