Cross Validated Asked by henry50618 on November 21, 2021
I have three data sets, each including a subset of some features.
For example, dataset 1 has features A and B, dataset 2 has features B and C, and dataset 3 has features A and C.
I would like to find the overall feature importance of A, B, and C. Is the following procedure reasonable?
(1) Find the feature importance in each of the three datasets separately (using dominance analysis or PLS-SEM):
dataset1 -> A : 50%, B : 50%
dataset2 -> B : 25%, C : 75%
dataset3 -> A : 40%, C : 60%
(2) Weight each feature importance by the sample size of its dataset:
sample size of dataset 1 is 4000
sample size of dataset 2 is 5000
sample size of dataset 3 is 6000
feature importance of A = 50% × (4000/15000) + 40% × (6000/15000) = 29.3%
feature importance of B = 50% × (4000/15000) + 25% × (5000/15000) = 21.7%
feature importance of C = 75% × (5000/15000) + 60% × (6000/15000) = 49.0%
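To make the arithmetic in step (2) easy to check, here is a minimal Python sketch of the same calculation. The numbers are the ones given above; the function name `pooled_importance` and the dictionary layout are only illustrative choices, not part of any library. It reproduces the weighting as I described it and says nothing about whether the procedure is statistically sound.

```python
# Per-dataset importances (as fractions) and sample sizes, copied from the question.
importances = {
    "dataset1": {"A": 0.50, "B": 0.50},
    "dataset2": {"B": 0.25, "C": 0.75},
    "dataset3": {"A": 0.40, "C": 0.60},
}
n_samples = {"dataset1": 4000, "dataset2": 5000, "dataset3": 6000}


def pooled_importance(importances, n_samples):
    """Sum each feature's importance, weighted by its dataset's share of all samples."""
    total = sum(n_samples.values())  # 15000 in this example
    pooled = {}
    for name, scores in importances.items():
        weight = n_samples[name] / total
        for feature, score in scores.items():
            # A feature missing from a dataset simply contributes nothing for it.
            pooled[feature] = pooled.get(feature, 0.0) + weight * score
    return pooled


pooled = pooled_importance(importances, n_samples)
for feature in sorted(pooled):
    print(f"{feature}: {pooled[feature]:.1%}")
```

Each feature appears in only two of the three datasets, so the weights applied to any single feature add up to less than 1. The three pooled values still total 100% here because the importances within each dataset sum to 100%.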
I am not sure whether this procedure is reasonable. Can anyone give me some advice?
I would really appreciate it.