Cross Validated Asked on December 11, 2021
I have a dataset in which each person is assigned the descriptive statistics of the geographical zone they belong to. For example, for a given zone I know the level of education, income, interests and other information, but only at an aggregate level, so all the people in a specific zone share the same statistical attributes.
I want to use this kind of dataset to train a binary classifier. Is it possible to achieve good performance with such data? Are there specific techniques designed for zone-level statistical data, as opposed to individual-level data, when the people belong to different geographical zones?
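To make the setup concrete, here is a minimal sketch of how such a dataset might be assembled with pandas. The table names, column names and values (`people`, `zones`, `zone_id`, `mean_income`, `share_higher_education`) are all hypothetical and only illustrate the structure, not the real data.

```python
import pandas as pd

# Hypothetical person-level table: one row per person, with the zone
# the person belongs to and the binary target.
people = pd.DataFrame({
    "person_id": [1, 2, 3, 4, 5],
    "zone_id": ["A", "A", "B", "B", "C"],
    "label": [0, 1, 0, 0, 1],
})

# Hypothetical zone-level table: one row per zone, aggregate statistics only.
zones = pd.DataFrame({
    "zone_id": ["A", "B", "C"],
    "mean_income": [30000.0, 45000.0, 52000.0],
    "share_higher_education": [0.20, 0.35, 0.50],
})

# Attach the zone statistics to every person in that zone.
# validate="many_to_one" raises if a zone appears more than once in `zones`.
dataset = people.merge(zones, on="zone_id", how="left", validate="many_to_one")
print(dataset)
```

After the merge, every person in the same zone has identical feature values; only the label differs from person to person.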
The main problem I have faced with this approach is the following:
I tried logistic regression, but it performs poorly: the area under the ROC curve (AUC) is around 0.65. The dataset is imbalanced, but I don't think that is the cause, since models I have built in the past on imbalanced data performed quite well. I therefore assume the crucial factor is the kind of data I assigned to each person. I don't have access to individual-level data, so I can only use geographical/statistical data.
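For reference, the experiment can be sketched roughly as follows with scikit-learn. This is not the real data: the zones, features, coefficients and noise level are all synthetic assumptions, chosen only to mimic a case where the outcome depends partly on zone statistics and partly on individual factors that are not observed.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic assumption: 50 zones, 3 aggregate features per zone.
n_zones, n_people = 50, 20000
zone_features = rng.normal(size=(n_zones, 3))
zone_of_person = rng.integers(0, n_zones, size=n_people)

# Synthetic assumption: the outcome depends on the zone statistics plus an
# unobserved individual component that the classifier never sees.
zone_effect = zone_features @ np.array([0.8, -0.5, 0.3])
individual_effect = rng.normal(scale=2.0, size=n_people)
logit = -2.0 + zone_effect[zone_of_person] + individual_effect
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# Every person receives the feature vector of their zone.
X = zone_features[zone_of_person]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# class_weight="balanced" is one common way to handle an imbalanced target.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X_train, y_train)

scores = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, scores)

# People in the same zone have identical features, hence identical scores.
n_distinct_scores = len(np.unique(np.round(scores, 12)))
print(f"AUC: {auc:.3f}, distinct predicted scores: {n_distinct_scores}")
```

In this sketch the model can produce at most one distinct score per zone, so it can only rank zones against each other, never people within the same zone. Whether that is what limits the AUC on the real data is an assumption that would need to be checked.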