Data Science Asked by user73749 on December 18, 2020
I’d like to ask about a case when we would like to predict the best class of some input variable, so that the probability of event will be maximized.
For example the advertisement type for a given customer, which will maximize the probability of purchase.
In collected data we have many various ad types, customer descriptors and informations if the purchase was made or not.
One solution which comes to my mind is to treat the ad type as an input variable, train regular probability model, then make predictions with all ad type configuration and then pick one giving the best estimation.
What are the other options?
If you want to predict the best class of input variable, why don't you take the input variable itself as the output? And say the probability or a boolean of whether the event is happening or not as a input. I guess with enough training data, it may predict right. But that might not be an ideal solution, since you already specify the model the best class for which the event happens, so your approach is a better one.
Answered by Srihari on December 18, 2020
As you are looking for a conditional distribution of a variable given another one, graphical models come to mind.
Answered by Yohan F on December 18, 2020
In your method, you would build a model which maps the input features (ad type, customer descriptors) to the output (purchase made = 0/1). You can use a model such as a logistic regression, or a decision tree (which can model the interaction effects between the ad type and the customer descriptors.) This seems to be reasonable and is one of the approaches commonly used for ad selection.
A few other options based on customer similarity:
Near-neighbour based approach: Based on the customer descriptors, identify similar neighbours of the current customer. Compute the probability of conversion for each ad type for all the neighbours put together, and pick the ad type which has the highest probability of conversion.
Cluster based approach: Form clusters of customers based on the customer descriptors. Map the current customer into one of the clusters, and pick the ad type which maximizes conversion probability for that cluster.
With respect to designing a system of ad selection using such a model, it is also important for the system to explore new regions of the space, so some small randomization of predictions will help here.
Answered by raghu on December 18, 2020
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP