How to combine human-labelled data with user behavior data?

Question

I am working on a supervised learning problem for a web-search task, where I have access to a relatively small set of human-labeled examples and lots of user-behavior data.
The user-behavior data is biased (presentation bias, position bias, etc.), so its distribution is likely to differ from that of the human-labeled data.
I am planning to use both to train a neural network model.
How should I combine the two datasets?

Brian Spiering · Answer

That is a common scenario in learning-to-rank problems. One heuristic is to model the explicit (human-labeled) and implicit (user-behavior) features separately, then combine the two feature groups with a learned weight that sets their relative contribution to the final score. "Improving Web Search Ranking by Incorporating User Behavior Information" by Agichtein et al. goes into greater detail.
RankNet takes this approach using a neural network.
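
To make the idea concrete, here is a minimal sketch in NumPy, not the method from the paper. It assumes two linear scorers (one per feature group), a single learned mixing weight `sigmoid(a)`, and a RankNet-style pairwise logistic loss on preference pairs. The function names, the linear scorers and the hyperparameters are all assumptions made for illustration; in practice each scorer would be a small neural network and the pairs would come from your human labels.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def combined_scores(X_exp, X_imp, w_exp, w_imp, a):
    """Final score: learned mix of the explicit and implicit group scores."""
    g = sigmoid(a)  # weight on the explicit group, always in (0, 1)
    return g * (X_exp @ w_exp) + (1.0 - g) * (X_imp @ w_imp)


def train_combined_ranker(X_exp, X_imp, pairs, lr=0.5, epochs=500, seed=0):
    """Fit both scorers and the mixing weight by gradient descent.

    X_exp, X_imp: (n_items, d) feature matrices for the two groups.
    pairs: list of (i, j), meaning item i should rank above item j.
    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    w_exp = rng.normal(scale=0.01, size=X_exp.shape[1])
    w_imp = rng.normal(scale=0.01, size=X_imp.shape[1])
    a = 0.0  # mixing logit; sigmoid(0) = 0.5 starts with equal weights
    hi, lo = np.asarray(pairs).T  # preferred / non-preferred item indices

    for _ in range(epochs):
        g = sigmoid(a)
        s_exp, s_imp = X_exp @ w_exp, X_imp @ w_imp
        s = g * s_exp + (1.0 - g) * s_imp

        # Pairwise logistic loss: log(1 + exp(-(s_hi - s_lo))).
        # d is its derivative with respect to (s_hi - s_lo).
        d = -sigmoid(-(s[hi] - s[lo]))

        # Accumulate the gradient with respect to each item's score.
        grad_s = np.zeros_like(s)
        np.add.at(grad_s, hi, d)
        np.add.at(grad_s, lo, -d)
        grad_s /= len(hi)

        # Chain rule through the mix into each scorer and the mixing logit.
        grad_w_exp = g * (X_exp.T @ grad_s)
        grad_w_imp = (1.0 - g) * (X_imp.T @ grad_s)
        grad_a = g * (1.0 - g) * (grad_s @ (s_exp - s_imp))

        w_exp -= lr * grad_w_exp
        w_imp -= lr * grad_w_imp
        a -= lr * grad_a

    return w_exp, w_imp, a
```

After training, `sigmoid(a)` tells you how much the model leans on the explicit group relative to the implicit one. A single scalar mix is the simplest choice; a per-query or per-feature weighting would be a natural extension, but that goes beyond what is described above.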
