Why don't companies use on-device ML to serve ads to protect privacy?

Question

E.g.: the ad inventory can be stored in the cloud. Using on-device ML, a process on the device then matches ads to the user, so that the best ad is served to the right user.

Are there any limitations to this? (I am planning to make an app that serves ads using TFLite and CoreML so that user privacy is protected.)
Can TFLite models be updated with user data on-device, as with Apple's CoreML, so that the models stay relevant to user behavior without compromising privacy?

Tim · Answer

There are several problems:

The problem with serving ads on the server side is that the server processes information about the users, which can interfere with their privacy or, in some cases, even break privacy laws.
The suggestion to move the algorithms to the client side (e.g. a mobile app) solves this issue but creates the opposite problem: now you need to upload all the data about the ads to the client-side app. This may be problematic because it could expose your company's corporate secrets (who your customers are, what kinds of ads they want to show, which groups the ads target). Your customers, who paid for the ads, may not like it, and your competitors could abuse it to gain an advantage over you.

Another problem is that serving ads is about matching users to ads. On the client side you can only match ads to the one user of that device. Imagine that your customer pays you for showing 10K ads to users: if you assign ads on the user side, you have no direct control over showing exactly 10K ads, so you could lose money by showing ads that were not paid for. Sure, there are ways around this, but it is a problem that needs consideration.
In real life this gets even more complicated, because ads are usually served through bidding (whoever pays more gets their ad shown): different customers bid to have their ads shown, and an algorithm decides on the allocation that maximizes the gain from showing the ads. In that case, not only do you lose control over the global gain, since each client makes only local optimization decisions, but you would also need to send the bids to the client-side app, which again shares sensitive information about your customers (how much they pay for the ads).
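To make the budget and bidding point concrete, here is a minimal sketch of a centralized allocation. Everything in it is an assumption for illustration: the `Campaign` fields, the `run_auction` name, and the simple "highest bid with remaining budget wins" rule are not taken from any real ad system.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    name: str
    bid: float       # price offered per impression (hypothetical units)
    remaining: int   # impressions the customer has paid for and not yet used


def run_auction(campaigns):
    """Return the name of the winning campaign, or None if nothing is eligible.

    The winner is the highest bidder that still has paid impressions left.
    Both the bids and the shared `remaining` counter live in one place.
    """
    eligible = [c for c in campaigns if c.remaining > 0]
    if not eligible:
        return None
    winner = max(eligible, key=lambda c: c.bid)
    winner.remaining -= 1  # global bookkeeping: one paid impression consumed
    return winner.name


campaigns = [Campaign("A", bid=2.0, remaining=2), Campaign("B", bid=1.0, remaining=5)]
print([run_auction(campaigns) for _ in range(4)])
```

If every device ran this independently, each one would need its own copy of the bids, and no device would know the true value of `remaining`, which is exactly the loss of global control described above.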

The third thing is that even if the client-side app made the decision, it would probably still need to query the ads database to download the ad matched to the user. I guess you could reverse-engineer this to learn the characteristics of the user from the ads that the algorithm assigned to them, so in the end this might not protect privacy.
As an alternative, you could send all the ads to the user and store them in the app's database, but then the user would need to store potentially large amounts of your data just so that you can show them your ads. I doubt users would like this.

Finally, I'm not sure that ad-selling businesses are the ones most concerned about user privacy, i.e. whether they would care.

ncasas · Answer

Yes, this can be done.
Nevertheless, there are some aspects to be taken into account:

Information isolation: sensitive information should be kept as close as possible to its source, both for the user and for the ad provider. Therefore, you should not send raw user data to the server side, nor send the advertising company's information to the client side. (Note that other answers to this question seem to assume that moving the privacy-sensitive computation to the device implies that all the processing is done on the device, but this is not necessarily true.) You could compute the user-sensitive part on the device, e.g. a "user representation", and then send the result to the server side, where the rest of the computation and the ad assignment are done. The "user representation".

Device energy consumption: running a heavy computation on-device may incur increased battery consumption, which could lead to bad user perception and subsequent ad revenue decrease.
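The split described under information isolation could look roughly like the sketch below. It is only an illustration under loud assumptions: the hashed-count vector, the `DIM` size, and the `user_representation` / `choose_ad` names are all made up here, and the sketch does not show that such a vector is actually privacy-preserving. The on-device step is kept deliberately cheap, in line with the energy concern.

```python
import math
import zlib

DIM = 4  # size of the representation; an arbitrary choice for this sketch


def user_representation(raw_events):
    """Runs on the device: fold raw events into a small unit-length vector.

    Only this vector would leave the device, never the raw events.
    """
    vec = [0.0] * DIM
    for event in raw_events:
        # crc32 gives a stable bucket index (unlike Python's salted hash()).
        vec[zlib.crc32(event.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def choose_ad(user_vec, ads):
    """Runs on the server: `ads` maps ad id -> (targeting vector, bid).

    Bids and targeting data stay on the server side.
    """
    def score(item):
        _, (target, bid) = item
        return bid * sum(u * t for u, t in zip(user_vec, target))

    return max(ads.items(), key=score)[0]
```

The device would call `user_representation(...)` and upload the result; the server would call `choose_ad(...)` with its private ad table and return only the chosen ad.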

Finally, it is possible to update a TFLite model. Depending on how you use TFLite, you could host your model on your server, download it from the client if the timestamp of the on-device version is older than that of the latest available one, and then load the model file as usual. If you use Firebase ML Kit, it supports hosted models in a similar way.
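A minimal sketch of that timestamp check, assuming the server can report a timestamp for its latest model and that `download` is a caller-supplied function returning the model bytes (both are placeholders, not part of TFLite or Firebase):

```python
import os


def needs_update(local_path, remote_timestamp):
    """True if there is no local model or it is older than the server's copy."""
    if not os.path.exists(local_path):
        return True
    return os.path.getmtime(local_path) < remote_timestamp


def refresh_model(local_path, remote_timestamp, download):
    """Fetch a newer model if one exists; return True if the file was replaced."""
    if not needs_update(local_path, remote_timestamp):
        return False
    tmp_path = local_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(download())
    # Swap the file in atomically so a half-written model is never loaded.
    os.replace(tmp_path, local_path)
    # Stamp the file with the server's timestamp so the next check compares like with like.
    os.utime(local_path, (remote_timestamp, remote_timestamp))
    return True
```

After `refresh_model(...)` returns, the file at `local_path` can be loaded as usual, for example with `tf.lite.Interpreter(model_path=local_path)` in the Python API. Note that this replaces the model with a server-trained one; it does not train on user data on the device.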
