Data Science Asked by rohan23 on March 24, 2021
Let’s say I’m building an app like Uber and I want to predict the user’s most likely destination based on the user’s past history, current latitude/longitude, and time/date.
Here is the proposed architecture –
Let’s say I have a pre-trained model hosted as a service. The part I’m struggling with is this: given a RiderID, how do I fetch the user’s features from the database in real time so the prediction service (an XGBoost model) can use them? I’m guessing a lookup in a SQL database will take too long, considering I have 1M+ users and rides.
Thanks in advance!
I think the return on your model most likely won’t be worth the effort needed to make it generalize well enough.
You are probably better off storing the user’s current location and querying for the most likely destination from there, or simply looking at the popular trip destinations from that location. A database with a proper index should be able to handle that.
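As a rough illustration of the indexed lookup suggested above, here is a minimal sketch using Python's built-in `sqlite3`. The table layout, the `pickup_cell` column (a coarse grid cell standing in for latitude/longitude), and the sample rows are all assumptions made for the example, not part of the original question.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE rides (
        rider_id    INTEGER NOT NULL,
        pickup_cell TEXT    NOT NULL,  -- hypothetical grid cell for the pickup point
        destination TEXT    NOT NULL
    )
    """
)
# Composite index for per-rider lookups, plus one for location-only lookups.
conn.execute("CREATE INDEX idx_rider_cell ON rides (rider_id, pickup_cell)")
conn.execute("CREATE INDEX idx_cell ON rides (pickup_cell)")

# Made-up sample history.
conn.executemany(
    "INSERT INTO rides VALUES (?, ?, ?)",
    [
        (1, "cell_a", "office"),
        (1, "cell_a", "office"),
        (1, "cell_a", "gym"),
        (1, "cell_b", "home"),
        (2, "cell_a", "airport"),
    ],
)


def most_likely_destination(rider_id, pickup_cell):
    """Most frequent past destination for this rider from this cell.

    Falls back to the most popular destination from the cell across all
    riders, and returns None if the cell has no rides at all.
    """
    row = conn.execute(
        """
        SELECT destination FROM rides
        WHERE rider_id = ? AND pickup_cell = ?
        GROUP BY destination
        ORDER BY COUNT(*) DESC, destination
        LIMIT 1
        """,
        (rider_id, pickup_cell),
    ).fetchone()
    if row is None:
        row = conn.execute(
            """
            SELECT destination FROM rides
            WHERE pickup_cell = ?
            GROUP BY destination
            ORDER BY COUNT(*) DESC, destination
            LIMIT 1
            """,
            (pickup_cell,),
        ).fetchone()
    return row[0] if row else None


print(most_likely_destination(1, "cell_a"))
```

With the composite index in place, each query only touches the rows for one rider and one cell, so the cost depends on that rider's history rather than on the total number of rides.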
Answered by The Lyrist on March 24, 2021
It sounds like you are looking for a fast, horizontally scalable database. I would advise using a column-family database rather than a relational database for this kind of data. We use Google Bigtable (BT) for a similar use case. On a 3-node BT cluster with SSD disks we hold over 300M records, which are fetched by key in 6 ms at the 99th percentile under a load of 1,000 requests per second. If the load changes, you can add or remove nodes while the cluster is running. An open-source alternative such as Cassandra has been even faster in our experience. In your case, the database key would be RiderID.
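To make the key-based lookup concrete, here is a minimal sketch of how a prediction service might combine stored rider features with the request-time inputs. A plain dictionary stands in for the key-value store (Bigtable, Cassandra, or similar); the feature names, default values, and function names are hypothetical and chosen only for illustration.

```python
from datetime import datetime
from typing import Dict, List

# Stand-in for a key-value store keyed by RiderID. In a real deployment this
# would be a single-row read by key against the database.
FEATURE_STORE: Dict[str, Dict[str, float]] = {
    "rider_42": {"home_lat": 40.71, "home_lon": -74.00, "rides_last_30d": 12.0},
}

# Hypothetical defaults for riders with no stored history.
DEFAULT_FEATURES: Dict[str, float] = {
    "home_lat": 0.0,
    "home_lon": 0.0,
    "rides_last_30d": 0.0,
}


def get_rider_features(rider_id: str) -> Dict[str, float]:
    """Fetch precomputed features for one rider with a single key lookup."""
    return FEATURE_STORE.get(rider_id, DEFAULT_FEATURES)


def build_feature_vector(
    rider_id: str, lat: float, lon: float, when: datetime
) -> List[float]:
    """Join stored rider features with the current location and time."""
    stored = get_rider_features(rider_id)
    return [
        stored["home_lat"],
        stored["home_lon"],
        stored["rides_last_30d"],
        lat,
        lon,
        float(when.hour),
        float(when.weekday()),
    ]


vector = build_feature_vector("rider_42", 40.75, -73.99, datetime(2021, 3, 24, 8, 30))
print(vector)
```

The resulting list is what would be passed to the model for scoring. The point of the design is that the historical features are precomputed offline and written under the RiderID key, so the online path is one read plus a few cheap request-time features.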
Answered by BKersbergen on March 24, 2021