Data Science Asked on May 3, 2021
I am quite new to predictive modelling but have knowledge of GIS, R, Python, SQL, etc.
I am currently doing a project at work, trying to predict when parking spaces will be available based on data received from a mobile phone application.
I have two SQL tables: ParkingTickets and ParkingAreas.
The main assumption is that only app users can park in the 18 parking lots (both on-street and off-street lots; no multi-storey or underground). I do not take into account user preference, weather, events, etc. This is purely presence and absence based on the data at hand. I have researched techniques and was looking into using a birth/death model, but I am struggling to find examples of it in use.
Any help or pointers on models to use would be brilliant!
Sample data:
ParkingTickets:
ParkingTicketId  ParkingAreaId  Latitude     Longitude     Date              DurationInMinutes  ExpiryDate        Day
60465            302            42.56246869  -70.91313754  2014/03/07 16:36  5                  2014/03/07 16:41  Friday
60466            302            42.57139883  -70.91906364  2014/03/07 16:36  23                 2014/03/07 16:59  Friday
60467            302            42.54419925  -70.9417496   2014/03/07 16:36  24                 2014/03/07 17:00  Friday
60472            302            42.57576595  -70.92876607  2014/03/07 16:36  16                 2014/03/07 16:52  Friday
60477            302            42.55573294  -70.92912634  2014/03/07 16:36  9                  2014/03/07 16:45  Friday
60479            302            42.55711998  -70.91200458  2014/03/07 16:36  19                 2014/03/07 16:55  Friday
60480            302            42.58008043  -70.91559081  2014/03/07 16:37  5                  2014/03/07 16:42  Friday
60485            302            42.55161223  -70.9240808   2014/03/07 16:37  21                 2014/03/07 16:58  Friday
60492            302            42.58437849  -70.92764527  2014/03/07 16:37  6                  2014/03/07 16:43  Friday
ParkingAreas:
ParkingAreaId  MaxSpaces
302            8
304            50
306            95
308            30
If this data is coming in real time then you don't need a model: for each parking area, count the tickets that are active now, i.e. at the moment you need to provide a prediction (Date on or before now and ExpiryDate greater than now), and subtract that count from the area's MaxSpaces.
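As a rough illustration of that count, here is a minimal Python sketch using only the standard library. The function name `free_spaces`, the in-memory list of tuples, and the clamp at zero are my own assumptions for the example; in practice the same count would likely be a single SQL query against ParkingTickets.

```python
from datetime import datetime

FMT = "%Y/%m/%d %H:%M"  # timestamp format used in the sample data

# A few rows taken from the sample data: (ParkingAreaId, Date, ExpiryDate)
tickets = [
    (302, "2014/03/07 16:36", "2014/03/07 16:41"),
    (302, "2014/03/07 16:36", "2014/03/07 16:59"),
    (302, "2014/03/07 16:36", "2014/03/07 17:00"),
    (302, "2014/03/07 16:37", "2014/03/07 16:42"),
]
max_spaces = {302: 8}  # ParkingAreaId -> MaxSpaces


def free_spaces(area_id, now, tickets, max_spaces):
    """Capacity minus the tickets that have started and not yet expired at `now`."""
    active = sum(
        1
        for area, start, expiry in tickets
        if area == area_id
        and datetime.strptime(start, FMT) <= now < datetime.strptime(expiry, FMT)
    )
    # Assumption: clamp at zero in case more tickets are active than MaxSpaces
    return max(max_spaces[area_id] - active, 0)


now = datetime(2014, 3, 7, 16, 50)
print(free_spaces(302, now, tickets, max_spaces))
```

The clamp matters with data like the sample above, where nine tickets are active at once in an area whose MaxSpaces is 8.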
If the data is not coming in real time, then you could use time of day and day of week as predictors. You might even want to combine them into interaction terms (for example, Friday at 17:00). You would also need to decide how often you want to call your model, then group your data into intervals of that length so that each row records how many tickets were active during that interval; this count defines your target variable (what you're trying to predict).
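The grouping step could look roughly like the sketch below, again in plain Python. The 15-minute interval, the helper names (`floor_to_bucket`, `build_rows`) and the output column names are assumptions chosen for illustration, and a ticket is counted in every interval it overlaps.

```python
from collections import Counter
from datetime import datetime, timedelta

FMT = "%Y/%m/%d %H:%M"
BUCKET_MINUTES = 15  # assumed prediction interval; must divide 60

# (Date, ExpiryDate) pairs for one parking area, taken from the sample data
tickets = [
    ("2014/03/07 16:36", "2014/03/07 16:41"),
    ("2014/03/07 16:36", "2014/03/07 16:59"),
    ("2014/03/07 16:36", "2014/03/07 17:00"),
    ("2014/03/07 16:37", "2014/03/07 16:42"),
]


def floor_to_bucket(ts):
    """Round a timestamp down to the start of its interval."""
    minute = (ts.minute // BUCKET_MINUTES) * BUCKET_MINUTES
    return ts.replace(minute=minute, second=0, microsecond=0)


def build_rows(tickets):
    """One row per interval: predictors plus the active-ticket count (target)."""
    active = Counter()
    for start_s, expiry_s in tickets:
        bucket = floor_to_bucket(datetime.strptime(start_s, FMT))
        expiry = datetime.strptime(expiry_s, FMT)
        # Count the ticket in every interval that begins before it expires
        while bucket < expiry:
            active[bucket] += 1
            bucket += timedelta(minutes=BUCKET_MINUTES)
    return [
        {
            "bucket_start": b,
            "hour": b.hour,
            "day_of_week": b.weekday(),  # Monday = 0
            "day_x_hour": f"{b.weekday()}_{b.hour}",  # simple interaction term
            "active_tickets": active[b],
        }
        for b in sorted(active)
    ]


for row in build_rows(tickets):
    print(row)
```

Note that intervals with no active tickets never appear in this output; for real training data you would probably want to add them back with a count of zero so the model also sees quiet periods.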
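By the way, I think you're referring to a survival model. I would recommend gradient boosting instead; it is usually more powerful for this kind of prediction because it picks up nonlinear effects and interactions between predictors without you having to specify them. Gradient boosting models (GBM) are available in R through caret (which wraps the gbm package) and in Python through scikit-learn.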
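For the Python route, a minimal sketch with scikit-learn's `GradientBoostingRegressor` might look like this. The training data is synthetic and the demand pattern (busier on weekdays between 09:00 and 17:00) is invented purely so the example runs; the hyperparameters are arbitrary starting values, not tuned recommendations.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in for the aggregated table: one row per time interval.
hours = rng.integers(0, 24, size=n)
days = rng.integers(0, 7, size=n)  # Monday = 0
busy = ((hours >= 9) & (hours <= 17) & (days < 5)).astype(float)
# Made-up target: about 7 active tickets when busy, about 2 otherwise
active_tickets = np.clip(np.round(2 + 5 * busy + rng.normal(0, 1, size=n)), 0, 8)

X = np.column_stack([hours, days])
model = GradientBoostingRegressor(
    n_estimators=200, max_depth=3, learning_rate=0.05, random_state=0
)
model.fit(X[:400], active_tickets[:400])  # hold out the last 100 rows

max_spaces = 8  # MaxSpaces for ParkingAreaId 302 in the sample data
pred_active = model.predict(X[400:])
pred_free = np.clip(max_spaces - pred_active, 0, max_spaces)
print(pred_free[:5])
```

Because the trees split on both hour and day of week, the model can learn a weekday-daytime effect without an explicit interaction column. On real data you would want to split train and test by time rather than at random, so the model is always evaluated on periods later than the ones it was trained on.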
Answered by Ryan Zotti on May 3, 2021