TransWikia.com

Meaningful predictive analytics for small (n=114) dataset with just 1 explanatory variable and 1 response variable?

Data Science Asked by Chunky Monkey on March 10, 2021

I have been given an Excel pivot table that aggregates data from a fairly sizable source (one database table with 1.9m records and another with about 490k). The Excel file has 3 columns: the date of each Monday, which stands for that week; the quantity of items; and the number of shipments it took to move that quantity. I am supposed to build a model that predicts the number of shipments required for a given quantity of items in the future. What models could I use for such a small dataset with just 1 explanatory and 1 response variable? I know run-of-the-mill linear regression with a confidence interval would be a start, but the data has one dense cluster and then sparse points that show some positive correlation. The color bar represents the date (purple is earliest, yellow is most recent).

[Image: scatter plot of the data, with points colored by date]

One Answer

Create additional features from the ones you already have. From the Monday "date" column you can derive "month" and "year". Every month contains either 4 or 5 Mondays, so you can also create a "week" feature with a value from 1 to 5 giving the week of the month in which that Monday falls. Further features can come from trends, such as the average number of shipments over the previous N weeks. This gives the algorithm more information to work with and may produce more refined results, though with only 114 rows it is worth checking that each added feature actually helps.
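As a rough sketch of the features described above, something like the following could work. It uses only the Python standard library, and the names `week_of_month`, `add_features`, `window` and `prev_avg_shipments` are made up for illustration, as are the example rows (they are not the asker's data). It assumes the rows are already sorted by date, one row per Monday.

```python
from datetime import date, timedelta
from statistics import mean


def week_of_month(monday: date) -> int:
    """Return 1..5: which Monday of its month this date is."""
    return (monday.day - 1) // 7 + 1


def add_features(rows, window=4):
    """rows: list of (monday, quantity, shipments) tuples sorted by date.

    Returns one dict per row with calendar features plus the average number
    of shipments over the previous `window` weeks (None until there is
    enough history).
    """
    out = []
    for i, (monday, quantity, shipments) in enumerate(rows):
        # Only earlier weeks are used, so the current target does not leak in.
        history = [s for _, _, s in rows[max(0, i - window):i]]
        out.append({
            "date": monday,
            "quantity": quantity,
            "shipments": shipments,
            "year": monday.year,
            "month": monday.month,
            "week": week_of_month(monday),
            "prev_avg_shipments": mean(history) if len(history) == window else None,
        })
    return out


# Made-up example rows: six consecutive Mondays starting 2021-02-01.
start = date(2021, 2, 1)
rows = [(start + timedelta(weeks=k), 100 + 10 * k, 5 + k) for k in range(6)]
features = add_features(rows, window=4)
```

The trailing average deliberately excludes the current week, since that week's shipment count is the value being predicted. The same columns could be built in pandas if the pivot table is loaded into a DataFrame; the logic would be the same.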

Answered by xChesster on March 10, 2021
