TransWikia.com

Data preprocessing for time series prediction

Data Science Asked on March 3, 2021

I have a dataset that has the following structure

[
 [
  [ product 1 ,shelf number, position on the tray, time of stay on the shelf, was sold?], # Hour 1
  [ product 1 ,shelf number, position on the tray, time of stay on the shelf, was sold?], # Hour 2
  [ product 1 ,shelf number, position on the tray, time of stay on the shelf, was sold?], # Hour 3
                               :
 ],
 [
  [ product 2 ,shelf number, position on the tray, time of stay on the shelf, was sold?], # Hour 1
  [ product 2 ,shelf number, position on the tray, time of stay on the shelf, was sold?], # Hour 2
  [ product 2 ,shelf number, position on the tray, time of stay on the shelf, was sold?], # Hour 3
                              :
 ],
                              :
]

My goal is to predict for a newer product say product_n predict if it will be sold (3 hours earlier).

My question is how do I process it for a Recurrent Neural network since the vector of prediction was sold? is available for each hour.

To say that in detail, since

[
  [ product 1 ,shelf number, position on the tray, time of stay on the shelf], # Hour 1
  [ product 1 ,shelf number, position on the tray, time of stay on the shelf], # Hour 2
  [ product 1 ,shelf number, position on the tray, time of stay on the shelf], # Hour 3
                               :
 ],

is one observation for the RNN how do I assign was sold? to it? Since len(X) should be equal to len(y)

was sold? is available for each observations, do I take max for 3 hours and asssign it to the obervation?

Like

X = [
  [ product 1 ,shelf number, position on the tray, time of stay on the shelf], # Hour 1
  [ product 1 ,shelf number, position on the tray, time of stay on the shelf], # Hour 2
  [ product 1 ,shelf number, position on the tray, time of stay on the shelf], # Hour 3
                               :
 ],
and 
y = [max(was sold?)]

One Answer

So the question is about how to represent this dataset so you can use this your classification task.

Firstly, we need to group the features for each product, if you are classifying on a product-by-product basis.

Secondly, a couple of things:

Then, for the input into your model, then you simply concatenate these features together to form a "long" n-dimensional vector.

Answered by shepan6 on March 3, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP