Dealing with time series data which has multiple observations for the same timestamp

Data Science Asked by deathcode 666 on December 5, 2020

I have time series data in Python 3 as follows:

Date                Weekly_Sales
2010-05-02              3400
2010-05-02              5600
2010-05-02              4590
2010-05-02              5800
2010-05-12              2380
2010-05-12              6700
2010-05-12              3700

The time series is not continuous, as there are multiple observations for the same date. I'm trying to forecast sales in Python using ARIMA, but my ACF and PACF plots show no correlation between the lags. Also, if I run the Dickey-Fuller test to check for stationarity, my system freezes.

How can I fix this?

2 Answers

One option is to take a Bayesian approach and model the data as a distribution of possible values that change over time. Each week would be a part of a state-space model. The most common name / framework is Bayesian structural time series (BSTS).

Answered by Brian Spiering on December 5, 2020

It looks like you have lost a bit of information in that dataset. You shouldn't have four measurements for one timestep for one variable. How do you know which of the first four rows to use for 2010-05-02?

I would suggest checking your data source, or else working out what the four values mean. Are they distinguished by some other information?
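A quick way to check this is to count the rows per date and see whether some other column identifies each row uniquely. The sketch below uses the values from the question; the `Store` column is purely hypothetical, added only to illustrate what a distinguishing key might look like in the real source data.

```python
import pandas as pd

# Values copied from the question. The "Store" column is hypothetical:
# it stands in for whatever extra information the real source may hold.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2010-05-02"] * 4 + ["2010-05-12"] * 3),
    "Store": [1, 2, 3, 4, 1, 2, 3],
    "Weekly_Sales": [3400, 5600, 4590, 5800, 2380, 6700, 3700],
})

# How many rows share each date?
rows_per_date = df.groupby("Date").size()
print(rows_per_date)

# Does the pair (Date, Store) identify every row uniquely?
is_unique_key = not df.duplicated(subset=["Date", "Store"]).any()
print(is_unique_key)
```

If a key like this exists, each group is its own time series and could be modelled separately rather than mixed together on one Date index.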

How are you even creating lags on that Date index? Could you take the average over each day? Depending on the package you use for your Dickey-Fuller test (and other methods), it might not be built to handle identical timesteps as input, which could explain why the session crashes.
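As a minimal sketch of the averaging idea, assuming pandas and the values from the question, the duplicates can be collapsed to one value per date before any lags are built. Whether the mean is the right aggregate is an assumption: if the rows are, say, separate stores, a sum (total sales) may be more meaningful.

```python
import pandas as pd

# Values copied from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2010-05-02"] * 4 + ["2010-05-12"] * 3),
    "Weekly_Sales": [3400, 5600, 4590, 5800, 2380, 6700, 3700],
})

# Collapse duplicate dates to a single observation each.
# Swap .mean() for .sum() if total sales per date is what you need.
sales = df.groupby("Date")["Weekly_Sales"].mean()

# The index is now unique, so a lag is well defined.
lag_1 = sales.shift(1)

print(sales)
print(sales.index.is_unique)
```

The resulting series has one row per date, so it can be passed to ACF/PACF plots or a stationarity test (for example `adfuller` in statsmodels), provided the full dataset contains enough dates for those methods to be meaningful.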

Answered by n1k31t4 on December 5, 2020
