Similarity Measure of Simulated Time Series vs Observed Time Series

Data Science Asked on July 24, 2021

In my work I have an observed time series and simulated ones. I want to compare the light curves and check for similarity, to find out which simulated curve fits best, i.e. which parameters simulate the light curve best.

At the moment I do it with the cross-correlation function from numpy. But I am not sure that this is the best option, because the light curve with the highest cross-correlation coefficient does not always look like the best fit/simulation compared to other simulations with a lower coefficient. Is there another way to measure similarity? I read something about the chi-square statistic, but I am not sure how it works or how it could be applied to my problem.
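
For reference, a minimal sketch of this kind of comparison (not my exact code), assuming both curves are already on the same time grid; `observed` and `simulated` are placeholder names:

```python
import numpy as np

def cross_correlation_coefficient(observed, simulated):
    """Zero-lag, normalized cross-correlation between two aligned curves."""
    obs = (observed - observed.mean()) / observed.std()
    sim = (simulated - simulated.mean()) / simulated.std()
    return float(np.mean(obs * sim))  # 1.0 = identical shape, 0.0 = unrelated
```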

The observation data I use is not evenly binned, so I used the interpolation function of SciPy. Should I also smooth the observation data, or would I lose true features of my data? I thought about using Savitzky-Golay smoothing.
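
A minimal sketch of these two steps with synthetic stand-in data (the grid step and the Savitzky-Golay window/order are illustrative choices, not values I have settled on):

```python
import numpy as np
from scipy.interpolate import interp1d
from scipy.signal import savgol_filter

# Synthetic, unevenly sampled stand-in for the observed light curve.
rng = np.random.default_rng(0)
t_obs = np.sort(rng.uniform(0.0, 10.0, 200))
flux_obs = np.sin(t_obs) + 0.1 * rng.normal(size=t_obs.size)

# Resample onto an evenly spaced time grid.
t_grid = np.arange(t_obs.min(), t_obs.max(), 0.05)
flux_even = interp1d(t_obs, flux_obs, kind="linear")(t_grid)

# Optional Savitzky-Golay smoothing; window_length must be odd and > polyorder.
flux_smooth = savgol_filter(flux_even, window_length=11, polyorder=3)
```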

At the moment I am using a brute-force method to try out all possible parameters and simulate the corresponding light curve. The problem is that this takes a lot of time with 20 parameters. The parameters are more or less dependent on each other, so I can't use a least-squares fit, because there are multiple local minima. Is there a simple method that I have overlooked, or is a restricted brute-force fit my best option?
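
For illustration, the restricted brute force I have in mind looks roughly like this; `simulate()` is a placeholder for my actual simulation code, `flux_even` is the resampled observation, and the parameter names/values are made up:

```python
import itertools
import numpy as np

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Heavily restricted, illustrative grid (the real problem has ~20 parameters).
param_grid = {
    "amplitude": [0.5, 1.0, 1.5],
    "period": [2.0, 3.0, 4.0],
}

best_params, best_score = None, np.inf
for values in itertools.product(*param_grid.values()):
    params = dict(zip(param_grid.keys(), values))
    curve = simulate(**params)      # hypothetical: simulated curve on the obs. grid
    score = rmse(flux_even, curve)
    if score < best_score:
        best_params, best_score = params, score
```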

In the picture below you’ll see one plot with the simulation and the observation data.
Red: Simulation, Blue: Observation
Thanks for all suggestions.

2 Answers

In my opinion you can do the following to compare your simulated data with your actual data.

Use the usual measurements that you would use for predictions vs. actual data (see the quick numpy sketch after this list):

  • RMSE
  • MAE (Mean Absolute Error)
  • MSE (Mean Squared Error) --> gives greater weight to larger errors/gaps
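
A quick sketch of these measures with numpy; `observed` and `simulated` are assumed to be aligned arrays on the same time grid:

```python
import numpy as np

def mse(observed, simulated):
    return float(np.mean((observed - simulated) ** 2))

def rmse(observed, simulated):
    return float(np.sqrt(mse(observed, simulated)))

def mae(observed, simulated):
    return float(np.mean(np.abs(observed - simulated)))
```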

Another statistical test that comes to mind for testing "prediction" quality between different models is the Diebold-Mariano test. It is fully implemented in R; as far as I know, it is not in any Python library so far.

As a statistical test you could look at the distributions. But since you are dealing with time series, you may face "memory", meaning that observations t and t+1 are autocorrelated, which violates the independence assumption behind most statistical tests. Autocorrelation seems to be the case for your data series. Therefore you should look at the % change of your sample data (from t to t+1) and check whether both series are drawn from the same distribution - you can perform the Kolmogorov-Smirnov test.
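
A sketch of that last step, assuming `observed` and `simulated` are aligned, strictly positive series:

```python
import numpy as np
from scipy.stats import ks_2samp

# Period-to-period % changes of each series.
pct_obs = np.diff(observed) / observed[:-1]
pct_sim = np.diff(simulated) / simulated[:-1]

# Two-sample Kolmogorov-Smirnov test: a small p-value suggests the two
# %-change distributions differ.
stat, p_value = ks_2samp(pct_obs, pct_sim)
```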

Hope this helps.

EDIT

Regarding your question concerning smoothing and interpolation: this is more of an empirical trial-and-error matter than a statistical one. It really depends on the behavior of your data.

Answered by Maeaex1 on July 24, 2021

This looks like a signal processing question, where you want to denoise a periodic signal. May I suggest you look at the specific literature?

More specifically: https://github.com/unpingco/Python-for-Signal-Processing and the Signal Processing Reading List. You'll learn how to model periodic signals, denoise them and compress them.

Answered by lcrmorin on July 24, 2021
