Software Recommendations Asked by Silverspur on November 20, 2021
An industrial process I’m in charge of produces time series data: several thousand parameters, each sampled every 1/8th of a second, over a month-long run.
My goal is to look at two month-long executions and compare the data; something like, for each parameter or group of parameters, max diff, mean diff, Generalized Least Squares regression, and other standard statistical metrics.
I’ve succeeded in prototyping the ingestion of data and producing a basic visualization using the ELK stack.
But as I’m moving forward, I’m wondering: sure, ELK can store my data and offer some nice visualization, but is it really the right tool for this kind of time series analysis, comparison and statistics? What would be the best software/stack to turn to here?
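Commonly used tools for this kind of scripting include SQL, Python with pandas, and esProc.
The SQL query is quite simple when window functions are available. For example, to calculate the link relative ratio (each price relative to the previous one):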
select transDate,price,
price/lag(price) over(order by transDate)-1 comp
from stock1001
But the code will be roundabout if the SQL product doesn’t support window functions.
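To illustrate what "roundabout" can look like, here is a minimal sketch of the same calculation without window functions, using a correlated subquery to find the previous row. It runs against an in-memory SQLite database through Python's sqlite3 module; the sample rows are made up, and the dates are assumed to be stored as ISO text so that they sort correctly.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
rows = [("2021-01-04", 10.0), ("2021-01-05", 11.0), ("2021-01-06", 9.9)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock1001 (transDate TEXT, price REAL)")
conn.executemany("INSERT INTO stock1001 VALUES (?, ?)", rows)

# No lag(): for each row, look up the price of the latest earlier date.
fallback_sql = """
SELECT t.transDate, t.price,
       t.price / (SELECT p.price FROM stock1001 p
                  WHERE p.transDate < t.transDate
                  ORDER BY p.transDate DESC LIMIT 1) - 1 AS comp
FROM stock1001 t
ORDER BY t.transDate
"""
result = conn.execute(fallback_sql).fetchall()
conn.close()

for trans_date, price, comp in result:
    print(trans_date, price, comp)
```

The first row has no predecessor, so its comp comes back as NULL (None in Python), which matches what lag() would produce.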
As a library intended specifically for structured data computations, pandas handles order-based calculations effortlessly. To calculate the link relative ratio, for instance, the pandas code is as follows:
import pandas as pd
stock1001 = pd.read_csv('d:/stock1001.csv')  # returns a DataFrame
# assumes the rows are already sorted by transDate
stock1001['comp'] = stock1001.price / stock1001.price.shift(1) - 1
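For a version that runs without the CSV file, here is a minimal sketch on a small, made-up price table (the values are hypothetical). It sorts by date explicitly, computes the ratio with shift(1), and compares it with pandas' built-in pct_change, which should give the same numbers.

```python
import pandas as pd

# Hypothetical prices standing in for the contents of the CSV file.
stock1001 = pd.DataFrame({
    "transDate": pd.to_datetime(["2021-01-06", "2021-01-04", "2021-01-05"]),
    "price": [9.9, 10.0, 11.0],
})

# Order-based calculations depend on row order, so sort first.
stock1001 = stock1001.sort_values("transDate").reset_index(drop=True)

# Link relative ratio: current price over previous price, minus 1.
stock1001["comp"] = stock1001["price"] / stock1001["price"].shift(1) - 1

# Same calculation via the built-in helper (no forward-filling of gaps).
stock1001["comp_pct"] = stock1001["price"].pct_change(fill_method=None)

print(stock1001)
```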
In more complicated scenarios, however, pandas has to resort to some difficult tricks, which make the code hard to write and understand.
esProc handles simple order-based calculations effortlessly. Here’s the esProc query for calculating link relative ratio:
A1=file("d:/stock1001.csv").import@tc()
A2=A1.derive(price/price[-1]-1:comp)
I wrote a comparative article about them; you can refer to it for details.
Answered by Elsa Jessica on November 20, 2021
I would suggest taking a look at pandas, as it excels at this sort of task. It is free, gratis, open source and cross-platform. It is a tool within the Python ecosystem, so you will need a recent copy of Python on your system, which is also free. Both can also be installed from the Anaconda distribution of Python.
The excellent Python Data Science Handbook by Jake VanderPlas has a section on time series which gives the example of comparing closing stock market prices for a company (Google in this case) over time. It includes various visualisation techniques and the possibility of time shifting data so that you could, for example, take the values grouped by year with the data time shifted to align the first Monday of each year.
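As a rough sketch of how the time-shifting idea might apply to comparing two runs of a process, the example below re-indexes each run by time elapsed since its first sample, so that the two runs line up, and then computes a per-parameter maximum and mean difference. Everything here is assumed for illustration: the parameter names (temp, pressure), the random data, the helper functions make_run and to_elapsed, and the premise that both runs are sampled on the same 125 ms grid.

```python
import numpy as np
import pandas as pd

def make_run(start, n_samples, seed):
    """Build a hypothetical run: two parameters sampled every 125 ms."""
    rng = np.random.default_rng(seed)
    index = pd.date_range(start=start, periods=n_samples, freq="125ms")
    return pd.DataFrame(
        {
            "temp": 20 + rng.normal(0, 0.5, n_samples),
            "pressure": 1.0 + rng.normal(0, 0.01, n_samples),
        },
        index=index,
    )

def to_elapsed(run):
    """Re-index a run by the time elapsed since its first sample."""
    out = run.copy()
    out.index = out.index - out.index[0]
    return out

# Two runs that started a month apart, shifted onto a common time axis.
run_a = to_elapsed(make_run("2021-09-01", 80, seed=1))
run_b = to_elapsed(make_run("2021-10-01", 80, seed=2))

# Subtraction aligns on the elapsed-time index and on column names.
diff = run_a - run_b

summary = pd.DataFrame({
    "max_abs_diff": diff.abs().max(),
    "mean_diff": diff.mean(),
})
print(summary)
```

If the two runs were not sampled on exactly the same grid, they would need resampling or interpolation onto a common one before subtracting; otherwise the misaligned rows come out as NaN.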
Answered by Steve Barnes on November 20, 2021