Asked by raeldor on April 11, 2021
I’ve been following this time series tutorial for TensorFlow…
https://www.tensorflow.org/tutorials/structured_data/time_series
It was going well and seemed to work OK. I substituted my own dataset (about 1.1m records with 4 features) and it also seemed to work well, but memory was getting REALLY tight, so I thought I would try to implement the improvement mentioned at the bottom, which says…
In addition, you may also write a generator to yield data (instead of the uni/multivariate_data function), which would be more memory efficient. You may also check out this time series windowing guide and use it in this tutorial.
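For reference, a generator along those lines might look like the following sketch (the name window_generator and its parameters are my own, not from the tutorial, and output_signature needs TF 2.4 or newer):
import tensorflow as tf

def window_generator(data, history_size):
    # Yield one (features, label) pair per step, so only a single
    # window is materialized in memory at a time.
    for i in range(history_size, len(data)):
        yield data[i - history_size:i], data[i]

# Hypothetical wiring; dataset and past_history come from the tutorial.
gen_ds = tf.data.Dataset.from_generator(
    lambda: window_generator(dataset, past_history),
    output_signature=(
        tf.TensorSpec(shape=(past_history, 4), dtype=tf.float32),
        tf.TensorSpec(shape=(4,), dtype=tf.float32)))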
The suggestion made sense, since the multivariate_data function materializes a full sliding window for every single step, so the data becomes HUGE. After reading through this…
https://www.tensorflow.org/guide/data#time_series_windowing
It looked like I could use tf.data to do the windowing, which I assumed would be evaluated lazily, so not much memory would be used. After reading, I came up with the following code to try to replace the multivariate_data function (the functions are borrowed from the guide, though I changed dense_1_step to return a single set of label features instead of a window)…
import tensorflow as tf

# dataset, TRAIN_SPLIT, past_history, BUFFER_SIZE and BATCH_SIZE
# are defined earlier, as in the tutorial.

def make_window_dataset(ds, window_size=5, shift=1, stride=1):
    # Create overlapping windows lazily; each window is a sub-dataset.
    windows = ds.window(window_size, shift=shift, stride=stride)

    def sub_to_batch(sub):
        # Collapse each sub-dataset into one tensor of window_size rows.
        return sub.batch(window_size, drop_remainder=True)

    return windows.flat_map(sub_to_batch)

def dense_1_step(batch):
    # Shift features and labels one step relative to each other.
    return batch[:-1], batch[-1:]

# get training samples (features and labels)
train_ds = make_window_dataset(
    tf.data.Dataset.from_tensor_slices(dataset[:TRAIN_SPLIT]),
    window_size=past_history + 1, shift=1, stride=1)
dense_labels_train_ds = train_ds.map(dense_1_step)

# get validation samples (features and labels)
val_ds = make_window_dataset(
    tf.data.Dataset.from_tensor_slices(dataset[TRAIN_SPLIT:]),
    window_size=past_history + 1, shift=1, stride=1)
dense_labels_val_ds = val_ds.map(dense_1_step)

# batch and shuffle training data
train_data_single = dense_labels_train_ds.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()
val_data_single = dense_labels_val_ds.batch(BATCH_SIZE).repeat()
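As a quick sanity check, the structure of what the pipeline emits can be inspected before training (element_spec and take are standard tf.data calls):
print(train_data_single.element_spec)  # (features, labels) tensor specs

for features, labels in train_data_single.take(1):
    print(features.shape, labels.shape)  # concrete shapes of one batch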
But for some reason it’s all gone pear-shaped. I’m sure I screwed something up here, but to me it looks like the code is doing what it should be doing. I’m still on a steep learning curve with this, so if anyone with more experience could point me in the right direction I would be very grateful.
Thanks
Ray
It seems I was returning multiple features as labels. I had to modify the dense_1_step function to return a single feature...
def dense_1_step(batch):
    # Shift features and labels one step relative to each other,
    # taking only the second feature of the final timestep as the label.
    return batch[:-1], batch[-1:, 1][0]
This makes the output match that of the multivariate_data function.
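To illustrate the indexing on a toy batch (the numbers here are made up for the example):
import numpy as np

batch = np.arange(24, dtype=np.float32).reshape(6, 4)  # 6 timesteps, 4 features
features, label = batch[:-1], batch[-1:, 1][0]
print(features.shape)  # (5, 4): the window of past observations
print(label)           # 21.0: the second feature of the final timestep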
Answered by raeldor on April 11, 2021