TransWikia.com

Validation data for multi-series stateful LSTM

Data Science Asked on October 3, 2021

With a stateful LSTM, the network state is propagated to subsequent sequences and batches. I have multiple data files whose data I present to the network for training (making this multi-series). My question is how to create the validation data.

At the moment I take 20% of the data from either the head or the tail of each file and use that as validation data; the other 80% I feed to the network for training.

However, because the model is stateful, I wonder whether that’s a good way to create my validation data. To elaborate: when training I potentially present several years or even a decade of data, so by the time the network reaches the final sequence it has state built up from the previous years of data. If I present validation data that is maybe just a week long, the network doesn’t build up any prior state to use when generating the final output. Am I right in thinking this isn’t a good way to validate a stateful RNN/LSTM?

What I’m thinking of doing instead is to set aside 20% of the data files I have and use all of the data in those files to compute the validation loss. Would that be better?
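A minimal sketch of that file-level split, in case it makes the idea more concrete. The helper name split_files, the fixed seed, and the series_XX.csv file names are hypothetical placeholders, not part of any existing code. Whole files are held out, so each validation series can be run from its first time step with a freshly reset state.

```python
import random


def split_files(file_paths, val_fraction=0.2, seed=0):
    """Hold out whole files for validation instead of slicing each file.

    Returns (train_files, val_files). Hypothetical helper for illustration.
    """
    paths = sorted(file_paths)            # stable order before shuffling
    random.Random(seed).shuffle(paths)    # reproducible shuffle
    n_val = max(1, round(len(paths) * val_fraction))
    return paths[n_val:], paths[:n_val]


# Placeholder file names; substitute the real data files.
files = [f"series_{i:02d}.csv" for i in range(10)]
train_files, val_files = split_files(files)
```

With this split, every series is seen either entirely in training or entirely in validation, so the state built up over a long validation file is comparable to what the network sees during training.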

One Answer

Using a stateful model, am I correct in assuming that you have time-series data?

If so, it would perhaps make sense (at least for your test accuracy) to always validate against the time-steps that immediately follow a training batch. This is what you would actually want to do with the model once it is trained.

Have a look at this answer, where I explain the idea in a lot more detail. The main idea is to split the data in a rolling-forecast pattern, whereby each training batch is immediately followed by its validation batch, e.g. something like:

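# data: your time series, indexed by time step
window_size = 100     # whatever makes sense for your data
val_size = 15

# first window: train on window_size steps, validate on the val_size steps that follow
train_batch1 = data[0:window_size]
val_batch1 = data[window_size:window_size + val_size]

# second window: the same pattern, shifted forward by one step
train_batch2 = data[1:window_size + 1]
val_batch2 = data[window_size + 1:window_size + val_size + 1]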

...

That is just pseudocode to make the idea as clear as possible. Here is the diagram I created for the post linked above, which visualises the same notion:

[Image: rolling window validation]

The image shows only train/test sets, but you could of course add validation into the mix.
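The rolling pattern in the pseudocode above can be generalised into a small helper. This is only a sketch: the function name rolling_splits and its step parameter are illustrative choices of mine, and the 20-element list stands in for real data.

```python
def rolling_splits(data, window_size, val_size, step=1):
    """Yield (train, val) pairs where val immediately follows train.

    Hypothetical helper: `data` is any sliceable sequence ordered by time.
    """
    # Last start index that still leaves room for a full train and val window.
    last_start = len(data) - window_size - val_size
    for start in range(0, last_start + 1, step):
        split = start + window_size
        yield data[start:split], data[split:split + val_size]


# Placeholder series; substitute one of your own time series.
data = list(range(20))
splits = list(rolling_splits(data, window_size=5, val_size=2))
first_train, first_val = splits[0]
```

A larger step produces fewer, less overlapping windows, which may be preferable when the series is long.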

If this doesn't seem to make sense for your data, temporal relationships probably don't matter much. In that case, a stateful model is perhaps not very useful either.

Answered by n1k31t4 on October 3, 2021
