Unix timestamps for Recurrent Neural Networks

Question

I want to use an RNN to classify whole sequences of events generated by website visitors. Each event has some categorical properties and a Unix timestamp:
sequence1 = [{'timestamp': 1597501183, 'some_field': 'A'}, {'timestamp': 1597681183, 'some_field': 'B'}]
sequence2 = [{'timestamp': 1596298782, 'some_field': 'B'}]
sequence3 = [{'timestamp': 1596644362, 'some_field': 'A'}, {'timestamp': 1596647951, 'some_field': 'C'}]

Unfortunately, the sequences can't be treated as classic time series: they vary in length and the events are irregularly spaced, so the timestamps carry essential information and cannot be ignored. While categorical features can be one-hot encoded or turned into embeddings, I'm not sure what to do with the timestamps. Feeding them in raw doesn't look like a good idea. I've come up with two options so far:

1. Subtract the minimum timestamp from every timestamp in the sequence, so that every sequence starts at 0. The values can still be large, though, because the sequences run over a month.
2. Use offsets from the previous event instead of absolute timestamps.
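
The two options above can be sketched roughly as follows. This is only a minimal illustration: the helper names `offsets_from_start` and `offsets_from_previous` are made up for this example, each sequence is assumed to be non-empty and sorted by timestamp, and the first event's offset from its (nonexistent) predecessor is assumed to be 0.

```python
def offsets_from_start(sequence):
    """Option 1: seconds elapsed since the earliest event of the sequence."""
    timestamps = [event['timestamp'] for event in sequence]
    start = min(timestamps)
    return [t - start for t in timestamps]


def offsets_from_previous(sequence):
    """Option 2: seconds elapsed since the previous event.

    The first event has no predecessor, so its offset is set to 0 (an assumption).
    """
    timestamps = [event['timestamp'] for event in sequence]
    return [0] + [curr - prev for prev, curr in zip(timestamps, timestamps[1:])]


sequence1 = [{'timestamp': 1597501183, 'some_field': 'A'},
             {'timestamp': 1597681183, 'some_field': 'B'}]

start_offsets = offsets_from_start(sequence1)
previous_offsets = offsets_from_previous(sequence1)
```

For a two-event sequence both options give the same numbers; they only diverge from the third event onward, where option 1 keeps growing and option 2 stays at the size of a single gap.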

Are there common ways to deal with this? I haven't found much on the subject.

F4RZ4D · Answer

I would say that using offsets from the previous N events is necessary in this case. You can also encode their maximum and average as additional features, to compress the irregular sampling times into a vector.
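
One possible reading of this suggestion, as a rough sketch rather than a definitive recipe: compute, for each event, the gaps to its previous N events, plus the maximum and mean gap per sequence. The function names, the zero padding for missing predecessors, and the optional `log_scale` step are all assumptions made for this example; the log scaling in particular is not part of the suggestion above and is included only as one way to tame month-long gaps.

```python
import math


def lagged_offsets(timestamps, n_lags=2):
    """For each event, the gaps (in seconds) to each of the previous n_lags events.

    Missing predecessors at the start of a sequence are padded with 0 (an assumption).
    """
    rows = []
    for i, t in enumerate(timestamps):
        row = []
        for lag in range(1, n_lags + 1):
            row.append(t - timestamps[i - lag] if i - lag >= 0 else 0)
        rows.append(row)
    return rows


def gap_summary(timestamps):
    """Max and mean gap between consecutive events, as sequence-level features."""
    gaps = [curr - prev for prev, curr in zip(timestamps, timestamps[1:])]
    if not gaps:  # single-event sequence: there are no gaps to summarize
        return {'max_gap': 0.0, 'mean_gap': 0.0}
    return {'max_gap': float(max(gaps)), 'mean_gap': sum(gaps) / len(gaps)}


def log_scale(seconds):
    """Optional extra, not from the answer: compress large gaps; log1p maps 0 to 0."""
    return math.log1p(seconds)


timestamps = [1596644362, 1596647951, 1596651551]  # hypothetical, sorted ascending
features = lagged_offsets(timestamps, n_lags=2)
summary = gap_summary(timestamps)
```

The per-event rows would be fed to the RNN alongside the categorical embeddings, while the summary values could be attached to the sequence as a whole, for example by concatenating them with the final hidden state before the classifier.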
